Domestic Migration Networks in the United States
by
Robert Allen Manduca
B.A., Swarthmore College (2010)
Submitted to the Department of Urban Studies and Planning
in partial fulfillment of the requirements for the degree of
Master in City Planning
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
June 2014
© Robert Allen Manduca, MMXIV. All rights reserved.
The author hereby grants to MIT permission to reproduce and to
distribute publicly paper and electronic copies of this thesis document
in whole or in part in any medium now known or hereafter created.
Author: (signature redacted)
Department of Urban Studies and Planning
May 22, 2014
Certified by: (signature redacted)
Albert Saiz
Associate Professor
Thesis Supervisor
Accepted by: (signature redacted)
P. Christopher Zegras
Associate Professor
Chair, MCP Committee
Domestic Migration Networks in the United States
by
Robert Allen Manduca
Submitted to the Department of Urban Studies and Planning
on May 22, 2014, in partial fulfillment of the
requirements for the degree of
Master in City Planning
Abstract
In recent years, there has been substantial interest in understanding urban systems
at the national and global scales: what are the economic and social ties that link
cities together, and what is the network structure formed by such ties? At the same
time, human capital accumulation is increasingly seen as a primary driver of regional
economic growth. Domestic migration patterns have the potential to illuminate the
social and economic connections among cities, while also highlighting economically
significant flows of human capital.
In this thesis I examine the US city system through the lens of gross migration
flows, taking advantage of unusually complete data on county-to-county migration
compiled annually by the IRS. I compare the observed flows to those predicted by the
radiation model, finding most notably that there are far more long-distance migrants
than would be predicted based on the spatial distribution of population alone. I
then use reciprocal migration patterns to construct a migration network connecting
metro areas in the United States. I utilize current-flow centrality measures to identify
the most prominent nodes in this weighted network. Additionally, I use repeated
applications of the Louvain community detection algorithm to identify reasonably
robust communities within the migration network. These exhibit a striking degree of
spatial contiguity.
Thesis Supervisor: Albert Saiz
Title: Associate Professor
Acknowledgments
The seeds of this thesis were planted more than four years ago, when I first encountered the bloggings of Aaron Renn. At the time, the idea of studying these topics in
earnest (at MIT, no less) seemed fanciful.
After two years here, all I can say is that my experience has exceeded all possible
expectations. A number of people have contributed to making that the case. Foremost
among these are my classmates: I wouldn't have thought it possible to feel so close
to so many people I find so impressive.
My advisor, Albert Saiz, has been a consistent source of guidance on everything
from the specifics of this thesis to what I should do with my life. He was invaluable
in helping me to craft a curriculum in quantitative social science at DUSP.
My thesis reader, Marta Gonzalez, has been a tremendous source of enthusiasm
and methodological insight throughout this process. I am grateful to her for being so
willing to embrace a student from urban planning.
Xavier de Souza Briggs and Joe Ferreira have both been foundational to my experience at DUSP, and conversations with them have critically informed my academic
and career trajectory.
A number of professors across the Institute (Amy Glasmeier, Phil Clay, Phil
Thompson, Karl Seidman, Cesar Hidalgo, Fiona Murray, and Eran Ben-Joseph,
among others) have opened my mind and profoundly shaped my worldview. I was
expecting MIT professors to be brilliant, but I was not expecting them to be such
outstanding teachers.
A big shout out to all of the staff at DUSP. Ezra Glenn has enriched my experience immensely through his boundless creativity, energy, and wisdom. Kirsten Greco
kept me on track to graduate by cheerfully answering countless inquiries about departmental minutiae, and Janine Marchese was admirably sanguine about processing
a similar number of reimbursement requests.
My partner Roseanna continues to challenge me intellectually, guide me ethically,
and support me emotionally. This probably wouldn't have been possible without her.
Finally, my family. My sister Katie remains a constant source of cheer in my life,
and a model of how to face difficult situations with courage and good humor. As for
my mother and father: the further I go in life the more I realize just how much they
have given me. A son could not ask to be raised by better parents.
Contents
1 Introduction
2 Motivating Literature
2.1 Investigations of Migration
2.1.1 Determinants of Migration
2.1.2 Consequences of Migration
2.2 Human Capital, Social Networks, and Economic Development
2.2.1 Human Capital and Economic Growth
2.2.2 The Importance of Social Networks in Cross-Regional Economic Activity
2.3 Mapping and Analyzing the Urban System
2.4 Themes and Direction
3 Data Source and Preparation
3.1 Data Source
3.2 Data Preparation
3.2.1 Metropolitan Areas
3.3 Net, Gross, and Reciprocal Migration
4 Observed and Modeled Migration Patterns
4.1 Metropolitan Migration Rates
4.2 Individual Migration Flows
4.3 The Radiation Model
4.3.1 Implementation of the Radiation Model
4.4 Radiation Model Results
4.5 Patterns of Residuals
5 Centrality Analysis of the Migration Network
5.1 Degree Centrality
5.2 Closeness Centrality
5.3 Betweenness Centrality
6 Detecting Communities in the Migration Network
6.1 Approach to Community Detection
6.1.1 Modularity
6.1.2 The Louvain Algorithm
6.1.3 Initial Partitions
6.1.4 Repeated Louvain Runs
6.2 Community Roles of Individual Metro Areas
6.2.1 Extra-Community Degree
6.2.2 Community Diversity
6.2.3 Within-Community Role
7 Discussion, Limitations, and Further Research
7.1 Limitations of the Current Research
7.2 Further Research
References
Chapter 1
Introduction
The United States has long been a particularly mobile society. Overwhelmingly descended from immigrants, Americans don't seem to stay rooted to their place of birth
at the same rate as residents of most other countries. On multiple occasions throughout the nation's history (the era of westward expansion, the Great Migration, the rise
of the Sun Belt), literally millions of Americans have uprooted their lives and moved
west, north, or south in search of riches, opportunity, or simply better weather. Although internal migration has decreased in recent years, the US still has one of the
highest rates of domestic migration in the world (Frey, 2009; Molloy, Smith, & Wozniak, 2011). Moving across the country for college, a job, or just a change of scenery
is not unusual.
This propensity to migrate is of particular interest now because of the rising
economic importance of human capital. Increasingly, the knowledge spillovers created
by having large numbers of workers in the same industry clustered together are seen
as a major driver of long-term economic growth. If a significant fraction of that talent
pool decides one day to leave, that may cast a shadow on the economic future of their
region. Alternatively, if a region can increase its share of the migration take, it may be
able to improve its economic wellbeing. Under these circumstances, understanding
how people move around the country and what drives that movement becomes more
than just a matter of intellectual curiosity. The economic vitality of cities and states
may depend on it.
At the same time, in these days of ideological polarization and cultural fracturing,
migration flows offer a glimpse of the social ties that knit different parts of the country together. People are most likely to move to places they are familiar with, and
when they move they do not abandon all ties to their former home. A large exchange
of migrants between two regions therefore is both an indicator of existing social connection between them and a harbinger of stronger linkages to come. Examining the
full set of migration ties offers the possibility of identifying relatively self-contained
regions of the country and the connecting cities that link them together.
This thesis explores domestic migration in the United States with an eye toward
these two topics: human capital accumulation and social distance. Utilizing a large
and unusually complete dataset on county-to-county migration patterns from the
IRS, I look at the migration flows around the US and the cities they connect. I
begin by simply observing the patterns of migration, and comparing them to a model
of what patterns might occur if there were no economic or cultural differentiation
among cities. Finding that such a model cannot explain the observed patterns, I
shift to examining the connections that they illuminate among cities. I find that
certain cities occupy particularly central positions in the migration network, serving
as hubs that connect otherwise separated parts of the country. Finally I investigate
more formally the boundaries of these regions, using the data to articulate migration
regions within the country and investigating the roles played by different cities within
those regions.
The following section situates this analysis by providing an overview of previous
studies of migration, as well as research on the importance of human capital and
social networks to economic growth. Chapter 3 discusses the data source used and
the process of preparation. The ensuing three chapters describe my investigations in
detail, outlining the methods used for each and the results found. I conclude with a
discussion of my findings, their limitations, and possibilities for further research.
Chapter 2
Motivating Literature
2.1
Investigations of Migration
Modern study of domestic migration is generally acknowledged to have begun with
Ravenstein's study of migration patterns in nineteenth-century Britain (Ravenstein,
1885). He formulated seven laws of migration, among them that most migrants tend
to move relatively short distances, that a flow of migrants in one direction produces
a countervailing flow in the other direction, and that long-distance migrants tend to
move to major cities.
Since then, scholarship on domestic migration has proliferated in a diverse array of
approaches, theories, and methods. The majority of work has attempted to identify
the determinants of migration, at both the individual and aggregate levels. What
kinds of people migrate, and why do they go where they do? A second, smaller branch
of research has attempted to address the consequences of migration for migrants and
the places they encounter.
2.1.1
Determinants of Migration
At the individual level, migration has been widely conceptualized as a human capital
investment. This was first proposed by Sjaastad (Sjaastad, 1962) and is widely used
today. A person is hypothesized to migrate if the expected benefits of doing so, in
terms of increased wages or improved access to amenities, exceed the costs of the move.
The importance of wage increases has figured most prominently in the literature
(Yezer & Thurston, 1976; Kennan & Walker, 2011), although there does appear to
be an initial period of wage decline following a move (Grant & Vanderkamp, 1980;
Borjas, Bronars, & Trejo, 1992). Some evidence has suggested that factors other than
employment prospects may in fact be dominant in the majority of cases (Morrison
& Clark, 2011), or that the relative importance of employment and amenity factors
may vary over the lifecycle (Chen & Rosenthal, 2008). Scholarship on demographic
characteristics and migration has found that the propensity to migrate tends to peak
in the 20s and 30s, and that the overall national migration rate has risen and fallen
on the backs of generations (Plane & Rogerson, 1991; Plane, 1993). Migration rates
have also been shown to increase with education level (Greenwood, 1997; Kodrzycki,
2001), and specifically, educated workers have been shown to be more likely to move
long distances to areas with better employment prospects (Wozniak, 2010).
Early economic analyses of migration at the regional level viewed it largely as a
means by which regional differences in wages and unemployment were equilibrated
(Courchene, 1970; Vanderkamp, 1971). If wages in one part of the country were
especially high, more people would migrate there until the labor supply increased
enough to bring wages down.
In the 1980s, an alternative view was proposed that argued that wage differentials
across regions are not evidence of a lack of equilibrium, but rather are compensating
for differences in amenities between locations, most notably due to climate (Graves,
1980; Graves & Knapp, 1988). In this view interregional migration is not a response
to disequilibrium in labor markets but rather reflects changing preferences for the
various consumption baskets offered by different cities. It is worth noting that the
equilibrium and disequilibrium approaches are not entirely contradictory; both almost
certainly operate at different points in time. Rather, the question is whether regions
are mostly in a state of equilibrium or one of disequilibrium. The equilibrium approach
also emphasizes the need to control for amenities in the destination region in addition
to economic variables, since high wages can be considered compensation for a less
appealing set of amenities (Hunt, 1993). However, both frameworks are called into
question by recent work by Kemeny and Storper that documents sustained regional
disparities in both wages and amenities (Kemeny & Storper, 2012).
Models of migration at the regional level in both the disequilibrium and the equilibrium framework generally model gross or net flows between regions as a function of
distance and of conditions at the origin and destination regions. Common explanatory variables include per capita income, unemployment rates, tax levels, measures of
government expenditures, and climatological variables (Schachter & Althaus, 1989;
Treyz, Rickman, Hunt, & Greenwood, 1993).
One possibility that has not received extensive amounts of study is that there may
be a role for path-dependency in domestic migration. In international migration, the
phenomenon of "chain migration" is frequently observed: the first few migrants to
a new country will settle in a given city essentially at random (sometimes almost
literally, as the government decides where to settle refugees). However, once a few
migrants are established there is a great tendency for further migrants from that
country to settle in the same place. The original settlers can assist newcomers in
adjusting to the new country, and can provide information about opportunities in
their new surroundings to acquaintances back home (Elsner, Narciso, & Thijssen,
2013; Massey, 1988).
Although the cultural differences in internal migration tend not to be as stark as
those involved in international migration, there is still room for information spread
through social networks to make a substantial difference in the decisions of migrants.
One attempt to model internal migration flows found that migration patterns in
1970 were a better predictor of migration in 2000 than both gravity models and
models based on other forms of connectivity such as air traffic (Andris, Halverson, &
Hardisty, 2011). Unfortunately this study was unable to conduct a comparison with
a fully specified econometric model including all of the variables described above.
2.1.2
Consequences of Migration
Research on the regional impacts of internal migration has been less substantial than
that on its determinants. It is relatively difficult to make causal inferences about
the economic impacts of migration into a region, and there have been few attempts.
A larger number of studies have looked at the demographic impacts of domestic
migration in terms of the net population change to a region or changes in the relative
size of different demographic groups. Overall net domestic migration is frequently
used as an indicator of regional economic health in the popular media (Kotkin, 2012),
despite having systematic biases as a statistic (Rogers, 1990).
It has been documented that regions that receive large numbers of immigrants
have tended to have substantial net domestic outmigration, especially of low-skilled
workers (Frey, 1994; Borjas, 2006). Beginning in the 1990s, this prompted concerns
about "demographic Balkanization," in which different regions of the country would
become increasingly culturally and ethnically distinct, dividing the country spatially
along racial, class, and ideological lines (Frey, 1995, 1996; Wright, Ellis, & Reibel,
1997).
One consequence of migration that has attracted a great deal of concern from
policymakers, especially in rural or economically distressed areas, is that of "brain
drain." In the domestic context, this is the net out-migration of a region's young,
college-educated workers. Due to the importance of human capital in stimulating economic growth (see below), the prospect of college graduates leaving town is frequently
viewed with alarm by policymakers, and many states and cities have implemented retention programs for their students, offering financial incentives to attend college in
the state or to remain after graduation.
Brain drain almost certainly does occur, in the sense that university graduates are
extremely mobile and many do move away after graduation (Hansen, Ban, & Huggins,
2003; Sanderson & Dugoni, 2002; McGuire, Hardy-Johnston, & Saevig, 2006; Stricker,
2007). However, several studies have found a small but existent relationship between
the location of a school and the region where its graduates end up (Bound, Groen,
Kézdi, & Turner, 2004; Groen, 2004; Gottlieb & Joseph, 2006). Further, 20-29 year
olds nationally are the only age group with a strong net tendency to move into large
cities, moves that are balanced out by older residents leaving (Plane, Henrie, & Perry,
2005). This suggests that the proper way for municipalities to approach brain drain
may be to acknowledge that young educated workers are likely to move to a major
city as a natural part of their life course, note that attending college in an area in fact
makes them more likely to end up there than if they had gone to a different school,
and perhaps focus on attracting their parents.
2.2
Human Capital, Social Networks, and Economic Development
2.2.1
Human Capital and Economic Growth
Studies of migration have gained renewed relevance in recent years with the increasing
emphasis given to the role of human capital in economic development. Knowledge
spillovers stemming from the concentration of human capital are seen as one of the
primary mechanisms driving endogenous growth (Lucas, 1988; Romer, 1990). Cities
with high levels of human capital have been shown to grow more quickly than those
with lower levels, perhaps because they are more able to adapt to negative economic
shocks (Glaeser, Scheinkman, & Shleifer, 1995; Glaeser & Saiz, 2003; Black & Henderson, 1999; Gottlieb & Fogarty, 2003). This evidence has put a premium on high-skilled
workers, and urban policymakers have been encouraged to pursue growth strategies
that attempt to accumulate human capital (Mathur, 1999).
The role of urban amenities in particular has received a great deal of attention
from both academic and popular sources. High-amenity cities have been shown to
grow faster than those with few amenities (Glaeser, Kolko, & Saiz, 2001), and residential amenities have been found to impact the location decisions of firms (Gottlieb,
1995). Focusing on high-skilled workers specifically, Richard Florida has found them
to be concentrated in areas with high levels of diversity as well as amenities (Florida,
2002). Whisler et al. suggest that different types of amenities are important to
college-educated workers at different stages in the lifecycle, with young workers being
attracted to areas with large numbers of cultural amenities while older workers are
drawn to areas with low crime rates and mild climates (Whisler, Waldorf, Mulligan,
& Plane, 2008).
It should be noted that there is still very much a debate in the literature about
whether the causality described above (amenities attract human capital, which attracts
companies) is in fact correct, or if the causality is more or less inverted: strong career
prospects attract highly skilled workers to a region, who create a market for the
amenities. Some economic geographers, most notably Michael Storper, argue for the
latter scenario, claiming that firms at the forefront of innovation-driven sectors are
able to secure excess profits, which attract highly skilled workers who are able to
create the next round of innovations (Storper, 2010).
2.2.2
The Importance of Social Networks in Cross-Regional
Economic Activity
It is increasingly understood that economic activity is embedded in a network of social
relationships (Granovetter, 1985). People and places occupying central positions in
this network may be able to reap rewards unavailable to those on the periphery
(Borondo, Borondo, Rodriguez-Sickert, & Hidalgo, 2014). The importance of social
networks to employment prospects in particular is evidenced by the proliferation of
LinkedIn and the popularity of happy hours at conferences. But it was studied more
extensively by Mark Granovetter, who found that among white collar workers a large
proportion come to their jobs via leads from their social networks (Granovetter, 1973).
Granovetter noted that most of these leads came from people the job seeker saw
relatively rarely. He suggests that these "weak ties" are extremely valuable, because
they connect people who for the most part occupy different social circles and are
therefore exposed to different information and different opportunities. An individual
with many weak ties will be able to draw on a much larger and more complete body
of awareness about what is going on in the world.
Inter-regional migrants are likely to have large numbers of these weak ties. In
many cases they will leave a dense network of family and friends behind when they
move, a group they feel a strong connection with but no longer see very frequently. In
addition, they will create a new circle of social ties in their new home. They are thus
very well poised to act as a bridge between their old friends and their new neighbors.
The economic impacts of such bridges on their home regions can be powerful. In
the international context, entrepreneurs returning from Silicon Valley were instrumental to the development of the Taiwanese tech industry, largely because they were
able to use their connections to and knowledge of Silicon Valley to identify areas
where Taiwan could play a complementary role (Saxenian & Sabel, 2008). A similar
process has played out in the emergence of Israel's tech industry (Senor & Singer,
2009). In fact, over the past two decades many countries have shifted from viewing
their expatriates as a national embarrassment to seeing them as an important economic resource that allows the country to tap foreign markets and investors (Gamlen,
2011; Kuznetsov, 2006).
Even in the domestic context, social ties are important to the development of businesses and industries. They are a primary mechanism by which innovations are diffused (Hagerstrand, 1966). Business owners tend to purchase from suppliers whom they
know, and investors will be predisposed towards cities they are familiar with (Pred,
1977). All of this means that regions with high numbers of informal social links to
other, prosperous regions, at what has been termed a low "social distance" (Andris,
2011), may be likely to prosper as well. Practitioners of economic development are increasingly less concerned with retaining residents than with attracting large volumes
of migrants from elsewhere, who bring with them new connections and ideas, keeping
the intellectual ecosystem of the city vital (Piiparinen & Russell, 2013).
The literature on human capital, economic development, migration, and social ties
suggests that migration flows may not be simply a result of economic conditions in the
affected regions. Rather, migration may also contribute to the economic development
of both sending and receiving regions, whether by supplementing the human capital
stock of the receiving region or by creating social links between the two.
2.3
Mapping and Analyzing the Urban System
A number of researchers have attempted to map the economic and social connections
among cities over the years. The current round of interest in "city networks" can
probably be dated to Saskia Sassen's book The Global City, published in 1991 (Sassen,
1991). Sassen noted that certain high-profile cities (most notably New York, London,
and Tokyo) were increasingly becoming the command and control centers for the world
economy, and that in many ways they had stronger connections to each other than
they did to the rest of their home countries. Attempts to delineate world cities
as proposed by Sassen and explore the connections between them have used the
location of advanced producer services firms (Taylor, 2001). These advanced producer
services (accounting, corporate law, finance) were argued by Sassen to be the primary
mechanisms by which world cities achieve their prominent role. By examining where
such firms locate and looking at co-location patterns of their offices, Taylor was able
to define a world-city system and articulate links between its component parts.
City systems did not begin with transnational producer services, however, and
urban system analysis did not begin with Sassen. In the United States, Allan Pred
created an extensive body of historical work on the formation of the US urban system
in the eighteenth and nineteenth centuries (Pred, 1977, 1973, 1980) and the spatial
diffusion of innovations (Pred, 1975, 1971). Other, recent efforts to map the connections between cities at both a domestic and international level have utilized data
from air traffic patterns (Guimera, Mossa, Turtschi, & Amaral, 2005; Neal, 2010),
telephone calls (Ratti et al., 2010; Calabrese et al., 2011), and the movement of
dollar bills (Thiemann, Theis, Grady, Brune, & Brockmann, 2010).
2.4
Themes and Direction
The economic literature on migration has shed a great deal of light over the years
on the determinants of migration: what types of people tend to migrate, what drives
them to do so, and how they choose where to move. The empirical literature thus far
has had less to say about the impacts of migration, especially at the regional level.
However, some glimmers of a theory about such impacts emerge from the literature on
human capital and economic growth, migration, and social networks that are worth
bearing in mind. It seems plausible that, rather than being solely an effect of economic
prosperity in a region, migration may contribute to it. The increase in human capital
from in-migration to a region may endogenously contribute to economic growth in
the future. Less directly, it is possible that the social ties migrants establish between
their old and new homes may improve the economic prospects of each. Finally, if
individuals migrate only when it increases their overall well being, as hypothesized
by the human capital investment model, then regions with high volumes of migration
churn may enjoy a net benefit independent of any absolute gains in population due
to improved matching between residents and opportunities.
Chapter 3
Data Source and Preparation
3.1
Data Source
My primary data source throughout this thesis is the IRS Statistics of Income Migration Data. This dataset is put together each year based on address changes reported
on individual tax returns filed with the IRS. Data are aggregated to the county level.
For each county, the dataset contains inflows and outflows to each other county in the
country measured in three ways: the number of returns, the number of exemptions (generally one exemption is claimed for each person in a filer's family), and the total
adjusted gross income. Data for a given pair of counties are suppressed if there are
fewer than 10 returns (Internal Revenue Service, 2014).
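The record layout described above can be sketched in code. The following is a minimal illustration in Python, not the actual IRS file format: the class name `CountyFlow`, its field names, and the sample values are hypothetical, and only the three measures and the 10-return suppression rule are taken from the text.

```python
from dataclasses import dataclass

# Pairs with fewer than this many returns are suppressed, per the text.
SUPPRESSION_THRESHOLD = 10


@dataclass(frozen=True)
class CountyFlow:
    """One directed county-to-county flow (hypothetical field names)."""
    origin: str        # origin county identifier
    destination: str   # destination county identifier
    returns: int       # number of tax returns
    exemptions: int    # rough proxy for the number of people
    agi: float         # total adjusted gross income (units assumed)


def is_published(flow: CountyFlow) -> bool:
    """True if the flow would survive the suppression rule."""
    return flow.returns >= SUPPRESSION_THRESHOLD


# Made-up sample values, for illustration only.
flows = [
    CountyFlow("A", "B", returns=120, exemptions=210, agi=5_400_000.0),
    CountyFlow("B", "A", returns=95, exemptions=160, agi=4_100_000.0),
    CountyFlow("A", "C", returns=8, exemptions=11, agi=900_000.0),
]

published = [f for f in flows if is_published(f)]
print(len(published))
```

In this sketch the flow from A to C falls below the threshold and is dropped, which is why small county pairs are invisible in the dataset even though they do exchange some migrants.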
This dataset is extremely powerful because it is not based on a sample: the entire
population of tax filers in the United States is used. That allows researchers to
examine the flows of migrants in much greater detail than is possible with datasets
based on surveys or samples of the population, which tend to have large margins
of error for small geographic samples (Isserman, Plane, & McMillen, 1982). For
instance, the American Community Survey 2006-2010 5-year estimates contain a table
of county-to-county migration flows, but the margins of error are generally larger in
magnitude than the estimates.
The construction of the dataset does present a few limitations, however. Most
notably, since it is constructed based on federal income tax returns, it excludes those
people who do not file income taxes. This means that these counts likely underrepresent the poor and the elderly, many of whom are not required to file federal income taxes.
Additionally, this data is limited to returns that have been filed by September of the
filing year. This captures roughly 95-98% of all returns, but late returns tend to be
complex and to report high income, meaning that this data may underrepresent the
very wealthy as well (Gross, 1999).
In order to examine the completeness of the dataset, I compare it to population
data from the 2010 US Census. For each county in the 2009-2010 IRS dataset I sum
the number of exemptions reported as non-migrants and the total in-migrants from
all other states and foreign countries. This gives the total number of exemptions for
taxes filed in the county in 2010, a rough approximation of the tax filing population
as of April 2010. I then calculate the ratio of this total exemption number to the
population in the 2010 census. A histogram of this ratio for the entire country is
shown in Figure 3-1. On average, counties have about 79% as many exemptions as
they do people. The overall ratio of exemptions to people for the entire country is
79% as well. It is fair to say, then, that my dataset for 2010 includes exact data for
roughly 79% of the country, but misses completely the other 21%, who are likely to
be on average poorer and/or more elderly than those present in my data.
Figure 3-1: Histogram of Exemptions Compared to Population, 2009-2010 (x-axis: Ratio of Exemptions to Population)
18
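The completeness check can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the thesis: the county names and all numbers below are invented, and the record layout (non-migrant exemptions, in-migrant exemptions, Census population) is an assumption. It shows the difference between the average of the county ratios and the pooled national ratio.

```python
# Hypothetical per-county records:
# (non-migrant exemptions, in-migrant exemptions, 2010 Census population).
# All values are invented for illustration; none come from IRS or Census files.
counties = {
    "County A": (70_000, 6_000, 100_000),
    "County B": (15_000, 1_500, 20_000),
    "County C": (380_000, 25_000, 500_000),
}


def exemption_ratio(non_migrants, in_migrants, population):
    """Exemptions filed in a county divided by its Census population."""
    return (non_migrants + in_migrants) / population


# Ratio for each county (the quantity shown in the histogram).
ratios = {name: exemption_ratio(*rec) for name, rec in counties.items()}

# Unweighted mean across counties (the "average county" figure).
mean_ratio = sum(ratios.values()) / len(ratios)

# Pooled ratio: total exemptions over total population (the national figure).
total_exemptions = sum(nm + im for nm, im, _ in counties.values())
total_population = sum(pop for _, _, pop in counties.values())
overall_ratio = total_exemptions / total_population
```

The two summary numbers need not agree in general, since the pooled ratio weights large counties more heavily.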
3.2
Data Preparation
In this analysis, I use data from 2009-2010. The IRS data counts the number of people
who changed their address between the previous year's and the current year's filings.
Here that means it generally captures people who moved between April 15, 2009 and
April 15, 2010. Note that taxes filed in a given year pertain to income earned the
year before, so income reported as moving may actually have been earned prior to
the move.
3.2.1
Metropolitan Areas
I conduct my analysis at the metropolitan level. The metropolitan area (a core
municipality and its surrounding suburbs) is generally considered the most logical
definition of a "city" for the purposes of economic analysis, since in many cases the
exact municipal boundaries are drawn for political or historical reasons that have
little connection to current economic or social processes.
In particular, I aggregate counties into Core Based Statistical Areas, as defined
by the Office of Management and Budget. CBSAs are collections of counties chosen
to consist of one or more urban cores with at least 10,000 people along with the
surrounding areas to which they are linked socioeconomically, as measured by
commuting patterns (Office of Management and Budget, 2013). As opposed to migration
between counties, where a person might move from the city to the suburbs but keep
the same job and social ties, migration between CBSAs can generally be thought of
as involving a substantial change in a person's life.
Since each county is assigned to at most one CBSA, it is fairly straightforward to
map each origin and destination county to its appropriate CBSA and then aggregate
the county-to-county flows into CBSA-to-CBSA flows. This results in a dataset with
one row for each pair of CBSAs containing the number of returns, exemptions, and
gross income flowing in each direction.
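The aggregation step described above can be sketched as follows. The county codes, CBSA names, and flow values are hypothetical, and dropping flows that stay within a single CBSA or touch a county outside any CBSA is a choice made for this sketch rather than something the text specifies.

```python
from collections import defaultdict

# Hypothetical county -> CBSA lookup (codes and names invented for illustration).
county_to_cbsa = {"001": "Metro X", "002": "Metro X", "003": "Metro Y"}

# Hypothetical county-to-county flows:
# (origin county, destination county, returns, exemptions, gross income).
county_flows = [
    ("001", "003", 40, 85, 2_100),
    ("002", "003", 25, 60, 1_300),
    ("003", "001", 30, 70, 1_900),
    ("001", "002", 90, 200, 5_000),  # stays inside Metro X
    ("001", "999", 12, 20, 400),     # county 999 belongs to no CBSA
]


def aggregate_to_cbsa(flows, lookup):
    """Sum county-level flows into directed CBSA-to-CBSA totals."""
    totals = defaultdict(lambda: [0, 0, 0])
    for origin, dest, returns, exemptions, income in flows:
        o, d = lookup.get(origin), lookup.get(dest)
        # Assumption of this sketch: skip unmapped counties and intra-CBSA moves.
        if o is None or d is None or o == d:
            continue
        totals[(o, d)][0] += returns
        totals[(o, d)][1] += exemptions
        totals[(o, d)][2] += income
    return {pair: tuple(vals) for pair, vals in totals.items()}


cbsa_flows = aggregate_to_cbsa(county_flows, county_to_cbsa)
```

Each key of `cbsa_flows` is a directed (origin CBSA, destination CBSA) pair, so the two directions of a city pair are kept as separate rows.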
CBSAs can be further divided into Metropolitan Statistical Areas (MSAs), which
have an urban core of at least 50,000 people, and Micropolitan Statistical Areas,
which have urban cores of between 10,000 and 50,000 people. For the rest of this
thesis, the terms CBSA, MSA, metro, and city will be used interchangeably.
3.3
Net, Gross, and Reciprocal Migration
The first quantity that generally comes to mind when considering migration flows to
and from a region is the net change. This is the bottom line (the overall impact of
migration on a region's population) and it is certainly an important quantity to know.
However, one of the benefits of the IRS dataset is that it allows us to go beyond the
net changes to examine the gross flows that underlie them, which often paint a much
richer picture.
Consider New York and Boston. As illustrated in Figure 3-2, from 2009 to 2010,
according to the IRS data, 7035 people moved from the Boston MSA to the New York
MSA, while 7641 people moved in the other direction. That means that there was
a net flow of 606 people from New York to Boston. But these 606 people represent
just 8% of the people moving from New York to Boston, and just 4% of the total
people moving between those two cities. Focusing solely on net change ignores the
other 96% of people moving back and forth.
Figure 3-2: Migration Between New York and Boston, 2009-2010 (diagram of the flows between Boston and New York, labeled Net Migration and Reciprocal Migration)
As another illustration, consider Boston's relationship with two other cities, Chicago
and Detroit. In 2010, Boston posted net gains of 211 people from Chicago and 243
from Detroit. But the net gain from Chicago was just 8% of the 2713 people moving
back and forth, while that from Detroit was a full 43% of the total migrants. Boston
clearly has a much stronger connection with Chicago than with Detroit, but focusing
solely on net migration misses that.
To fully access the information contained in migration data, it is necessary to move
beyond just net migration and examine the other components of the flow. In- and
out-migration are relatively straightforward: the total number of people moving in each
direction. Some authors are increasingly promoting the use of gross migration, the
total number of people moving in both directions. Because this thesis views migration
as a measure of exchange and connection between cities, I will make extensive use of
a measure I call "reciprocal migration." This is the number of people who switched
places between two cities in a given year: the number that moved in both directions.
It is the complement of net migration, tracking all of the population movement that
didn't contribute to a net population shift. Using reciprocal migration isolates the
exchange part of migration from the directional movement portion.
The relationships among these various types of migration can be understood
through the equation:
G = I + O = N + 2R
where G is gross migration, I is in-migration, O is out-migration, N is net
migration, and R is reciprocal migration.
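As a worked check of this identity, the sketch below applies it to the Boston and New York figures quoted earlier in this section. It assumes that net migration is taken as the absolute difference of the two directed flows and that reciprocal migration is the smaller of the two; those formulas are this sketch's reading of the definitions, and the function name is illustrative.

```python
def decompose(in_migrants, out_migrants):
    """Split a pair of directed flows into gross, net, and reciprocal migration."""
    gross = in_migrants + out_migrants           # G = I + O
    net = abs(in_migrants - out_migrants)        # N, assumed to be |I - O|
    reciprocal = min(in_migrants, out_migrants)  # R, assumed to be min(I, O)
    # The identity G = N + 2R holds for any pair of non-negative flows.
    assert gross == net + 2 * reciprocal
    return gross, net, reciprocal


# Boston's exchange with New York, 2009-2010, from the text:
# 7641 people moved from New York to Boston and 7035 moved the other way.
gross, net, reciprocal = decompose(in_migrants=7641, out_migrants=7035)

# Net migration as a share of the larger flow and of the gross flow.
net_share_of_inflow = net / 7641
net_share_of_gross = net / gross
```

Under these assumptions the net flow is 606 people, which is about 8% of the New York to Boston flow and about 4% of the gross flow, matching the percentages given in the text.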
Chapter 4
Observed and Modeled Migration Patterns
4.1
Metropolitan Migration Rates
In 2009-2010, 9,653,424 people (as approximated by tax exemptions in the IRS data)
moved across county boundaries in the United States. 5,280,236 of these moved to
another MSA.
Table 4.1 shows the top ten metro areas by number of migrants received in
2009-2010. For each metro, it lists the population of the MSA, the number of migrants
who entered, the number of migrants who left, and the net population change due to
migration. This table brings home the distinction between net and gross migration.
New York and Los Angeles received more migrants than any other cities in the country
during 2009-2010. However, their net population change due to domestic migration
was negative because even larger numbers of people left. On the other hand, cities
like Riverside, Washington DC, and Houston all posted substantial population gains
due to migration despite smaller overall flows than LA or New York. Miami and
San Diego form a third category, with hardly any net migration at all despite seeing
almost 100,000 people move in and out.
In addition, Table 4.1 shows that the net flows are quite small compared to the
total number of people moving. Los Angeles had the largest net outflow in the country,
and Riverside had the largest net inflow. In both of these cases the net change is less
than a quarter of the total flow in that direction, and that proportion drops relatively
quickly. On the whole, net population gains and losses due to migration amount to
just 7.1% of the total flows into and out of cities, and the average pairwise flow has
a net population shift accounting for just 16% of its total flux.
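The "less than a quarter" claim for Los Angeles and Riverside can be checked directly from the in- and out-migrant counts in Table 4.1. The helper name below is illustrative, and comparing the net change to the larger of the two directed flows is this sketch's reading of "the total flow in that direction".

```python
# (in-migrants, out-migrants) for 2009-2010, taken from Table 4.1.
metros = {
    "Los Angeles, CA": (194_069, 258_610),
    "Riverside, CA": (152_943, 117_577),
}


def net_share_of_dominant_flow(in_migrants, out_migrants):
    """Net change as a fraction of the flow in the direction of the net change."""
    net = in_migrants - out_migrants
    dominant = max(in_migrants, out_migrants)
    return abs(net) / dominant


shares = {name: net_share_of_dominant_flow(*flows) for name, flows in metros.items()}
```

Both shares come out just under one quarter, consistent with the statement in the text.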
Focusing on population exchange, Figure 4-1 maps the total reciprocal migration
by metro. In this map as well as the many to follow, the circles representing MSAs
are sized according to population. They are colored based on the variable indicated,
with blues indicating low values and reds indicating high ones. Here, we see that the
metros with the largest overall numbers of reciprocal migrants are Los Angeles, New
York City, and Riverside. New York and LA are not surprising considering that they
are the two largest cities in the country. Riverside is less expected, but over half of
its migrants are exchanged with Los Angeles. The other cities with high levels of
reciprocal migration are Washington DC, Dallas, San Francisco, Miami, San Diego,
Chicago, and Phoenix.
Table 4.1: Top 10 Migrant-Receiving Metros, 2009-2010
MSA                 Population   In-migrants   Out-migrants   Net Migration
Los Angeles, CA     12,828,837       194,069        258,610         -64,541
New York, NY        18,897,109       155,501        214,875         -59,374
Riverside, CA        4,224,851       152,943        117,577          35,366
Washington, DC       5,582,170       124,960        102,225          22,735
Dallas, TX           6,371,773       121,958        103,831          18,127
Houston, TX          5,946,800       103,380         80,884          22,496
Phoenix, AZ          4,192,887        98,921         93,317           5,604
Miami, FL            5,564,635        98,188         99,582          -1,394
San Diego, CA        3,095,313        95,048         94,514             534
San Francisco, CA    4,335,391        94,551        101,321          -6,770
In addition to looking at the total numbers of people moving between cities, it is
instructive to consider the rate of migration relative to population. One can think of
a city's migration "churn rate" as the percentage of its population that migrates in
and out over the course of a given year, independent of any net changes in population.
Churn rate has no direct effect on the population size of a city, but places with high
churn rates will see large portions of their populations cycle in and out each year,
perhaps indicating greater dynamism.
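The text does not give an explicit formula for the churn rate. One plausible reading, used in the sketch below purely as an assumption, is reciprocal migration (the smaller of the in- and out-flows) divided by population. The inputs are the San Diego and Miami rows of Table 4.1.

```python
def churn_rate(in_migrants, out_migrants, population):
    """Assumed definition: reciprocal migration divided by population."""
    return min(in_migrants, out_migrants) / population


# In-migrants, out-migrants, and population from Table 4.1.
san_diego = churn_rate(95_048, 94_514, 3_095_313)
miami = churn_rate(98_188, 99_582, 5_564_635)
```

Under this assumed definition, two metros with nearly identical flows can have quite different churn rates because the flows are scaled by very different populations.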
Figure 4-2 maps the churn rates of metros. Two things stand out. The highest
churn rates are found in relatively small metro areas that are distributed seemingly at
random around the country. However, on closer inspection these appear to be metros
that contain military bases. Many of these metros are also among the cities whose
migrants travel the farthest average distances. This makes sense: there are a lot of
members of the armed forces, and they tend to dominate the rural areas surrounding
major bases. They are also likely to move frequently due to changing deployments
or discharge from service, and they often have little prior relationship with their
post. Because these movements are not strictly economic decisions (army staff choose
where to send people based on military needs, and wages are not tied to being in
a particular location), they may have different economic impacts than economically
motivated migration. However, if soldiers engage with the local communities they
may still form social ties that they maintain after their period of service.
Beyond the very high churn rates surrounding military facilities, some regions
of the country appear to have generally higher churn rates than others. The west,
especially the desert southwest, and Florida stand out for relatively high churn rates
across a number of metro areas. The northeast and Midwest appear to have lower
churn rates on the whole. Figure 4-3 displays box plots of the distribution of churn
rates by Census region. The median western metro has a churn rate more than 50%
higher than the median metro in the West North Central region (the western Midwest
and northern great plains).
Figure 4-1: Reciprocal Migration, 2009-2010
Figure 4-2: Migration Churn Rate, 2009-2010
Figure 4-3: Migration Churn Rate by Census Region, 2009-2010 (box plots for Mountain, Pacific, South Atlantic, WS Central, New England, EN Central, ES Central, Mid Atlantic, and WN Central)
4.2
Individual Migration Flows
Narrowing the focus of analysis, Table 4.2 shows the largest individual flows between
pairs of cities. This list is dominated by pairs of large MSAs adjacent to each other,
and particularly by the flows between Los Angeles and Riverside, California (the
number of people moving between L.A. and Riverside is greater than the rest of the
top ten flows combined). In many of these cases, it can be debated whether the two
MSAs are truly distinct economic units. In fact, the OMB releases a list of "Combined
Statistical Areas" composed of groups of adjacent MSAs that have particularly strong
mutual ties. Of the top 10 flows, only those from New York to Miami and Philadelphia
cross CSA boundaries.
This raises an interesting point. It is impressive that 90,000 people moved from
Los Angeles to Riverside in one year, but it is not necessarily surprising that large
numbers of people move between adjacent large cities. Figuring out whether the
LA-Riverside migration is notable requires a method to determine which flows are
substantially larger than would be expected a priori, and thus suggestive of unusually
strong economic or cultural ties between their sending and receiving cities, and which
can be explained due to geography alone. Identifying meaningful migration links
requires a "null hypothesis" for comparison.
How to formulate this null hypothesis is a difficult question. It doesn't make
sense to assume that migrants have an equal propensity to migrate between all pairs
of cities, but any departure from this should have a theoretical foundation. Further,
determining which factors explain two cities' inherent likelihood of exchanging
migrants and which ones are additional explanatory variables that signify a special
relationship between the cities is something of a judgment call. If there is an
interstate highway directly connecting two MSAs, should that go into the model as an
a priori predictor of increased migration between them? What about if they are in
the same state, and thus share many of the same political and economic institutions?
It is conventional wisdom that there has been a net migration of people to warmer
climates over the past several decades. Should temperature be considered an inherent
factor in predicting migration relationships?
Table 4.2: Top 10 Migration Flows, 2009-2010
Origin              Destination         Migrants   Distance (km)
Los Angeles, CA     Riverside, CA         93,807             194
Riverside, CA       Los Angeles, CA       55,729             194
San Jose, CA        San Francisco, CA     20,014             125
Washington, DC      Baltimore, MD         18,732              89
San Francisco, CA   San Jose, CA          18,120             125
Baltimore, MD       Washington, DC        16,936              89
New York, NY        Philadelphia, PA      16,870             156
San Diego, CA       Riverside, CA         16,179             179
New York, NY        Miami, FL             15,374           1,743
Philadelphia, PA    New York, NY          13,964             156
Ultimately, I include population and distance as the exogenous factors that should
be expected to influence migration patterns independent of any special relationships
between cities. The population of a sending city determines the number of potential
migrants, while that of a receiving city is a reasonable proxy for its overall
attractiveness as a migration destination (the likelihood of having a relative or a job
opportunity there, for example). Following Tobler's first law of geography, all else
equal we would expect a city to have less interaction with distant metros than with
nearby ones.
4.3
The Radiation Model
Even with only two predictor variables there are a number of options for the functional
form of the "null hypothesis" model. Traditionally geographers have been fond of
gravity models, which take inspiration from the equations governing the gravitational
forces that all physical bodies impose on each other. The amount of interaction
between two cities is hypothesized to be proportional to the product of those cities'
populations divided by the distance between them raised to some exponent. Theory
is agnostic on what that exponent should be (in the case of physical gravity, it is
equal to two, but that exponent has not been universally observed among geographic
phenomena), and in practice it is often selected to fit the data.
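For concreteness, the gravity form described above can be written as a short function. The names `beta` (the distance exponent) and `k` (a constant of proportionality) are labels chosen for this sketch; the text does not name them, and both would ordinarily be fitted to data.

```python
def gravity_flow(pop_i, pop_j, distance_km, beta=2.0, k=1.0):
    """Gravity-model interaction: k * pop_i * pop_j / distance ** beta.

    beta and k are illustrative free parameters, not values from the thesis.
    """
    if distance_km <= 0:
        raise ValueError("distance must be positive")
    return k * pop_i * pop_j / distance_km ** beta


# With beta = 2, doubling the distance cuts the predicted interaction to a quarter.
near = gravity_flow(100, 200, 10)
far = gravity_flow(100, 200, 20)
```

Note that the function depends only on the two populations and the distance between them, which is the limitation the following paragraphs discuss.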
The gravity model has a number of appealing properties and it is in wide use
today. However, one weakness of it is that while it incorporates information on the
population of the origin and destination, as well as the distance between them, it
does not include any information on what exists in the space between the cities.
The intervening features of a landscape can have a strong impact on the amount of
interaction between two places, independent of distance. In Montana, high school
sports teams will travel hundreds of miles for a regular season game, because there
are so few towns in the intervening area. On the east coast that type of interaction
rarely involves a trip of more than a dozen miles, because there are so many nearby
alternatives.
One recent attempt to incorporate the population of the area between two cities,
and to remove the free parameter that is the exponent in the gravity model, has been
termed the "radiation model" (Simini, González, Maritan, & Barabási, 2012). Rather
than using the analogy of gravitational forces, the radiation model treats migrants
(or commuters, or freight flows) as if they are physical particles. The probability of
a given particle colliding with another at a distance d is equal to the probability of it
hitting the latter while not having collided with any particle in the intervening space.
In the case of migration, we can imagine that a person will move if she becomes
aware of an opportunity that is sufficiently superior to her current living situation
in terms of employment options, access to friends or relatives, access to amenities,
or other factors. Two assumptions about these opportunities underlie the radiation
model. First, it is assumed that the likelihood of there being such an opportunity
in a given city is proportional to its population.
Most of the factors that draw
people to migrate are in fact correlated with population, so this is a fairly reasonable
assumption. However, it does assume that the types of opportunities are distributed
uniformly across the country: if Chicago has twice the population of Minneapolis,
then it has twice the number of opportunities, period, and there's no reason why it
might be more or less attractive to certain subsets of the population. This is almost
certainly false, but it creates an effective null hypothesis. Second, it is assumed that
people will tend to move to opportunities closest to their current locations. This
could be because they are more likely to learn of opportunities that are nearby, or
because longer moves are more costly. The result of these two assumptions is the
assertion that people will only move if they find a sufficiently superior opportunity,
and that they will move to the closest such opportunity that they find.
To estimate the overall flow of migrants from one city to another, this decision
process is multiplied by the total number of people in the city of origin. The estimated
flow is thus directly proportional to the population of both cities: the city of origin
because its population is the pool of potential migrants, and the destination city
because the presence of attractive opportunities is correlated with population. The
estimated flow is inversely correlated with the total population of all cities that are
closer to the origin than is this particular destination, because people will only move
to the destination if they have not yet found a sufficiently good opportunity at a
closer location.
This last assumption is not entirely counterintuitive, but it does seem a little less
straightforward than the rest of the assumptions underlying the model. People learn
about opportunities in all sorts of ways, and if they are already going to be switching
jobs they may not care how far they move. Still, if we limit ourselves to purely a
priori thinking based only on population and distance it seems reasonable to think
that people will more often tend to move to nearby places than to farther ones. The
radiation model was originally developed for commuting flows, and in that case the
assumption is more innocuous: since people don't like to commute, they will tend to
take the closest job to their house that is a sufficiently good match for their interests
and desired compensation.
Ultimately, P_ij, the fraction of all migrants leaving city i who end up in city j, is
predicted to be:
\[ P_{ij} = \frac{p_i \, p_j}{(p_i + s_{ij})(p_i + p_j + s_{ij})} \]
where p_k is the population of city k and s_ij is the total population of cities that
are closer to city i than city j is. This ultimate expression is somewhat similar in
form to that of the gravity model, but with s_ij replacing distance in the denominator
(along with p_i and p_j). Note that unlike the gravity model, the radiation model is not
symmetric: the predicted flow from city i to city j will not generally be equal to that
from city j to city i. This is because while the cities and the distance d_ij between
them are constant, a circle of radius d_ij centered at city i will contain different cities
with a different total population than one centered on city j. So if city i is in a
peripheral location it is likely to send a higher proportion of its total migrants to
centrally located city j than city j will send back. However, it is possible that city j
may send more total migrants, in which case the numerical flows may be similar.
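The asymmetry noted above is easy to see numerically. The sketch below implements the standard radiation-model share from Simini et al. (2012), which is assumed here to be the expression the thesis uses. The populations and intervening totals are invented, and the function name is illustrative.

```python
def radiation_share(p_i, p_j, s_ij):
    """Fraction of city i's out-migrants predicted to go to city j.

    p_i, p_j: populations of origin and destination.
    s_ij: total population closer to i than j is (excluding i and j).
    """
    return (p_i * p_j) / ((p_i + s_ij) * (p_i + p_j + s_ij))


# Hypothetical peripheral city i (small, nothing nearby) and central city j
# (large, with many other places closer to it than i is).
share_i_to_j = radiation_share(p_i=100_000, p_j=1_000_000, s_ij=0)
share_j_to_i = radiation_share(p_i=1_000_000, p_j=100_000, s_ij=2_000_000)

# Holding populations fixed, more intervening population always lowers the share.
share_crowded = radiation_share(p_i=100_000, p_j=1_000_000, s_ij=500_000)
```

In this invented example the peripheral city sends most of its migrants to the central city, while the central city sends only about one percent of its migrants back.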
4.3.1
Implementation of the Radiation Model
Although this study is primarily concerned with migration among MSAs, I include
nonmetropolitan counties in this calculation to fully account for the intervening
opportunities that are central to the radiation model. To implement the radiation model
I first find the geographic centroid for each MSA and nonmetropolitan county. Note
that in some cases the MSA centroid may be a somewhat imprecise representation
of a metro's center of population. Many MSAs, especially in the western US,
contain geographically large counties that are sparsely populated. For example, San
Bernardino County, CA, part of the Riverside MSA, is larger than Vermont and New
Hampshire combined, but almost all of its population is concentrated in its
southwestern corner. The centroid of the Riverside MSA is almost certainly substantially
north and east of the vast majority of its population.
Next I compute the great circle distance between each pair of centroids. I then
calculate the total population within that distance of each centroid based on the 2010
Census. That information is sufficient to calculate P_ij as described above. I then
multiply P_ij by the total number of migrants originating at city i in the 2009-2010
migration data.
I compute the radiation model's predictions for 2009-2010 migration flows.
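The distance and intervening-population steps can be sketched as below. This is an illustration under stated assumptions rather than the thesis's implementation: it uses the haversine formula with a mean Earth radius of 6371 km, treats places at exactly the same distance as the destination as not intervening, and runs on four invented places laid out along the equator.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumption of this sketch


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def intervening_population(origin, dest, places):
    """s_ij: population of places strictly closer to origin than dest is."""
    lat_o, lon_o, _ = places[origin]
    d_ij = great_circle_km(lat_o, lon_o, *places[dest][:2])
    return sum(
        pop
        for name, (lat, lon, pop) in places.items()
        if name not in (origin, dest)
        and great_circle_km(lat_o, lon_o, lat, lon) < d_ij
    )


# Hypothetical centroids: name -> (latitude, longitude, population).
places = {
    "A": (0.0, 0.0, 100),
    "B": (0.0, 1.0, 50),
    "C": (0.0, 2.0, 300),
    "D": (0.0, -3.0, 80),
}

s_ac = intervening_population("A", "C", places)  # only B lies closer to A than C
s_ad = intervening_population("A", "D", places)  # B and C both lie closer than D
```

With s_ij in hand for every ordered pair, the predicted flow from i to j is the radiation share for that pair multiplied by the observed number of out-migrants from i.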
Table 4.3: Largest Positive Radiation Model Residuals
Origin            Destination         Predicted   Actual   Residual   Distance (km)
Los Angeles, CA   Riverside, CA          35,087   93,807     58,720             194
New York, NY      Miami, FL                 650   15,374     14,724           1,743
Riverside, CA     Los Angeles, CA        41,942   55,729     13,787             194
San Diego, CA     Riverside, CA           3,797   16,179     12,382             179
Miami, FL         New York, NY              523   11,134     10,611           1,743
New York, NY      Los Angeles, CA           639    7,883      7,244           3,943
San Jose, CA      San Francisco, CA      13,559   20,014      6,455             125
New York, NY      Atlanta, GA             1,150    7,352      6,202           1,222
Los Angeles, CA   New York, NY              785    6,928      6,143           3,943
New York, NY      Orlando, FL               320    5,995      5,675           1,534
4.4
Radiation Model Results
Figure 4-4 compares the predicted results of the radiation model with those observed.
To measure goodness of fit I use the "common part" statistic employed by Lenormand
et al. (Lenormand, Huet, Gargiulo, & Deffuant, 2012). The statistic is computed as:
\[ CPC = \frac{2 \sum_{i=1}^{n} \sum_{j=1}^{n} \min(M_{ij}, \tilde{M}_{ij})}{\sum_{i=1}^{n} \sum_{j=1}^{n} M_{ij} + \sum_{i=1}^{n} \sum_{j=1}^{n} \tilde{M}_{ij}} \]
where n is the number of MSAs, M_{ij} is the observed migration flow from location i
to location j, and \tilde{M}_{ij} is the predicted migration flow from location i to location j.
When the model and the data have the same total number of migrants, as they do
here, the statistic measures the fraction of those that are correctly classified. The
common part statistic for the radiation model on 2009-2010 migration data is 0.527,
indicating that the radiation model is able to correctly classify about half of the
observed migration flows. This is a reasonably good fit, considering that the radiation
model is parameterless (it is based purely on theory and is not tweaked to reflect the
observed data) and that it uses only two variables to predict a noisy and idiosyncratic
process. The common part statistic of 0.527 is in line with values found by Lenormand
et al. in their evaluation of the radiation model across a number of commuting
datasets.
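A minimal sketch of the common part statistic follows, assuming the usual form from Lenormand et al.: twice the sum of the pairwise minima divided by the sum of both totals. The flows in the example are invented, and storing them as dictionaries keyed by (origin, destination) is a choice made for this sketch.

```python
def common_part(observed, predicted):
    """Common part statistic for two sets of directed flows.

    observed, predicted: dicts mapping (origin, destination) -> flow.
    Pairs missing from one dict are treated as zero flow.
    """
    pairs = set(observed) | set(predicted)
    shared = sum(min(observed.get(p, 0), predicted.get(p, 0)) for p in pairs)
    total = sum(observed.values()) + sum(predicted.values())
    return 2 * shared / total


# Invented example: both sets contain 100 migrants, 80 of whom are placed on
# the same origin-destination pair by the model and the data.
observed = {("A", "B"): 60, ("A", "C"): 40}
predicted = {("A", "B"): 80, ("A", "C"): 20}
cpc = common_part(observed, predicted)
```

The statistic equals 1 when the two sets of flows are identical and 0 when they share no migrants on any pair.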
More interesting than the model's overall fit are the specific instances in which it
over- and under-predicts the flows of people. These represent connections between
cities that are either stronger or less strong than would be expected based purely on
the physical distribution of people across the country. Tables 4.3 and 4.4 show the
top ten positive and negative residuals from the radiation model.
Figure 4-4: Actual vs. Predicted Migration Flows, 2009-2010 (x-axis: Actual Value)
Table 4.4: Largest Negative Radiation Model Residuals
Origin                 Destination        Predicted   Actual   Residual   Distance (km)
San Diego, CA          Los Angeles, CA       71,669   12,652    -59,017             178
Riverside, CA          San Diego, CA         49,841   12,060    -37,781             179
Las Vegas, NV          Riverside, CA         35,194    3,289    -31,905             212
Los Angeles, CA        San Diego, CA         38,322   13,675    -24,647             178
Phoenix, AZ            Riverside, CA         23,550    2,689    -20,861             405
Colorado Springs, CO   Denver, CO            20,959    3,528    -17,431              69
New York, NY           Philadelphia, PA      32,938   16,870    -16,068             156
Washington, DC         Baltimore, MD         33,121   18,732    -14,389              89
Phoenix, AZ            Tucson, AZ            18,019    4,050    -13,969             121
Tucson, AZ             Phoenix, AZ           17,569    5,654    -11,915             121
Table 4.3 shows the largest positive residuals from the model. These are flows
that are much larger (hence representing stronger connections between their start and
end points) than are predicted by the radiation model. There are two main types of
flow represented in this table. The first are short-distance flows, generally less than
200 kilometers, meaning that they are between metros that are essentially adjacent.
These flows are generally predicted to be quite large, but the observed numbers are
even greater. Los Angeles was predicted to send 35,000 people to Riverside, but it
actually sent over 90,000.
The second type of positive residual is perhaps more interesting from the perspective
of determining the urban structure of the United States. It consists of long-distance
flows, generally between major cities, that are predicted to be small but are
in fact quite substantial. The strongest of these link New York City to Miami and Los
Angeles, though also in the top ten are flows from New York to Atlanta and Orlando.
Outside of the top ten list some of the highest long-distance residuals are found on
the flows from Dallas to Atlanta, New York to Tampa, Atlanta to Miami, New York
to San Francisco, and Chicago to Phoenix and Los Angeles.
The negative residuals shown in Table 4.4 are somewhat more uniform in nature.
They tend to be the inverse of the first type of flow represented among the positive residuals: MSAs that are close together and that do exchange large numbers of
migrants, just not as many as the radiation model would predict. The lack of large
negative residuals at long distances makes sense given the radiation model's preference
for nearby opportunities: it is not going to predict large long-distance movements if
there are closer opportunities.
There are a few flows in Table 4.4 (most notably those from Las Vegas and Phoenix
to Riverside) where the metros in question are not actually adjacent but simply have
very few intervening opportunities (the primary one in these two cases being the
Mojave Desert). These two have the smallest observed flows as a percentage of
those predicted: roughly 9% and 11%, respectively. They may be cases where distance or
physical barriers act as an impediment to migration even in the lack of intervening
opportunities. This phenomenon is illustrated even more clearly in the flow from
Honolulu to San Francisco, which is predicted to be 8817 people but in reality is just
758.
Possibly the most surprising residual in Table 4.4 is that between Denver and Colorado Springs. These are relatively large cities in the same state separated only by
seventy kilometers of fertile plains, but Colorado Springs sends an order of magnitude
fewer migrants to Denver than the radiation model would predict. This could potentially be due to cultural differences between the two cities: Colorado Springs is known
as a relatively conservative city, while Denver is generally seen as more progressive.
The largest positive and negative residuals involve the three major MSAs of southern
California: Los Angeles, Riverside, and San Diego. The extreme size of these
residuals is likely due in part to the quirk in how MSA centroids are calculated
described above. Because the MSA centroids are calculated using the full area of all
counties in the MSA, the model treats San Diego as being closer to Los Angeles than
to Riverside, and closer to Riverside than Riverside is to LA. It is difficult to precisely
define the distance between two cities, but by almost any measure this is incorrect:
Riverside and Los Angeles form one continuous urbanized area, while San Diego is
a hundred miles to the south. The driving distances between the downtowns of the
central cities of these MSAs are 54 miles from Los Angeles to Riverside, 98 miles from
Riverside to San Diego, and 120 miles from San Diego to Los Angeles.
This inaccuracy likely has a dramatic impact on the predictions, because it casts
San Diego as an intervening opportunity between Los Angeles and Riverside, and Los
Angeles as one between Riverside and San Diego. Since these metros are each other's
nearest neighbors, this is the difference between having s_ij be zero and having it be
12 million. Consider that the flow from San Diego to Riverside was predicted to be
just 3,797, while that from San Diego to Los Angeles (just one kilometer closer, and
with roughly three times the population) was predicted to be 71,669.
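To give a feel for how much this matters, the sketch below evaluates the standard radiation-model share for San Diego to Riverside twice, once with no intervening population and once with Los Angeles counted as intervening. It uses only the MSA populations from Table 4.1 and ignores every other place in the country, so the shares themselves are illustrative and are not the thesis's predictions; only the size of the drop is the point.

```python
def radiation_share(p_i, p_j, s_ij):
    """Radiation-model share of i's out-migrants going to j (Simini et al. form)."""
    return (p_i * p_j) / ((p_i + s_ij) * (p_i + p_j + s_ij))


# 2010 populations from Table 4.1.
SAN_DIEGO = 3_095_313
RIVERSIDE = 4_224_851
LOS_ANGELES = 12_828_837

# Riverside treated as San Diego's nearest neighbor: nothing intervenes.
share_without_la = radiation_share(SAN_DIEGO, RIVERSIDE, s_ij=0)

# Los Angeles treated as closer to San Diego than Riverside is.
share_with_la = radiation_share(SAN_DIEGO, RIVERSIDE, s_ij=LOS_ANGELES)

drop_factor = share_without_la / share_with_la
```

In this stripped-down setting, placing Los Angeles one kilometer inside the radius cuts the predicted share by more than an order of magnitude.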
4.5
Patterns of Residuals
Figure 4-5 further investigates the relationship between extreme residuals and
distance. It shows box and whisker plots summarizing the distribution of residuals for
different distance classes. Each distance class is labeled with its upper limit, so the
boxplot labeled "250" represents distances between zero and 250 km, that labeled
"500" represents distances between 250 and 500 km, and so on. Plot A shows these
plots for all residuals. While the vast majority of residuals in all distance classes are
clustered around zero, the extreme residuals are almost entirely positive at distances
greater than 1000 kilometers. Plot B further examines this trend by showing only
residuals with an absolute value over 500. These instances represent just 3.8% of the
migration flows, but account for 55% of the total error associated with the model. At
distances under 250 kilometers the median such residual is negative, almost -1000.
For distances between 250 and 500 kilometers the median residual is positive but
the interquartile range still extends to roughly -1000. At greater distances, though,
there are very few large negative residuals, and there are none at all for distances
between 1000 and 2250 kilometers. This suggests that there are substantial numbers
of large, long-distance migration flows, which are not incorporated into the radiation
model's framework. Large residuals generated over distances greater than 250
kilometers account for roughly 18% of the total error.
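The distance classes and the share of error attributable to large residuals can be sketched as follows. The 250 km class width and the 500-person threshold come from the text. Measuring "total error" as the sum of absolute residuals is an assumption of this sketch, and the sample rows are a handful of (distance, residual) pairs taken from Tables 4.3 and 4.4.

```python
import math
from collections import defaultdict


def distance_class(distance_km, width=250):
    """Label a distance by the upper limit of its class (250, 500, 750, ...)."""
    return max(1, math.ceil(distance_km / width)) * width


def large_residual_error_share(residuals, threshold=500):
    """Share of summed absolute error due to residuals above the threshold.

    Assumes total error is the sum of absolute residuals.
    """
    total = sum(abs(r) for r in residuals)
    large = sum(abs(r) for r in residuals if abs(r) > threshold)
    return large / total


# (distance in km, residual) pairs from Tables 4.3 and 4.4.
rows = [(194, 58_720), (1_743, 14_724), (178, -59_017), (212, -31_905), (3_943, 7_244)]

by_class = defaultdict(list)
for distance, residual in rows:
    by_class[distance_class(distance)].append(residual)
```

Each list in `by_class` holds the residuals that would feed one box in the plot, keyed by the class's upper limit.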
Residuals can also be grouped by metro area rather than by distance. Because the
radiation model directly incorporates the total number of out-migrants into its
predictions, analysis of residuals grouped by the source metro is not useful. But
grouping residuals by destination MSA provides a portrait of which MSAs are more
attractive than geography would predict. These are listed in Table 4.5, and the
results are striking: some cities attract far more migrants than the population
distribution alone can account for. Dallas received almost four times as many
migrants in 2009-2010 as the radiation model predicted, and many large cities
received more than double the predicted number. These cities may be places whose
unique characteristics make them stand out to migrants, offering something that
cannot be replicated elsewhere. Figure 4-6 maps the aggregated residuals for the
whole country. Besides the major MSAs in Table 4.5, there are large residuals found
in several midsize cities in the southeast. The aggregate residuals for Los Angeles,
San Diego, and San Francisco are low compared to most other cities their size. Among
large cities Philadelphia stands out as one of the few where in-migration was
substantially less
Figure 4-5: Residuals versus Distance. Panel A: All Residuals by Distance. Panel B:
Residuals >500 by Distance. Horizontal axis in both panels: Distance Between Origin
and Destination (km).
Table 4.5: Top Aggregate Residuals by Destination MSA
MSA               Actual Flow   Predicted Flow   Residual
Dallas, TX            121,958           33,263     88,695
New York, NY          155,501           71,915     83,586
Washington, DC        124,960           46,983     77,977
Miami, FL              98,188           22,494     75,694
Atlanta, GA            90,857           22,565     68,292
Houston, TX           103,380           38,330     65,050
Phoenix, AZ            98,921           44,619     54,302
Chicago, IL            87,986           35,663     52,323
Riverside, CA         152,943          104,963     47,980
Seattle, WA            75,297           31,138     44,159
than the radiation model predicts.
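As a quick check on the figures in Table 4.5, the residual and the ratio of actual to predicted in-migration can be recomputed directly. The sketch below uses three rows copied from the table; the helper name `summarize` is purely illustrative.

```python
# (actual flow, predicted flow) for three rows of Table 4.5
table_4_5 = {
    "Dallas, TX": (121_958, 33_263),
    "New York, NY": (155_501, 71_915),
    "Riverside, CA": (152_943, 104_963),
}

def summarize(actual, predicted):
    """Return (residual, ratio of actual to predicted in-migration)."""
    return actual - predicted, actual / predicted

summary = {msa: summarize(a, p) for msa, (a, p) in table_4_5.items()}
for msa, (residual, ratio) in summary.items():
    print(f"{msa:15s} residual={residual:>7,d} ratio={ratio:.2f}")
```

The ratio makes the contrast between rows explicit: Riverside has a large residual in absolute terms but a much smaller one relative to its predicted flow than Dallas does.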
Taken together, the analysis of residuals by distance and by destination MSA
suggests that there are major structural features of the US domestic migration system
that are incompatible with the radiation model. This is not an indictment of the
model: as stated above, the purpose of using the radiation model here is to create a
null hypothesis of what migration would look like if it were influenced only by the
spatial distribution of population. The observed migration patterns differ from that
null hypothesis, so other features of the urban economic and social system must play
a role. Determining the exact nature of these features is beyond the scope of this
thesis. Rather, I turn now to simply examining the structure of these linkages in
greater detail.
Figure 4-6: Aggregate Residuals by Destination MSA
Chapter 5
Centrality Analysis of the Migration Network
Thus far this thesis has documented and analyzed the patterns of individual migration
flows between cities, and determined that they conform to a structure more
complex than can be explained by the geographic distribution of population alone.
Going forward I shift to an analysis of this structure, remaining agnostic on what
factors might influence the pattern of flows that is observed and instead examining
the network of ties that they create. In this section, I use measures of centrality from
network science to explore the relative positions of various cities within the domestic
migration network. Central cities will tend to have stronger ties to a more diverse set
of MSAs. Because the emphasis here is on the strength of connection, and for ease of
computation of certain metrics, I use reciprocal migration as my primary variable of
interest.
5.1 Degree Centrality
There are a number of measures of centrality commonly used in network science.
The most straightforward of these is degree centrality. The degree of a node is
defined as the number of connections it has to other nodes, and a central node is one
that has strong direct connections to many other nodes. Degree can be measured
without weighting, simply counting the total number of other nodes a given node is
connected to, or the measure can be weighted by the strength of the connection. In
the case of migration, unweighted degree measures the number of cities that a given
metro exchanged migrants with, while weighted degree measures the total number of
reciprocal migrants that it had.
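The two versions of degree can be illustrated with a small sketch. The edge list below is invented, and the function name `degrees` is hypothetical; each edge is assumed to carry the number of reciprocal migrants between a pair of metros.

```python
from collections import defaultdict

def degrees(edges):
    """edges: (metro_a, metro_b, reciprocal_migrants), one entry per pair.

    Returns (unweighted, weighted) degree dictionaries keyed by metro.
    """
    unweighted = defaultdict(int)
    weighted = defaultdict(int)
    for a, b, migrants in edges:
        if migrants <= 0:
            continue  # no exchange means no tie
        for metro in (a, b):
            unweighted[metro] += 1       # one more partner metro
            weighted[metro] += migrants  # total reciprocal migrants
    return dict(unweighted), dict(weighted)

# Hypothetical reciprocal flows between four metros.
edges = [("A", "B", 500), ("A", "C", 1), ("A", "D", 1), ("B", "C", 20)]
unweighted, weighted = degrees(edges)
print(unweighted)
print(weighted)
```

In this toy network metro A has the highest unweighted degree, but two of its three ties carry a single migrant, while metro B has fewer partners and a larger weighted degree.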
Both of these measures are useful. The unweighted measure gets at the sheer
geographic reach of a city, while the weighted measure describes the total volume
of migrant turnover, that is, the total amount of exchange that is happening. However,
each measure on its own provides an incomplete picture. A high unweighted measure may
originate with a city that exchanges just one migrant with a large number of metros.
In that case it may be wrong to conclude that such a city has a particularly large
amount of interaction with the rest of the country. On the other hand, a high weighted
measure may document a city that exchanges an enormous number of migrants with just
one other region.
Figure 5-1: Unweighted Degree, 2009-2010
Figure 5-1 shows the unweighted degree of US metro areas. Again, circles are sized
proportionally to their population, while the color ranges from low scores in blue to
high ones in red. Phoenix has the highest degree: it exchanged migrants with 368
other metro areas in 2009-2010. Following it are San Diego, Los Angeles, Chicago,
Las Vegas, and Houston. With the exception of Chicago, these are all Sunbelt cities,
and many have reputations as retirement communities or job magnets. The high
unweighted degrees of these cities indicate that they are able to attract migrants
from a diverse range of communities at the national scale. However, it is worth
keeping in mind that these metros exchange migrants with all of those communities;
they don't just receive them. So while they are receiving migrants from all over the
country, they are sending people to those communities as well.
Weighted degree is the same as total reciprocal migration, and it was already
mapped in Figure 4-1. Compared to unweighted degree, the weighted degree is more
concentrated: Los Angeles and New York are in a league of their own, with Riverside,
Washington DC, and Dallas substantially further behind. New York is an interesting
case because its unweighted degree is relatively low, implying that it is exchanging
larger numbers of migrants with a smaller set of cities.
Figure 5-2 plots degree versus weighted degree. There is a fairly strong correlation
between the two at lower levels that opens up a bit among the most connected metros.
Figure 5-2: Degree vs. Weighted Degree, 2009-2010 (horizontal axis: Unweighted Degree)
Figure 5-3: Population vs. Weighted Degree, 2009-2010 (horizontal axis: Population)
Metros that exchange migrants with more than 100 other cities tend to have a higher
weighted degree than would be predicted based on the trend among less-connected
metros, but there is also substantial variation among them. New York and Los Angeles
have particularly high weighted degrees for their levels of unweighted degree, while
Phoenix and Las Vegas have lower weighted degrees than are typical for their levels
of unweighted degree.
Figure 5-3 plots population versus weighted degree. The relationship is linear, and
striking: the correlation between population and the number of reciprocal migrants is
0.92.
5.2 Closeness Centrality
A second measure of centrality that is frequently used is closeness centrality. In an
unweighted network, this measures the average number of steps it takes to get from
a node of interest to all other nodes in the network. This measure thus takes the
degree of the node into account, but it also incorporates the degrees of the nodes it
is connected to. If a node has only one link, but that link connects it to a hub, it can
still claim a high closeness centrality score.
In the case of weighted networks such as this one, this measure becomes somewhat
more complicated. How should one calculate the average number of steps when some
links carry far more people than others? One approach draws on the physics of
electricity (Brandes & Fleischer, 2005). Electric circuits are often constructed with
parallel paths that have varying resistances. When this occurs, it is not the case
that all of the current flows through the path with the least resistance. Rather, some
current flows through each path, with the exact amount proportional to the inverse
of the resistance.
Figure 5-4: Closeness Centrality, 2009-2010
We can conceptualize a weighted network as an electric circuit, with the weights
(the numbers of migrants in this case) setting the "resistance" of each link: the more
migrants a link carries, the lower its resistance. Then the analogue of the distance
between two nodes is the effective resistance of all paths connecting them. This
yields a measure known as the "current flow closeness centrality," which is
equivalent to "information centrality," a measure first proposed in the 1980s that has
been infrequently used due to its unintuitiveness (Stephenson & Zelen, 1989). Nodes
that score high on this measure can be thought of as being near the center of the
network in the sense that they communicate relatively well with most other points
within it.
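The circuit analogy can be made concrete with a small pure-Python sketch. It assumes that larger flows mean lower resistance (edge weights act as conductances), computes effective resistances from the graph Laplacian with one node grounded, and uses one common normalization of closeness, (n - 1) divided by the sum of effective resistances. The function names and the toy network are illustrative and are not the code used for the thesis; the graph must be connected.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def effective_resistance(nodes, edges):
    """edges: (u, v, conductance). Returns {(u, v): effective resistance}."""
    index = {node: i for i, node in enumerate(nodes)}
    n = len(nodes)
    laplacian = [[0.0] * n for _ in range(n)]
    for u, v, c in edges:
        i, j = index[u], index[v]
        laplacian[i][i] += c
        laplacian[j][j] += c
        laplacian[i][j] -= c
        laplacian[j][i] -= c
    # Ground the last node: drop its row and column, then invert the rest.
    reduced = invert([row[:-1] for row in laplacian[:-1]])

    def entry(i, j):
        return 0.0 if n - 1 in (i, j) else reduced[i][j]

    return {(u, v): entry(index[u], index[u]) + entry(index[v], index[v])
                    - 2 * entry(index[u], index[v])
            for u in nodes for v in nodes if u != v}

def current_flow_closeness(nodes, edges):
    """Closeness as (n - 1) / sum of effective resistances to all other nodes."""
    resistance = effective_resistance(nodes, edges)
    return {u: (len(nodes) - 1) / sum(resistance[u, v] for v in nodes if v != u)
            for u in nodes}

# Hypothetical network: a tightly linked core (H, A, B) and a weakly tied metro P.
nodes = ["H", "A", "B", "P"]
edges = [("H", "A", 100), ("H", "B", 100), ("A", "B", 100), ("H", "P", 1)]
scores = current_flow_closeness(nodes, edges)
print(scores)
```

Because every parallel path contributes, the three core metros end up with nearly identical scores, while the weakly attached metro scores far lower.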
Figure 5-4 shows the current flow closeness centrality of the US reciprocal migration
network in 2009-2010. What is immediately striking about this map is the sheer
amount of red: there are a lot of metros with very high closeness centrality, including
almost all the major population centers. This suggests that the web of migrants tying
the country together is relatively complete, especially among populous metros. If a
large metro area doesn't have a direct connection to a given city, it is likely to have
a very strong connection to another metro area that does.
The metros showing low closeness centrality tend to be micropolitan areas, especially in the upper Midwest and interior south. These are places that exchange only
a few migrants with a small number of places, none of which are very well connected.
One interesting feature of the map is that there are extremely peripheral metro areas
in physical proximity to major centers. Southern Georgia, for example, has several
peripheral nodes sandwiched between Atlanta, Jacksonville, and Augusta, all of which
are quite central. This affirms that physical proximity does not necessarily imply a
high degree of social connectedness.
Figure 5-5: Closeness Centrality vs Population
Figure 5-5 plots closeness centrality against population. The strong but nonlinear
relationship between population and closeness centrality stands out here. In
particular, the closeness measure plateaus once the metro population reaches one
million: every single metro with more than 1 million people is also at the very top
of the closeness centrality measure. This implies both that the large metros are
extremely well-connected and that they are roughly equally connected. The overall
correlation between closeness centrality and population is 0.36.
Whether closeness centrality as measured here is meaningful sociologically is an
open question. If LA exchanges many migrants with New York, which exchanges lots
of migrants with Buffalo, does that establish a meaningful proxy connection between
LA and Buffalo? It's unlikely that many individual people moving from Buffalo to
New York continue onwards to Los Angeles. But perhaps coexisting with Buffalonians
creates in New Yorkers a base level of cultural awareness about that city¹ that they
bring with them to Los Angeles.
The lack of differentiation in closeness centrality among large metros is initially
frustrating: how does one determine if San Francisco or Houston is more central? But
that may be the point: with advances in communications and shipping technology,
specific location has become somewhat less important, at least among a certain set
of large metropolitan areas. Outside of a few extremely elite or innovative centers,
many locations are somewhat interchangeable. A national company looking to site
a new mid-level management facility may be equally willing to put it in suburban
Providence or suburban Denver. But it will think twice about putting it in suburban
Duluth. The closeness centrality measure appears to capture this distinction, and
could be considered as a means to delineate the "metropolitan core" of the country:
the metro areas from which it is essentially equally possible for a typical business to
participate in the national economy.
An alternative view of this measure comes from everyday experience: living in Boston,
it is not altogether unusual to come across someone from Minnesota, but it is
surprising to find a Minnesotan from outside of the Twin Cities metro area. Much of
that is no doubt due to the sheer population share of Minneapolis-St. Paul within the
state, but some may be because the cities are more integrated into the national
migration network than the rest of the state. Perhaps the set of high-closeness cities
can be considered an approximation of the set whose members one expects to come
across in everyday life in a given member city. Again, with such a strong relationship
between closeness and population, it is hard to parse which effects are driven by
population and which by position in the migration network.
¹ Such as whether Buffalo buffalo Buffalo buffalo buffalo do in fact buffalo Buffalo buffalo.
5.3 Betweenness Centrality
Figure 5-6: Betweenness Centrality, 2009-2010
A third frequently used measure of centrality is betweenness centrality. This attempts
to measure the extent to which a given node acts as a bridge between otherwise disjoint
parts of the network. In the unweighted case, this is done by finding the shortest
paths (those involving the smallest number of steps) between all pairs of nodes in the
network. The betweenness centrality of each individual node is defined as the fraction
of all shortest paths that pass through it. A node with high betweenness is "central"
in the sense of being integral to the network: if it is removed, it becomes substantially
harder for the rest of the nodes to communicate with each other. In the case of an
airline network, the nodes with high betweenness are the hubs, the places one has to
travel through to get from point A to point B.
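For the unweighted case, the definition above can be sketched with breadth-first search. The code below is a simple illustration rather than an efficient implementation: for every pair of other nodes it computes the fraction of shortest paths passing through a node, then averages over pairs. It assumes a connected, undirected network with at least three nodes, and the toy airline network is invented.

```python
from collections import deque
from itertools import combinations

def bfs_counts(adj, source):
    """Distance from source and number of shortest paths to each reachable node."""
    dist, paths = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                paths[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                paths[v] += paths[u]
    return dist, paths

def betweenness(adj):
    """Average share of shortest paths, over pairs of other nodes, through each node."""
    info = {s: bfs_counts(adj, s) for s in adj}
    n = len(adj)
    pairs_per_node = (n - 1) * (n - 2) / 2
    score = {v: 0.0 for v in adj}
    for s, t in combinations(adj, 2):
        dist_s, paths_s = info[s]
        dist_t, paths_t = info[t]
        for v in adj:
            if v in (s, t):
                continue
            # v lies on a shortest s-t path when the two legs add up exactly.
            if dist_s[v] + dist_t[v] == dist_s[t]:
                score[v] += paths_s[v] * paths_t[v] / paths_s[t]
    return {v: score[v] / pairs_per_node for v in adj}

# Hypothetical airline network: one hub and three spokes, two of them linked directly.
airline = {
    "HUB": {"X", "Y", "Z"},
    "X": {"HUB", "Y"},
    "Y": {"HUB", "X"},
    "Z": {"HUB"},
}
print(betweenness(airline))
```

In the toy network only the hub has nonzero betweenness, since every trip to or from the single-link spoke must pass through it.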
The interpretation of betweenness centrality is less straightforward in the case of
migration. Unlike in air travel, migrants are not constrained to flow via the links in
the network. There is no sense in which someone planning to migrate from Boston
to Seattle has to stop in Chicago first, and there is no reason to expect that many
of the people moving to Chicago from Boston will continue to Seattle in a few years.
However, cities with high measures of betweenness centrality can still be thought of
as hubs in the sense that they exchange large numbers of migrants with parts of the
country that do not have extensive direct interaction. High-betweenness cities are
cosmopolitan centers, welcoming migrants from all regions of the country.
As with closeness centrality, the presence of weighted edges makes computing
betweenness more difficult. Again, the electric current analogy is used to adjust for
the number of migrants flowing on a given link. It also allows for the incorporation of
more than just the single shortest path between any two nodes. The electric current
equivalent of the fraction of shortest paths between node A and node B that pass
through node C is the fraction of a unit AB current that passes through C. With this
fraction defined for each pair of nodes it is straightforward, though computationally
intensive, to calculate the overall betweenness centrality.
Figure 5-7: Betweenness Centrality vs. Population
Figure 5-8: Closeness Centrality vs. Betweenness Centrality
Figure 5-6 shows the betweenness centrality of metro areas based on the 2009-2010
reciprocal migration network. As opposed to the closeness centrality map, here
there is very little red. Only a few cities have high betweenness, and there is a
fairly strict hierarchy among them. Chicago has by far the highest betweenness
score. This distinction aligns with its role as a transportation and freight center, and
with its reputation as the hub of the United States. Dallas has the second highest
betweenness measure, and it shares many characteristics with Chicago: they are both
large metros in the center of the country that have extensive transportation links
with the rest of the US. It is interesting to find that Dallas has a higher betweenness
rating than Houston, since they are similarly sized and Houston has a reputation of
being a stronger attractor of international migrants.
New York City ranks third in betweenness, followed by Los Angeles, Atlanta,
Phoenix, Houston, Washington DC, and Minneapolis. Though they are situated on
the coasts, New York and LA are the largest two cities in the country, the cultural and
economic centers of the east and west coasts. As such, it makes sense that they draw
migrants from, and send migrants to, a diverse set of regions. The rest of the
high-betweenness cities show a combination of high population with a central or
Sunbelt location.
Figure 5-7 plots betweenness centrality versus population.² Even more than with
closeness centrality, there is an extremely strong, almost linear relationship between
betweenness centrality and population, with a correlation of 0.86.
Figure 5-8 plots closeness centrality against betweenness centrality. The pattern
is almost identical to Figure 5-5, with a positive relationship between betweenness and
closeness centrality among metros with lower levels of both that flattens out near the
top. The overall correlation here is 0.46.
² For ease of viewing I have excluded the Thomaston, Georgia MSA from Figures 5-7 and
5-8. At 6.8 × 10^-17 its betweenness centrality is twelve orders of magnitude smaller
than that of any other observation.
Chapter 6
Detecting Communities in the Migration Network
A second major area of network analysis is the identification of communities within
networks. Here the goal is to partition the network into communities of nodes such
that the connections within each group are much stronger than the connections
between groups. In the case of social networks, each community can be thought of as a
friend group or social circle: a set of people with tight ties to each other and weaker
ties to the rest of the world. In the context of migration networks, communities are
regions of the country: groups of cities that exchange many migrants with each other
and do not send as many to other parts.
It is not a given that these migration regions will share common cultural attributes
or economic linkages. But migration will likely be correlated with both of these types
of connection. People leaving their hometown will, all else equal, probably be drawn
to culturally similar cities. Once they arrive in their new town, they contribute
to a cultural exchange between it and their city of origin. Similarly, as described
above, economists have generally found economic considerations to be paramount in
determining propensity to migrate. Cities with strong economic connections to each
other (branch locations of the same companies, high volumes of trade) will have more
opportunities for migration. And this migration is self-reinforcing since people are
more likely to make investments in or transact with companies in places that they
are familiar with.
6.1 Approach to Community Detection
6.1.1 Modularity
How to best partition a network into communities is still an open question in network
science. Many community detection algorithms work on the principle of modularity
optimization. The "modularity" of a network partition is computed as the fraction of
links that fall within the proposed groups minus the fraction that would be expected
if links were distributed at random (Newman & Girvan, 2004). Modularity can range
from -1/2 to 1. A score below zero means the partition is terrible: more links cross
proposed boundaries than would be expected if the lines were drawn at random.
A score of zero means that the partition is exactly in line with random chance: the
fraction of ties that cross community boundaries is exactly what chance would predict.
Positive values of modularity denote increasingly good partitions. For most networks
a good partition will result in a modularity score between 0.3 and 0.7.
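A minimal sketch of this calculation for a weighted, undirected network follows. For each community it takes the share of link weight that falls inside the community and subtracts the share expected at random, which is the square of the community's share of total weighted degree. The toy network of two triangles joined by one link is invented for illustration.

```python
from collections import defaultdict

def modularity(edges, community):
    """edges: (u, v, weight), undirected, no self-loops; community: {node: label}."""
    total = sum(w for _, _, w in edges)
    internal = defaultdict(float)  # link weight inside each community
    strength = defaultdict(float)  # summed weighted degree of each community
    for u, v, w in edges:
        strength[community[u]] += w
        strength[community[v]] += w
        if community[u] == community[v]:
            internal[community[u]] += w
    return sum(internal[c] / total - (strength[c] / (2 * total)) ** 2
               for c in strength)

# Two triangles (0-1-2 and 3-4-5) joined by a single link between nodes 2 and 3.
edges = [(0, 1, 1), (1, 2, 1), (0, 2, 1), (3, 4, 1), (4, 5, 1), (3, 5, 1), (2, 3, 1)]

by_triangle = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
all_together = {n: "a" for n in range(6)}
all_apart = {n: n for n in range(6)}

q_good = modularity(edges, by_triangle)
q_zero = modularity(edges, all_together)
q_bad = modularity(edges, all_apart)
print(q_good, q_zero, q_bad)
```

Splitting the network along the single bridging link gives a clearly positive score, lumping everything together gives zero, and putting every node in its own community gives a negative score.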
6.1.2 The Louvain Algorithm
Even with a well-defined measure of partition quality, finding the optimal partition
of a network is extremely difficult. The sheer number of possible partitions makes
it impractical to search all of them looking for the best one. Instead, a number of
algorithms have been proposed that attempt to find an approximation of the optimal
partition.
Here I use the Louvain method for community detection (Blondel & Guillaume,
2008). This algorithm is noted for its low computational complexity. It begins by
assigning each node to its own community. The algorithm then iterates over the
nodes. For each node, it calculates the change in modularity that would result from
assigning the node into each community found among its neighbors. The node is then
assigned to whichever community most increases the modularity.
The algorithm iterates through all nodes and repeats until it makes a full pass
without reassigning any node. Then it creates a new network with one node for each
community from the previous phase. It then repeats the entire process until it reaches
a point where the modularity no longer increases. What makes the algorithm so fast
is its "greedy" nature: it never takes back an earlier step trying to improve the overall
results, even though there are almost certainly times when that would be desirable.
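The first phase of the procedure described above (moving nodes one at a time) can be sketched as follows. This is a simplified illustration, not the reference implementation: it handles an unweighted network, recomputes modularity from scratch for each candidate move instead of using the fast incremental update, and omits the second phase in which communities are collapsed into single nodes. The function names and the `seed` parameter, which controls the order in which nodes are visited, are hypothetical.

```python
import random
from collections import defaultdict

def modularity(edges, community):
    """Modularity of an unweighted, undirected network given {node: label}."""
    m = len(edges)
    internal, degree = defaultdict(int), defaultdict(int)
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

def local_moving(edges, seed=0):
    """Greedy node-by-node reassignment until a full pass makes no change."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    nodes = sorted(neighbors)
    community = {n: n for n in nodes}  # every node starts alone
    rng = random.Random(seed)
    moved = True
    while moved:
        moved = False
        order = nodes[:]
        rng.shuffle(order)  # the visiting order is the only source of randomness
        for node in order:
            home = community[node]
            best_label, best_q = home, modularity(edges, community)
            candidates = {community[n] for n in neighbors[node]} - {home}
            for label in sorted(candidates):
                community[node] = label  # try the move
                q = modularity(edges, community)
                if q > best_q + 1e-12:
                    best_label, best_q = label, q
            community[node] = best_label  # keep the best option found
            moved = moved or best_label != home
    return community

# Two triangles joined by one link; node labels are arbitrary.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
result = local_moving(edges, seed=1)
print(result, modularity(edges, result))
```

On a network this small every visiting order recovers the two triangles. On larger networks different orders can settle on different partitions.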
6.1.3 Initial Partitions
Figure 6-1 shows the result of one run of the Louvain algorithm on the migration
data for 2009-2010. The country is partitioned into seven communities, each
occupying a different region. The entire western third of the US forms one community,
encompassing everything from California to Colorado. A second large region might be
termed "Greater Texas," containing all of that state as well as Oklahoma, Louisiana,
Arkansas, Missouri, Kansas, and New Mexico. The Upper Midwest forms a third
community, centered on Chicago and Minneapolis. Michigan, Indiana, and Ohio
comprise the core of a community in the eastern Midwest, which extends to include
Pittsburgh. The eastern seaboard from Washington DC to Maine makes a fifth
community. The final two communities are in the South. One is centered on Virginia
and the Carolinas, while a second contains everything from Mississippi to Kentucky
to Florida.
Figure 6-1: Community Detection Round 1, 2009-2010
One interesting feature of these communities is the fact that each of them is
spatially contiguous. There are no enclave cities located within one region but tied
most strongly to the nodes in another. This is not necessarily unexpected, since it
makes sense that proximate places should exchange more migrants, but it does emerge
strictly from the data. There is nothing in the community detection algorithm that
requires nodes in the same community to be located next to each other.
Unfortunately, identifying "true" or "correct" communities in a network is not this
straightforward. Figures 6-2 and 6-3 show the output from two further runs of the
Louvain algorithm. This is the exact same algorithm run with identical settings,
differing from Figure 6-1 only in the order through which the nodes are iterated. The
results are not entirely dissimilar from those in Figure 6-1, but the differences are
substantial. The total number of communities changes from run to run, the relative
sizes of each community shift substantially, and sometimes enclaves even appear.
Further runs of this algorithm produce continued variations in the output. There
are consistent features: the western states always form one community, Texas is at
the center of another, the northeast a third. But the details change substantially
from one run to another. The ultimate number of communities varies from five to
seven, and even in the most stable parts of the country the community boundaries
are always in flux.
Figure 6-2: Community Detection Round 2, 2009-2010
Figure 6-3: Community Detection Round 3, 2009-2010
This variability in results is partly due to the approximate nature of the algorithm
used. There is randomness in the approximation process, so it is natural that repeated
runs produce slightly different results. A more thorough algorithm might be able
to home in on the one partition that truly maximizes the modularity score. But the
true problem lies at a deeper level, with the modularity score itself. The modularity
function has been found to have a fairly flat surface (Good, Montjoye, & Clauset,
2010). In many cases, there is no obvious peak of modularity that defines a clear
optimal partition. Rather, there are often a wide variety of partitions with very
similar modularities. This is what we observe here. Running the Louvain algorithm
100 times produces 100 substantially different partitions, whose modularity scores
vary only from 0.394 to 0.411. The standard deviation of modularity in this sample
is only 0.0037. Yet the variation of proposed communities is extensive.
Given that such different partitions can have such similar modularities, it becomes
hard to argue that modularity maximization alone will find the single best
partition. Good et al. propose a number of alternatives to pure modularity
maximization. These include combining information from many distinct high-modularity
partitions, attempting to estimate the statistical significance of a partition, and using
generative models to account for overlapping communities.
6.1.4 Repeated Louvain Runs
The approach that I take to overcome the difficulties of modularity optimization is
relatively straightforward, and relies on combining information from multiple
high-modularity partitions. The results of the Louvain algorithm differ from run to run,
but there are common features throughout. The core cities in many communities do
not change from run to run, and there are groups of metros that stay together even as
they flip from one community to another. To more completely observe these subtleties
of the migration community structure, I run the Louvain algorithm repeatedly and
examine the co-occurrence of various cities within the same community. This
approach turns the approximate nature of the Louvain method into an advantage, using
the randomness it induces to more thoroughly explore the various high-modularity
partitions. It is similar to the approach taken by Thiemann et al. in their analysis of
money circulation (Thiemann et al., 2010).
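The co-occurrence idea can be sketched as follows. Given the partitions from several runs, the code counts how often each pair of metros lands in the same community. The partitions shown are invented, the function names are hypothetical, and the 0.95 threshold mirrors the 95 percent cutoff used in the thesis. Community labels need not match from run to run, since only shared membership matters.

```python
from collections import Counter
from itertools import combinations

def co_occurrence(partitions):
    """partitions: list of {node: community label}, one dict per run.

    Returns {(a, b): fraction of runs in which a and b share a community}.
    """
    nodes = sorted(partitions[0])
    counts = Counter()
    for partition in partitions:
        for a, b in combinations(nodes, 2):
            if partition[a] == partition[b]:
                counts[a, b] += 1
    return {pair: counts[pair] / len(partitions)
            for pair in combinations(nodes, 2)}

def stable_pairs(fractions, threshold=0.95):
    """Pairs that share a community in at least `threshold` of the runs."""
    return sorted(pair for pair, f in fractions.items() if f >= threshold)

# Four hypothetical runs over five metros.
runs = [
    {"a": 0, "b": 0, "c": 1, "d": 1, "e": 1},
    {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1},
    {"a": 2, "b": 2, "c": 1, "d": 1, "e": 0},
    {"a": 0, "b": 0, "c": 1, "d": 1, "e": 1},
]
fractions = co_occurrence(runs)
print(stable_pairs(fractions))
```

The resulting fractions can be read as edge weights of a new co-occurrence network, in which pairs that always appear together are the stable building blocks.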
Figures 6-4 and 6-5 show the results of this approach applied to 100 runs of the
Louvain algorithm. Figure 6-4 draws lines connecting metros that appear in the same
community at least 95 percent of the time. These groups of metros are the building
blocks of the communities found throughout the process, the pieces that do not come
apart.
Figure 6-4: Metros Co-Occurring 95% of the Time
There are two major takeaways from this map. First, there are some communities
that really are extremely cohesive. The western states essentially always form
one community, and the boundaries rarely change except to include or exclude New
Mexico. The line separating the west from the rest of the country is particularly
clear cut. Similarly, the Upper Midwest community, composed of Illinois, Wisconsin,
Minnesota, Iowa, and the Dakotas, is present in virtually the same form 95 percent
of the time. The northeastern states (excluding western Pennsylvania) and Texas are
other areas that form consistent, large communities.
In the eastern half of the country there are fewer consistent large communities
of metros. Instead, there are many groups of 10-20 metros that are consistently
in the same community. In many cases these groups conform fairly well to state
boundaries. Georgia, Florida, Alabama, Mississippi, Kentucky, and Nebraska all
have self-contained groups that include almost all of their metros, while Missouri and
Kansas form one group. Additionally, it is interesting to note that there are very few
cities that do not share a community 95 percent of the time with at least one other
metro area. Most cities have fairly tight relationships with at least one other place.
Figure 6-5: Community Co-Occurrence
Figure 6-5 shows the overall pattern of community formation over 100 runs of the
Louvain algorithm. Here the darkness of the line connecting two metros is
proportional to the fraction of the time that they fall into the same community. The
"building blocks" apparent in Figure 6-4 begin to coalesce into larger groupings. The
state-sized communities in the Deep South (Alabama, Mississippi, Tennessee) are seen
to frequently form one larger mid-south community. Similarly, Oklahoma and Arkansas
are joined with Texas almost all of the time, and Kansas-Missouri only slightly less
frequently. Perhaps most notable are the strong connections up and down the east
coast, linking Florida to the Northeast.
The relationships diagrammed in Figure 6-5 can be used to weight each edge
by the percentage of the time it crosses a community boundary. This arrives at
an approximation of the percentage of a given city's migrants who come from a
"different community," even without strictly defining the communities. For analyses
that depend only on the percentage of a city's migrants that come from outside of
its community it is perhaps more accurate to use this measure than to impose one
formalized partition.
For some analyses, however, it is helpful to have one strict community partition.
This can be developed by running the Louvain algorithm on the community
co-occurrence network. Figure 6-6 shows the result of this approach. The communities
found in this map are quite stable: repeated runs of the Louvain algorithm produce
almost no variation in the communities found (the only change that occurs is that
Kentucky tends to switch between the Mid-South and East Central communities).
The end result of this procedure is a partition of the United States into six
migration regions. These are the West, including all states west of the continental
divide as well as Alaska and Hawaii; Greater Texas, which stretches as far as Louisiana,
Missouri, and Kansas; the Upper Midwest, including Illinois, Wisconsin, Iowa,
Minnesota, and the Dakotas; an East Central region composed of Michigan, Indiana,
and Ohio; the Mid-South of Mississippi, Alabama, Tennessee, and Kentucky; and the
East Coast from Maine to Florida. The modularity of this partition is 0.41. That
is not as high as the highest of the individual runs, but it is in the top 5%.
Figure 6-6: Formalized Communities
6.2 Community Roles of Individual Metro Areas
Having arrived at a partition, a natural next step is to investigate the roles played
by different cities within the community structure. Do certain cities dominate their
communities, either by monopolizing migration within the community or by being
its primary link to the rest of the country? This analysis of within-community roles
complements the centrality analysis of the full network conducted above.
6.2.1 Extra-Community Degree
First I examine the extent to which different cities are connected to metros beyond
their immediate community. This is calculated by weighting the number of migrants
exchanged between two metro areas by the percentage of the time they are in different
communities as displayed in Figure 6-5.
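One way to read this calculation is sketched below: each reciprocal flow is multiplied by the fraction of runs in which the two metros fall in different communities, and the weighted flows are summed for each metro. The function name, the data layout, and the numbers are all hypothetical.

```python
def extra_community_degree(flows, together):
    """flows: {(a, b): reciprocal migrants}, each unordered pair stored once.
    together: {(a, b): fraction of runs in which a and b share a community}.

    Returns {metro: migrants weighted by the fraction of runs spent apart}.
    """
    degree = {}
    for (a, b), migrants in flows.items():
        shared = together.get((a, b), together.get((b, a), 0.0))
        apart = 1.0 - shared
        for metro in (a, b):
            degree[metro] = degree.get(metro, 0.0) + migrants * apart
    return degree

# Hypothetical flows and co-occurrence fractions for three metros.
flows = {("A", "B"): 1000, ("A", "C"): 200, ("B", "C"): 50}
together = {("A", "B"): 0.9, ("A", "C"): 0.0, ("B", "C"): 0.5}
extra = extra_community_degree(flows, together)
print(extra)
```

In this toy case most of metro A's extra-community degree comes from its smaller flow with C, because its large flow with B almost always stays within one community.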
Table 6.1 displays the top 10 metro areas in terms of this extra-community degree.
Chicago tops the list, exchanging almost three-quarters of its reciprocal migrants with
cities outside the upper Midwest. Los Angeles and New York are next, although
their extra-community migrants make up a much smaller fraction of a larger total
number. Comparing extra-community to overall weighted degree (mapped in Figure
4-1), Chicago is far more prominent here. Additionally, Riverside has dropped from
Table 6.1: Top 10 Extra-Community Migration Hubs, 2009-2010
MSA                Population   Extra-community Degree   Total Weighted Degree
Chicago, IL         9,461,105                   59,914                  83,655
Los Angeles, CA    12,828,837                   49,202                 188,467
New York, NY       18,897,109                   48,013                 153,050
Dallas, TX          6,371,773                   43,146                  95,159
Atlanta, GA         5,268,860                   38,830                  72,252
Washington, DC      5,582,170                   38,329                  95,434
Phoenix, AZ         4,192,887                   33,264                  82,350
Houston, TX         5,946,800                   31,028                  75,998
San Diego, CA       3,095,313                   29,673                  84,420
Miami, FL           5,564,635                   24,946                  87,077
having the third highest total degree to not even making the top 10 in terms of
extra-community degree (it comes in at number 19), because so many of its migrants
are exchanged with Los Angeles. New York occupies roughly the same position in
extra-community degree as it does in weighted degree, while Washington DC, Dallas,
and Atlanta are more prominent in terms of extra-community degree than they are
in pure weighted degree.
On the whole, extra-community migrants are more concentrated in a few major
cities than are migrants on the whole. For instance, the top 10 cities account for
37% of the extra-community migrants, compared to just 23% of the total migrants.
Figure 6-7 shows the rank-ordered distribution of population, weighted degree, and
extra-community degree. The x-axis is the metros ranked by the variable of interest
(note that the ordering is different for each variable), while the y-axis shows the
percentage of the total found in each metro. Population and total weighted degree
follow a similar pattern, decaying rapidly through the first two hundred or so metros
and more gradually after that. Extra-community degree, on the other hand, decays
far more quickly at all levels of the distribution.
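
The concentration comparison above amounts to ranking each metro's share of a total and summing the largest shares. The sketch below is a hypothetical illustration, not the thesis's own code: the function names and the five-metro figures are invented, and shares are computed by dividing each metro's value by the sum across all metros.

```python
def ranked_shares(values):
    """Return each metro's share of the total, sorted from largest to smallest."""
    total = sum(values)
    return sorted((v / total for v in values), reverse=True)


def top_k_share(values, k):
    """Share of the total accounted for by the k largest metros."""
    return sum(ranked_shares(values)[:k])


# Hypothetical degrees for five metros (invented numbers).
total_degree = [50, 30, 10, 6, 4]
extra_degree = [40, 8, 1, 1, 0]

top2_total = top_k_share(total_degree, 2)
top2_extra = top_k_share(extra_degree, 2)
```

With these invented numbers the top two metros hold about 80% of total degree but about 96% of extra-community degree, the same kind of gap the 23% versus 37% comparison describes. Plotting `ranked_shares` against rank on a logarithmic y-axis would give curves of the kind shown in Figure 6-7.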
There is a very strong correlation between betweenness centrality and extra-community
degree, 0.94, compared to only 0.86 between betweenness and total weighted
degree. This is not particularly surprising: it makes sense that cities with many
connections to metros outside their immediate community should occupy more central
locations in the overall network.
6.2.2 Community Diversity
Using the formalized communities shown in Figure 6-6 it is possible to examine the
distribution of cities' migrants across the different communities, seeing not just how
many migrants cross community boundaries but also where they go. Figure 6-8
shows the breakdown of communities represented among migrants leaving the top
seven cities, along with the total population breakdown for comparison.

Figure 6-7: Ranked Distributions of Weighted Degree, Extra-Community Degree, and
Population (x-axis: Metro Rank; series: Population, All Migrants, Extra-Community
Migrants)

Figure 6-8: Out-migrants by Community of Destination, Selected Metros (panels:
Los Angeles, CA; New York, NY; Riverside, CA; Chicago, IL; Dallas, TX;
Washington, DC; San Francisco, CA; Total Population)

Footnote 1 (to the top-10 shares reported in Section 6.2.1): These numbers are
calculated by dividing each city's degree by the sum of total degree across
all cities. This effectively counts each reciprocal migrant twice, once at each city it connects.

Among these cities, Chicago stands on its own in terms of cosmopolitanism: it
sends substantial numbers of migrants to five of the six communities, and sends
almost as many migrants to the East Coast as it does to its own community. Dallas
and to a lesser extent Washington send more than a quarter of their out-migrants to
communities besides their own, while Los Angeles, New York, and especially Riverside
primarily send migrants to their own community. Notably, for both New York and
Washington the West is the second largest destination for migrants, even though it
is geographically the most distant.
To more formally examine the inter-community migration sheds of each city, I
compute a "participation coefficient" for each city. This measure, used by Guimerà
et al. in their study of air traffic networks, measures the extent to which a city's
migrants are spread across all communities (Guimerà et al., 2005). It is computed
for node i as:

$$P_i = 1 - \sum_{s=1}^{N_M} \left( \frac{k_{is}}{k_i} \right)^2$$

where $k_{is}$ is node i's degree in community s, $k_i$ is node i's total degree, and $N_M$
is the total number of communities. This index will be zero if all of a node's links are
within its own community and will approach one if a node's links are spread evenly
across all communities.
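
As a rough illustration of the participation coefficient, the following Python sketch computes it from a list of one node's degree in each community. The function name and the example degree vectors are hypothetical and are not taken from the thesis data.

```python
def participation_coefficient(degree_by_community):
    """Participation coefficient: 1 minus the sum over communities of (k_is / k_i)^2.

    degree_by_community -- a node's degree (here, reciprocal migrants) in each
                           community, one entry per community; hypothetical input.
    """
    k_i = sum(degree_by_community)
    if k_i == 0:
        return 0.0  # assumption: an isolated node is given a coefficient of zero
    return 1.0 - sum((k_is / k_i) ** 2 for k_is in degree_by_community)


# All links inside one community: the coefficient is zero.
inward = participation_coefficient([100, 0, 0, 0, 0, 0])

# Links spread evenly over six communities: the coefficient is 1 - 1/6.
even = participation_coefficient([10, 10, 10, 10, 10, 10])
```

The evenly spread case shows that, with a finite number of communities, the coefficient is bounded above by 1 - 1/N_M; with the six communities used here that ceiling is about 0.83, so the index nears one only as the number of communities grows.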
Table 6.2 shows the top ten cities by participation coefficient. While Chicago
ranks highest in terms of participation coefficient, the list is not dominated by the
large metro areas that score highest on extra-community degree. Rather, it contains
several mid-size metros in the Midwest and South, especially near the borders of the
Mid-South, Greater Texas, East Coast, and East Central communities. The Mid-South
is particularly well represented. So while extra-community migrants are more
likely to move to and from major cities, there are plenty of mid-size metro areas that
receive a high proportion of migrants from multiple communities.

Table 6.2: Top 10 Metros by Community Participation Coefficient, 2009-2010

MSA                                Community        Population    Participation Coef.
Chicago, IL                        Upper Midwest     9,461,105    0.79
St. Louis, MO                      Greater Texas     2,837,592    0.76
Clarksville, TN                    Mid-South           273,949    0.76
Memphis, TN                        Mid-South         1,316,100    0.76
Louisville/Jefferson County, KY    Mid-South         1,283,566    0.74
Crestview, FL                      Mid-South           180,822    0.73
Columbus, GA                       Mid-South           294,865    0.70
Pensacola, FL                      Mid-South           448,991    0.70
Fort Leonard Wood, MO              Greater Texas        52,274    0.69
Colorado Springs, CO               West                645,613    0.68

6.2.3 Within-Community Role
Having examined the distribution of extra-community migrants, the obvious next
step is to look at those migrants who don't leave their communities. A city's
intra-community degree can be calculated by looking at the total number of reciprocal
migrants it exchanges with other cities in its community. To account for variation in
the total number of migrants in each community, it is useful to examine the percentage
of a community's total migrants that pass through each city. Figure 6-9 maps this
percentage. There appears to be a fair amount of concentration here as well: roughly
ten cities stand out as hubs of within-community migration.
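
The within-community share described above might be computed along the following lines. This is a sketch under explicit assumptions rather than the thesis's procedure: the function name, the toy flow matrix, and the community labels are hypothetical, and each community's denominator is taken to be the sum of intra-community degree over its member metros, which counts every reciprocal migrant twice (once at each endpoint).

```python
def intra_community_shares(flows, community):
    """Each metro's share of the intra-community degree of its own community.

    flows[i][j]  -- reciprocal migrants exchanged between metros i and j
    community[i] -- label of the community metro i is assigned to
    Both inputs are hypothetical; flows is assumed square and symmetric.
    """
    n = len(flows)
    # Intra-community degree: flows exchanged with other metros in the same community.
    intra = [
        sum(flows[i][j] for j in range(n)
            if j != i and community[j] == community[i])
        for i in range(n)
    ]
    # Community totals (assumption: sum of member degrees, so migrants count twice).
    totals = {}
    for i in range(n):
        totals[community[i]] = totals.get(community[i], 0) + intra[i]
    return [
        intra[i] / totals[community[i]] if totals[community[i]] else 0.0
        for i in range(n)
    ]


# Hypothetical example: three metros in "West", one in "East".
flows = [
    [0, 60, 30, 50],
    [60, 0, 10, 0],
    [30, 10, 0, 0],
    [50, 0, 0, 0],
]
community = ["West", "West", "West", "East"]
shares = intra_community_shares(flows, community)
```

Here the first metro accounts for 45% of its community's internal migration even though 50 of its migrants are exchanged with the outside metro, which illustrates why intra- and extra-community degree are tracked separately.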
To better understand the distribution of within-community migrants, Figure 6-10
plots the top end of the rank distribution for each community. Some communities,
most notably the Upper Midwest, Greater Texas, and the West, have a few cities that
clearly dominate intra-community migration. Chicago, Dallas, and Los Angeles dominate
their regions, with Minneapolis, Houston, and Riverside also playing important
roles. The East Central and East Coast communities have clear top migration centers
in Detroit and New York respectively, although neither plays as strong a role in
its community as do Chicago, Dallas, and LA in theirs. Finally, no cities dominate
migration within the Mid-South. Nashville and Birmingham have the highest percentage
of migrants, but their shares are substantially lower than the shares of the
top cities in other communities.
Finally, Figure 6-11 simply plots extra-community degree against intra-community
degree. There appear to be two parts to this relationship. A large number of smaller
cities have low extra-community degree: they exchange fewer than 100 migrants with
MSAs outside of their own community. Among cities that exchange more than 100
migrants outside their community, intra- and extra-community degree show a fairly
strong positive relationship. Places with more migrants outside their community tend
also to have more migrants from within their community.

Figure 6-9: Percentage of Communities' Migrants By Metro

Figure 6-10: Rank Distribution of Within-Community Degree (x-axis: MSA Rank;
series: Greater Texas, Upper Midwest, East Central, West, East Coast, Mid-South)

Figure 6-11: Extra Community Degree vs. Intra Community Degree (x-axis:
Within-Community Degree)

Chapter 7
Discussion, Limitations, and Further Research
The analysis conducted in this thesis has been primarily descriptive in nature. After
showing that migration flows in the US in 2009-2010 systematically differ from the
predictions of the radiation model in the presence of large long-distance flows, I
identified some of the most central cities in the migration network by various metrics.
Each of these sheds light on a different aspect of the migration system and may
be useful for a different purpose. Unweighted degree identifies the cities with the
widest geographic reach: the ability to draw migrants from many different large and
small cities. Phoenix, the capital of retirement, is the dominant city by this measure.
Weighted degree centrality measures the total reciprocal migration into and out of
an area, and gives a sense of which metros have the most demographic churn: Los
Angeles, New York, and Riverside. Closeness centrality appears to highlight metros
that make up what might be considered the metropolitan core of the country, though
this may simply be due to its high correlation with population. Betweenness centrality
identifies the cities that link to multiple distinct regions that don't independently
exchange many migrants. High betweenness cities, then, might be thought of as the
most nationally cosmopolitan places, where people from many regions meet. Chicago
and Dallas rank highest on this measure.
Finally, I used reciprocal migration flows as the basis for identifying distinct
communities, formed of multiple MSAs that exchange many migrants among themselves
and fewer with the rest of the country. This process identified several very cohesive
regions, most notably the West, and other areas where borders were a bit fuzzier.
Most of the identified regions have one or two cities that account for a large share of
the internal migration flows, and also tend to have the most interaction with the rest
of the country.
7.1 Limitations of the Current Research
The scope of this project is relatively limited, and as such it is unable to fully address
every aspect of US migration patterns and how they interact with economic and social
activity. However, two key limitations impact the richness of the picture I have been
able to uncover and the strength of its conclusions.
One major limitation of this study is the amount of information contained in the
IRS migration data about the migrants themselves. The IRS dataset contains only the
overall numbers of migrants and their gross income. This is sufficient to determine the
flows of people and conduct the centrality analyses. But it skates over the question
of who exactly is moving where, and may obscure more subtle patterns. Are the
people moving into a given city demographically similar to those moving out? What
fraction of reciprocal migrants are people moving back to their hometowns? Are flows
between distant areas made up of different types of people than ones between nearby
ones? These questions are unanswerable based only on the migration data, but their
answers will dramatically shape the impact of migrants on their new communities.
Previous research suggests that patterns of migration vary greatly with age: young
adults have a marked tendency to move to major metropolitan areas, balanced out
by net outflows among the middle-aged and elderly (Plane et al., 2005). Particularly
relevant for economic development, the IRS data doesn't contain any information
about the human capital of migrants, so it's difficult to tell what kinds of migrants
different places are attracting, and what their economic impact is likely to be.
A second limitation is that this thesis is based entirely on migration data for one
year, 2009-2010. The networks and communities explored here are valid for that year
only, and will shift as flows wax and wane. Determining the speed and extent to which
this occurs will be important for finding the utility of the constructs employed here,
especially since previous research has suggested that migration patterns may shift
relatively quickly over time (McHugh & Gober, 1992; Clark, 1982). If the central
nodes and communities within the network are found to fluctuate quickly over time,
it will be difficult to argue that they play a meaningful economic role.
7.2 Further Research
An initial possibility for further work is to extend the analysis conducted in this thesis
to other years. The IRS has released data for the years 2004-2011, a long enough time
period to establish the speed at which migration patterns are currently shifting and
the stability of communities and central locations over the past decade.
A more ambitious extension would begin to approach the possibility of making
causal claims about the determinants or impacts of migration. Revisiting the radiation model, an investigation could attempt to explain the residuals found there using
data on the economic, cultural, and physical conditions of the sending and receiving MSAs and their relationship. This would to some degree emulate previous work
modeling migration at the regional level, but may be able to expand on that work
by seeking to explain the residual rather than the total flow: that is, why the flow
is greater or less than would be normally expected, instead of explaining the total
amount. It also may be able to expand on previous work by making use of interactions
between origin and destination variables, such as whether their economies are
centered on complementary industries.
The complement to this would be to attempt to model the impact of migration
(perhaps total churn, or network centrality) on economic performance. This would
present a major identification challenge, but might be possible using lagged measures
or sufficient controls.
Finally, if alternative or supplemental sources of data can be found, it would be
extremely informative to conduct a deeper dive into the demographic composition of
various migration flows.
References
Andris, C. (2011). Metrics and Methods for Social Distance. Unpublished doctoral
dissertation, Massachusetts Institute of Technology.
Andris, C., Halverson, S., & Hardisty, F. (2011, June). Predicting migration system
dynamics with conditional and posterior probabilities. Proceedings 2011 IEEE
International Conference on Spatial Data Mining and Geographical Knowledge
Services, 192-197.
Black, D., & Henderson, V. (1999). A theory of urban growth. Journal of political
economy, 107(2), 252-284.
Blondel, V., & Guillaume, J. (2008). Fast unfolding of communities in large networks.
Journal of Statistical Mechanics: Theory and Experiment, 1-2.
Borjas, G. (2006). Native Internal Migration and the Labor Market Impact of
Immigration. Journal of Human Resources(August 2005).
Borjas, G., Bronars, S., & Trejo, S. (1992, September). Self-Selection and Internal
Migration in the United States. Journal of Urban Economics, 32(2), 159-85.
Borondo, J., Borondo, F., Rodriguez-Sickert, C., & Hidalgo, C. a. (2011, January).
To each according to its degree: the meritocracy and topocracy of embedded
markets. Scientific reports, 4, 3784.
Bound, J., Groen, J., Kézdi, G., & Turner, S. (2004, July). Trade in university
training: cross-state variation in the production and stock of college-educated
labor. Journal of Econometrics, 121(1-2), 143-173.
Brandes, U., & Fleischer, D. (2005). Centrality measures based on current flow. In
Proc. 22nd symp. theoretical aspects of computer science (pp. 533-544).
Calabrese, F., Dahlem, D., Gerber, A., Paul, D., Chen, X., Rowland, J., ... Ratti, C.
(2011, October). The Connected States of America: Quantifying Social Radii
of Influence. 2011 IEEE Third Int'l Conference on Privacy, Security, Risk and
Trust and 2011 IEEE Third Int'l Conference on Social Computing, 223-230.
Chen, Y., & Rosenthal, S. S. (2008, November). Local amenities and life-cycle
migration: Do people move for jobs or fun? Journal of Urban Economics,
64(3), 519-537.
Clark, G. L. (1982, March). Volatility in the geographical structure of short-run US
interstate migration. Environment & planning A, 14(2), 145-67.
Courchene, T. (1970). Interprovincial Migration and Economic Adjustment. Canadian Journal of Economics.
Elsner, B., Narciso, G., & Thijssen, J. (2013). Migrant Networks and the Spread of
Misinformation. , 1-49.
Florida, R. (2002). The Economic Geography of Talent. Annals of the Association
of American geographers, 92(4), 743-755.
Frey, W. H. (1994). The New White Flight. American Demographics, 16(4), 1-8.
Frey, W. H. (1995). Immigration and Internal Migration "Flight": A California Case
Study. Population and Environment, 16(4), 353-375.
Frey, W. H. (1996). Immigration, Domestic Migration, and Demographic Balkanization in America: New Evidence for the 1990s. Population and Development
Review, 22(4), 741-763.
Frey, W. H. (2009). The Great American Migration Slowdown. Brookings Institution,
Washington, DC. . . . (December), 1-28.
Gamlen, A. (2011). Creating and destroying diaspora strategies (No. April). Oxford.
Glaeser, E., Kolko, J., & Saiz, A. (2001). Consumer city. Journal of economic
geography, 1, 27-50.
Glaeser, E., & Saiz, A. (2003). The Rise of the Skilled City.
Glaeser, E., Scheinkman, J., & Shleifer, A. (1995). Economic growth in a cross-section
of cities. Journal of Monetary Economics, 36.
Good, B. H., Montjoye, Y.-a. D., & Clauset, A. (2010). The Performance of Modularity Maximization in Practical Contexts. Physical Review E, 81, 1-20.
Gottlieb, P. D. (1995, November). Residential Amenities, Firm Location and Economic Development. Urban Studies, 32(9), 1413-1436.
Gottlieb, P. D., & Fogarty, M. (2003, November). Educational Attainment and
Metropolitan Growth. Economic Development Quarterly, 17(4), 325-336.
Gottlieb, P. D., & Joseph, G. (2006, October). College-To-Work Migration of Technology Graduates and Holders of Doctorates Within the United States. Journal
of Regional Science, 46(4), 627-659.
Granovetter, M. (1973). The Strength of Weak Ties. American journal of sociology,
78(6), 1360-1380.
Granovetter, M. (1985). Economic action and social structure: the problem of embeddedness. American journal of sociology, 91(3), 481-510.
Grant, E., & Vanderkamp, J. (1980). The Effects of Migration on Income: A Micro
Study with Canadian Data 1965-71. The Canadian journal of economics.
Graves, P. (1980). Migration and climate. Journal of regional Science.
Graves, P., & Knapp, T. a. (1988, July). Mobility behavior of the elderly. Journal
of Urban Economics, 24(1), 1-8.
Greenwood, M. J. (1997). Internal migration in developed countries. Handbook of
population and family economics.
Groen, J. a. (2004, July). The effect of college location on migration of college-educated
labor. Journal of Econometrics, 121(1-2), 125-142.
Gross, E. (1999). US Population Migration Data: Strengths and Limitations (Tech.
Rep.). Internal Revenue Service.
Guimerà, R., Mossa, S., Turtschi, A., & Amaral, L. a. N. (2005, May). The worldwide
air transportation network: Anomalous centrality, community structure, and
cities' global roles. Proceedings of the National Academy of Sciences of the
United States of America, 102(22), 7794-9.
Hagerstrand, T. (1966). Aspects of the Spatial Structure of Social Communication
and the Diffusion of Information. Papers in Regional Science.
Hansen, S., Ban, C., & Huggins, L. (2003). Explaining the "Brain Drain"
From Older Industrial Cities: the Pittsburgh Region. Economic Development
Quarterly.
Hunt, G. L. (1993, January). Equilibrium and disequilibrium in migration modelling.
Regional studies, 27(4), 341-9.
Internal Revenue Service. (2011). Supplemental Documentation for Migration Data
Products (Tech. Rep.).
Isserman, A., Plane, D., & McMillen, D. (1982). Internal Migration in the United
States: An Evaluation of Federal Data. Review of Public Data Use.
Kemeny, T., & Storper, M. (2012, February). The Sources of Urban Development:
Wages, Housing, and Amenity Gaps Across American Cities*. Journal of Regional
Science, 52(1), 85-108.
Kennan, J., & Walker, J. R. (2011). The Effect of Expected Income on Individual
Migration Decisions. Econometrica, 79(1), 211-251.
Kodrzycki, Y. (2001). Migration of recent college graduates: Evidence from the
National Longitudinal Survey of Youth. New England Economic Review.
Kotkin, J. (2012). Where Americans Are Moving.
Kuznetsov, Y. (2006). Networks and the International Migration: How Countries
Can Draw on Their Talent Abroad.
Lenormand, M., Huet, S., Gargiulo, F., & Deffuant, G. (2012). A universal
model of commuting networks. PloS one, 7(10), e45985.
Lucas, R. (1988). On the Mechanics of Economic Development. Journal of monetary
economics, 22(February), 3-42.
Massey, D. (1988). Economic Development and International Migration in Comparative
Perspective. Population and development review.
Mathur, V. K. (1999, August). Human Capital-Based Strategy for Regional Economic
Development. Economic Development Quarterly, 13(3), 203-216.
McGuire, P., Hardy-Johnston, D., & Saevig, L. (2006). Brain Drain in Ohio:
Observations and Summaries with Particular Reference to Northwest Ohio.
McHugh, K. E., & Gober, P. (1992). Short-Term Dynamics of the US Interstate
Migration System, 1980-1988. Growth and Change.
Molloy, R., Smith, C. L., & Wozniak, A. K. (2011). Internal Migration in the United
States (No. 17307).
Morrison, P. S., & Clark, W. a. V. (2011). Internal migration and employment: macro
flows and micro motives. Environment and Planning A, 43(8), 1948-1964.
Neal, Z. (2010, March). Refining the Air Traffic Approach to City Networks. Urban
Studies, 47(10), 2195-2215.
Newman, M., & Girvan, M. (2004, February). Finding and evaluating community
structure in networks. Physical Review E, 69(2), 026113.
Office of Management and Budget. (2013). Revised Delineations of Metropolitan
Statistical Areas, Micropolitan Statistical Areas, and Combined Statistical Areas,
and Guidance on Uses of the Delineations of These Areas (Tech. Rep. No. 13).
Piiparinen, B., & Russell, J. (2013). From Balkanized Cleveland to Global Cleveland:
A Theory of Change for Legacy Cities (Tech. Rep. No. November).
Plane, D. (1993, January). Demographic influences on migration. Regional studies,
27(4), 375-83.
Plane, D., Henrie, C., & Perry, M. (2005). Migration up and down the urban hierarchy
and across the life course. Proceedings of the National Academy of Sciences,
102(43), 15313-15318.
Plane, D., & Rogerson, P. (1991). Tracking the Baby Boom, the Baby Bust, and
the Echo Generations: How Age Composition Regulates US Migration. The
Professional Geographer.
Pred, A. R. (1971). Large-City Interdependence and the Preelectronic Diffusion
of Innovations in the US. Geographical Analysis.
Pred, A. R. (1973). Urban growth and the circulation of information: the United
States system of cities, 1790-1840. Harvard University Press Cambridge, MA.
Pred, A. R. (1975). Diffusion, Organizational Spatial Structure, and City-System
Development. Economic Geography, 51(3), 252-268.
Pred, A. R. (1977). City Systems in Advanced Economies. London: Hutchinson.
Pred, A. R. (1980). Urban Growth and City-Systems in the United States, 1840-1860.
Harvard University Press.
Ratti, C., Sobolevsky, S., Calabrese, F., Andris, C., Reades, J., Martino, M.,
Strogatz, S. H. (2010, January). Redrawing the map of Great Britain from a
network of human interactions. PloS one, 5(12), e14248.
Ravenstein, E. (1885). The Laws of Migration. Journal of the Statistical Society of
London, 48(2), 167-235.
Rogers, A. (1990). Requiem for the Net Migrant. Geographical Analysis, 22(4).
Romer, P. (1990). Endogenous Technological Change. Journal of Political Economy,
98(5).
Sanderson, A., & Dugoni, B. (2002). Interstate Migration Patterns of Recent Science
and Engineering Doctoral Recip (Vol. 1999; Tech. Rep.).
Sassen, S. (1991). The Global City: New York, London, Tokyo (2nd ed.). Princeton,
NJ: Princeton University Press.
Saxenian, A., & Sabel, C. (2008). Roepke Lecture in Economic Geography Venture
Capital in the "Periphery": The New Argonauts, Global Search, and
Local Institution Building. Economic Geography.
Schachter, J., & Althaus, P. (1989). An Equilibrium Model of Gross Migration.
Journal of Regional Science.
Senor, D., & Singer, S. (2009). Start-up Nation: The Story of Israel's Economic
Miracle.
Simini, F., González, M. C., Maritan, A., & Barabási, A.-L. (2012, April). A universal
model for mobility and migration patterns. Nature, 484(7392), 96-100.
Sjaastad, L. (1962). The Costs and Returns of Human Migration. The journal of
political economy, 70(5), 80-93.
Stephenson, K., & Zelen, M. (1989). Rethinking Centrality: Methods and Examples.
Social Networks, 11.
Storper, M. (2010, December). Why do regions develop and change? The challenge for
geography and economics. Journal of Economic Geography, 11(2), 333-346.
Stricker, K. (2007). Rural Brain Drain. Unpublished doctoral dissertation, Loyola
University Chicago.
Taylor, P. J. (2001, September). Specification of the World City Network.
Geographical Analysis, 33(2), 181-194.
Thiemann, C., Theis, F., Grady, D., Brune, R., & Brockmann, D. (2010, January).
The structure of borders in a small world. PloS one, 5(11), e15422.
Treyz, G. I., Rickman, D. S., Hunt, G. L., & Greenwood, M. J. (1993). The Dynamics
of US Internal Migration. The Review of Economics and Statistics, 75(2), 209-214.
Vanderkamp, J. (1971). Migration flows, their determinants and the effects of return
migration. The Journal of Political Economy, 79(5), 1012-1031.
Whisler, R. L., Waldorf, B. S., Mulligan, G. F., & Plane, D. a. (2008, March). Quality
of Life and the Migration of the College-Educated: A Life-Course Approach.
Growth and Change, 39(1), 58-94.
Wozniak, A. (2010). Are College Graduates More Responsive to Distant Labor
Market Opportunities? Journal of Human Resources(May).
Wright, B., Ellis, M., & Reibel, M. (1997). The Linkage between Immigration and
Internal Migration in Large Metropolitan Areas in the United States. Economic
Geography.
Yezer, A., & Thurston, L. (1976). Migration Patterns and Income Change: Implications
for the Human Capital Approach to Migration. Southern Economic
Journal.