Beyond City Size: Characterizing and predicting the location of urban amenities

Beyond City Size: Characterizing and predicting
the location of urban amenities
by
Elisa Castaner Ensenat
B.S., M.I.T (2014)
Submitted to the Department of Electrical Engineering and
Computer Science
in partial fulfillment of the requirements for the degree of
Masters of Engineering in Electrical Engineering and Computer
Science
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
June 2015
© Massachusetts Institute of Technology 2015. All rights reserved.
Author . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Department of Electrical Engineering and Computer Science
May 8, 2015
Certified by . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Cesar A. Hidalgo
Associate Professor
Thesis Supervisor
Accepted by . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Prof. Albert R. Meyer
Chairman, Masters of Engineering Thesis Committee
Beyond City Size: Characterizing and predicting the
location of urban amenities
by
Elisa Castaner Ensenat
Submitted to the Department of Electrical Engineering and Computer Science
on May 8, 2015, in partial fulfillment of the
requirements for the degree of
Masters of Engineering in Electrical Engineering and Computer Science
Abstract
Intercity studies have shown that a city’s characteristics —ranging from infrastructure to crime—scale as a power of its population. These studies, however,
have not been extended to the intra-city scale, leaving open the question of
how urban characteristics are distributed within a city. Here we study the spatial organization of one important urban characteristic: its amenities, such as
restaurants, cafes, and libraries. We use a dataset summarizing the position of
more than 1.2 million amenities disaggregated into 74 distinct categories and
covering 47 U.S. cities to show that: (i) the spatial distribution of amenities
within a city is characterized by dense agglomerations of amenities (which we
call micro-clusters), (ii) that unlike in the intercity case, size is a poor predictor
of the amenities of each type that locate in each micro-cluster, and (iii) that the
number of amenities of each type in a micro-cluster is better predicted using information on the collocation of amenities observed across all micro-clusters than
using the micro-cluster’s size. Finally, we use these findings to create a recommendation algorithm that suggests amenities that are missing in a micro-cluster
and can inform the efforts of developers and planners looking to construct and
regulate the development of new and existing neighborhoods.
Thesis Supervisor: Cesar A. Hidalgo
Title: Associate Professor
Acknowledgments
I would like to thank my supervisor, Cesar Hidalgo, for the patient guidance, encouragement, and advice he has provided me throughout my time as his student.
I have been extremely lucky to have a supervisor who cared so much about my
work, and who responded to my questions and queries promptly. I would also like
to thank the rest of the Macro Connections group at the MIT Media Lab for the
support and feedback they have given me throughout the year. The completion
of this project wouldn’t have been possible without their help.
Contents
1 Introduction
1.1 Multi-Centers
2 Data
3 Results
3.1 From the intercity to the intra-city scale
3.2 Micro-clusters
3.3 Intra-city scaling
3.4 Recommender System
4 Discussion
A Supplementary Material
A.1 Data
A.2 Intercity Scaling
A.3 Clustering
A.3.1 Effective number of amenities
A.3.2 Identifying cluster centers
A.3.3 Assigning points to clusters
A.4 Collocation of amenities
A.5 Predictions
B Cities Administrative Units and Populations
List of Figures
3-1 Intercity Scaling Relations
3-2 Clustering algorithm
3-3 Intercity Scaling Relations
3-4 City micro-agglomerations
3-5 Prediction of amenities in Boston’s micro-clusters
A-1 Clustering Algorithm: Boston, SF, NY
A-2 Amenities correlations matrix
List of Tables
A.1 Merged amenity types
A.2 Total amenity count and amenity categories
A.3 Cities population and amenity count
A.4 Intercity Scaling parameters per amenity type
A.5 𝑅2 values of intercity and intra-city models
A.6 AIC and BIC values of intercity and intra-city models
B.1 Administrative units included in each city
Chapter 1
Introduction
During the last decade the empirical study of cities has been characterized by a
strong emphasis on scaling relationships connecting the size of a city —measured
by its population—with attributes ranging from the availability of infrastructure
to the presence of crime [1, 2]. This growing literature has shown that these
scaling relationships hold across cities from different cultures and time periods
[2, 3]. Yet, these intercity relationships teach us little about the way in which
these attributes are spatially distributed within a city. In fact, one could easily
construct a model where attributes follow a random spatial distribution within a
city and that also satisfies the intercity scaling relationships documented in the
literature. In this paper we add to this literature by bringing the quantitative
study of cities to the intra-city scale and by showing the statistical principles
that explain the frequency, composition, and location of amenities within a city.
But why is the intra-city scale important? On the one hand, understanding
the distribution of amenities is important for the planners and developers who
shape cities. Planners need to create urban designs that stimulate the virtuous
social interactions that encourage economic activity, reduce levels of crime, and
lower traffic congestion [4, 5, 6, 7, 8]. Developers, who construct buildings looking
for profits, need to create buildings and units that are attractive to residents
and shop owners, and hence, need to understand which types of buildings and
units are better pre-adapted to the uses that a neighborhood might require. On
the other hand, a city’s citizens and visitors can benefit from maps representing
the city at a meso-scale. These meso-scale maps can focus on clusters of amenities instead of individual units, helping uncover the presence of neighborhoods
with an active urban life. Finally, small business owners may also benefit from
a statistical understanding of cities at the intra-city scale, as the empirical laws
describing the location of amenities in neighborhoods can be used to uncover
instances of unsatisfied demand that shop owners can use to identify new business locations (information that is now only available to large franchising
operations, such as Starbucks [9]). So a better understanding of cities at the
intra-city scale can benefit both the planners and developers that shape a city
and the citizens and visitors who use a city’s streets.
Here, we move beyond the intercity scale and studies focused on the size of
cities by looking at data summarizing the precise location of amenities (such
as restaurants, cafes, and libraries) within a city. Our contribution consists of
two parts. First, we introduce a clustering algorithm to show that the spatial
organization of cities is based on hundreds of highly localized micro-clusters of
urban activity, and that the size of a micro-cluster is a poor predictor of
the number of amenities of a certain type that are present in it. This suggests
that, to recover the predictability of the intercity studies —which we reproduce
with our data—we need to use information on the types of amenities that are
present in each micro-cluster. Our second contribution involves the development
of a simple prediction algorithm that exploits information on the patterns of
collocation of amenities observed across thousands of micro-clusters. We use
this algorithm to identify anomalies in the data (which can represent instances
of unsatisfied demand) and use these anomalies to suggest both new amenities
for each cluster and the clusters that are in the direst need of a specific amenity.
Together these results help extend the study of cities to the intra-city scale, and
also, open new avenues of research that focus on the composition of amenities at
the neighborhood scale.
1.1 Multi-Centers
The idea that highly localized clusters of economic activity characterize the distribution of amenities in a city has a long academic tradition. On the one hand
scholars have conducted empirical studies looking to identify and characterize
micro-clusters (or neighborhood scale agglomerations), and on the other hand,
we have models that have been used to explain why economic and social activity
agglomerates.
On the empirical side people have used employment densities [10], commuting
patterns [11], the floor space used by businesses [10], mobile phone and social
media activity [12, 13, 14], and the spatial collocation of commercial units [15] to
identify micro-clusters of urban activity. On the theoretical side, people have developed supply side and demand side theories to explain agglomeration. Supply
side theories of agglomeration focus on externalities, such as knowledge spillovers
[16], shared capacities [17], and transportation costs [18], to explain the collocation of businesses and/or manufacturing activities. Demand side theories focus
on the ability of agglomerations to attract shared customers. The quintessential
demand side model of agglomeration is Hotelling’s 1929 model, which predicts
that similar businesses would collocate to maximize their catchment area. These
demand side stories, of course, also apply to businesses that are not necessarily
similar but complementary, such as shoe stores and clothing stores, and also explain
why businesses that are not closely related (such as car repair shops and
ice-cream stores) tend not to collocate. The rise of novel high-resolution data
sources summarizing the location of urban amenities, however, allows us to explore the collocation of amenities empirically, helping us both validate these
theories and provide new empirical facts that we could use to test new
theories and models.
Chapter 2
Data
We collect data from the Google Places API containing the latitude, longitude,
and type of amenity (e.g. cafe, restaurant, or library) for more than 1.26 million
amenities across 47 US cities (see SM for details). Additionally, we collect data on
the population of each of these cities by identifying all of the administrative units
contained within the area of our amenities data (see SM). For instance, in the
case of Boston, our amenities data includes the areas of Cambridge, Somerville,
and Brookline, so we estimate the population of the larger city of Boston by
summing the populations of these and other administrative units (see SM).
Going forward we use the word city to refer to the naturally occurring urban
agglomerations that people refer to colloquially as “cities”, and not to the narrowly defined administrative units that exist within them (i.e. we use Boston
to refer to the union of the administrative units of Boston, Cambridge, Brookline, Somerville, Newton, etc., and not to the administrative area controlled by
Boston’s City Hall). We adopt this use of the word city because our data involves
contiguous areas that transcend individual administrative units.
Certainly the data from the Google Places API is not free of biases and limitations. The amenities data registered in the Google Places API focuses on
customer-facing businesses and places of interest (from hair salons and bakeries to airports and cemeteries). Therefore, the Google Places API data fails
to include information on other forms of economic activity, such as manufacturing or business-to-business activities. Also, the data might have coding issues,
such as having a restaurant registered as a bar. Moreover, businesses that shut
down, either because they went broke or relocated, might not be removed from
Google Maps, and therefore, the data can contain outdated information. Yet,
despite these limitations, the Google Places API is accurate enough to be the
backbone of the world’s most used mapping service (Google Maps) and is used
daily by millions of individuals to find the location of businesses. This makes the
Google Places API data an imperfect, yet attractive dataset to study the spatial
organization of amenities at the intra-city scale.
Finally, we remind the reader that any results derived in this paper should
be interpreted in the narrow context of the data from which these results were
derived. This is data from an online mapping service and for U.S. cities only.
The question of whether the results presented below can be generalized to other
locations, and also, of whether these results hold for other datasets, is beyond
the scope of this paper.
Chapter 3
Results
3.1 From the intercity to the intra-city scale
We begin by reproducing the well-known intercity scaling laws of Bettencourt
et al. [19, 20, 21] using our urban amenities data. By reproducing these laws
we validate our data in the context of intercity research before presenting our
intra-city contributions.
Figure 3-1 shows the total number of amenities $Y_c$ in a city as a function of
its population. The total number of amenities in a city ($Y_c$) scales sub-linearly
with a city’s population $N_c$ as $Y_c = Y_0 N_c^{\beta}$, with $Y_0 = 2.03$ and $\beta = 0.68$ (Fig.
3-1a, $R^2 = 90\%$, p-value $\ll 1 \times 10^{-5}$), matching Bettencourt et al. scaling laws
[1, 19, 20, 21]. The sub-linearity of this scaling law indicates the presence of
scale economies, since it means that the number of per capita amenities in a city
decreases with a city’s total population.
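A scaling law of this form can be estimated with an ordinary least-squares regression on log-transformed values. The following is a minimal sketch under that assumption (the text does not state which estimator was used); the helper name `fit_power_law` and the synthetic city data are hypothetical and serve only to show that the procedure recovers the reported parameters from noise-free input.

```python
import math

def fit_power_law(populations, counts):
    """Fit counts = Y0 * populations**beta by OLS on log-log values.

    Returns (Y0, beta, r_squared). Hypothetical helper for illustration.
    """
    xs = [math.log(n) for n in populations]
    ys = [math.log(y) for y in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                      # slope in log-log space
    log_y0 = mean_y - beta * mean_x       # intercept in log-log space
    ss_res = sum((y - (log_y0 + beta * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return math.exp(log_y0), beta, r_squared

# Synthetic, noise-free cities generated from the reported parameters
populations = [2e5, 5e5, 1e6, 5e6, 1e7]
counts = [2.03 * n ** 0.68 for n in populations]
y0, beta, r2 = fit_power_law(populations, counts)
```

With real data the points scatter around the fitted line, so the recovered $R^2$ would be below one.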
Next, we explore the exponent of this scaling relationship for amenities of
different types. We find that some amenities, such as museums, religious centers,
and art galleries, scale slowly with a city’s population, roughly as the square root
of a city’s total population (𝛽 ≈ 0.5). Other amenities, such as restaurants,
bakeries, and dentists, scale almost linearly with a city’s population (𝛽 > 0.8;
for a summary of all exponents, see SM Table A.4). This diversity of scaling
relationships tells us that the composition of amenities in a city changes with
a city’s population. For instance, in a city with a population of only half a
million people we expect to find, on average, 46 restaurants per museum, but
in a city with ten times that population we expect to find almost double that
(76 restaurants per museum). These different ratios are direct expressions of the
difference in scaling exponents characterizing the dependence of restaurants and
museums on a city’s size (Figure 3-1b).
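The two ratios quoted above follow directly from the difference in exponents: if restaurants scale with exponent 0.81 and museums with exponent 0.59 (the values reported in Figure 3-1b), the number of restaurants per museum grows as population raised to the power 0.81 − 0.59 = 0.22. A short check of the arithmetic, using only numbers stated in the text:

```python
beta_restaurants = 0.81   # exponent reported in Figure 3-1b
beta_museums = 0.59       # exponent reported in Figure 3-1b
ratio_small_city = 46.0   # restaurants per museum at half a million people

# A tenfold increase in population multiplies the ratio by 10**(0.81 - 0.59)
growth = 10 ** (beta_restaurants - beta_museums)
ratio_large_city = ratio_small_city * growth
print(round(growth, 2), round(ratio_large_city))  # prints 1.66 76
```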
But not all amenities correlate strongly with a city’s population. In fact, the
relationship between the number of amenities and a city’s population is noisy
for many amenities. To distinguish the amenities that correlate strongly with
a city’s population from those that don’t we use the 𝑅2 statistic of the scaling
relationship connecting a city’s population with the number of amenities of each
type (Figure 3-1c). A high 𝑅2 (𝑅2 > 0.5), such as that characterizing the
scaling of restaurants, schools, and shoe stores, means that a city’s population
is a strong predictor of the number of amenities of that type in a city. A low
𝑅2 (𝑅2 < 0.5), such as that characterizing the scaling of museums, embassies,
and universities, means that a city’s population is an incomplete predictor of
the number of amenities of that type in a city. Note that the observed 𝑅2 values, but
not the scaling exponents, would be almost the same if we were to use the total
number of amenities instead of population as a measure of city size, since a city’s
population correlates almost perfectly with that city’s total number of amenities
(Fig. 3-1a 𝑅2 = 90%).
Figure 3-1: Intercity Scaling Relations. a Scaling of the total number of
amenities in a city $c$ ($Y_c$), as a function of a city’s population ($N_c$). The total
number of amenities in a city scales as $Y_c = Y_0 N_c^{\beta}$ with $Y_0 = 2.03$, $\beta = 0.68$,
and $R^2 = 0.90$. Each point represents one of the 47 US cities in our dataset. b
Scaling of the total number of restaurants and museums in a city as a function
of a city’s population. Each point represents the number of restaurants (yellow)
or museums (blue) in a different city. The figure shows that the scaling exponent
of restaurants (𝛽 = 0.81) is larger than the scaling exponent of museums (𝛽 =
0.59) meaning that the number of restaurants per museum increases with a city’s
population. c The scaling exponent (horizontal axis) and the goodness of fit (𝑅2 ,
vertical axis) of the scaling relationship of each amenity type. The horizontal
dashed line separates amenities whose number correlates strongly with a city’s
population (𝑅2 > 0.5) from those characterized by a milder correlation (𝑅2 <
0.5). The vertical dashed line separates amenities that scale with population
faster than the total amount of amenities in a city (𝛽 > 0.68) from those that
scale slower than that.
3.2 Micro-clusters
But do these scaling relationships hold at the intra-city scale? To explore the
intra-city scale we first need to divide the city into meaningful intra-city units.
To perform this division we introduce a clustering algorithm that splits cities
into micro-clusters, which are spatially localized and bounded agglomerations of
amenities. Then, we study the city at the intra-city scale by using micro-clusters
as our unit of study. As a measure of the size of a micro-cluster we use the total
number of amenities present in it. Switching from cities to micro-clusters as our
unit of study will reveal that the size of a micro-cluster, unlike that of a city, is
a poor predictor of the number of amenities of each type present in it. Yet, as
we will show, we can recover some of the predictability lost when moving to the
intra-city scale by using data on the types of amenities that are present in each
micro-cluster (and controlling for over-fitting by using both the Akaike and the Bayesian
Information Criteria).
We begin the spatial clustering of urban amenities by calculating the effective
number of amenities that are present in each location $i$. We define the effective
number of amenities in location $i$ ($I_i$) as the number of amenities that can
be reached by walking from that location. Formally, the effective number of
amenities in location $i$ is the scalar function $I_i$:
$$I_i = \sum_{j=1}^{Y_c} e^{-\gamma d_{ij}}$$
where $d_{ij}$ is the distance between amenity $i$ and amenity $j$, $\gamma$ is a decay
parameter that discounts amenities based on their distance to location $i$, and
$Y_c$ is the total number of amenities in city $c$. To interpret the values of $I$ it is
useful to note that an amenity at the location where the measurement is taking
place (i.e. with $d_{ii} = 0$) contributes one to the effective number of amenities in
that location. An amenity $j$ at distance $d_{ij} = 1/\gamma$ (which would imply walking
$1/\gamma$ kilometers from amenity $i$) will contribute only $1/e$ to location $i$’s effective
number of amenities ($I_i$). We find that our algorithm finds meaningful clusters
when we set $\gamma = 16$, which implies that the contribution of an amenity to the
effective number of amenities of a location falls by a factor of $e$ (to about 37%) every 62.5 meters and
becomes negligible at about 500 meters (for comparison, the short side of a Manhattan block
is about 80 meters long).
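The definition of the effective number of amenities translates into a few lines of code. The sketch below is illustrative only: it assumes that amenity coordinates have already been projected to planar kilometers and it uses straight-line distance in place of walking distance; the function name `effective_amenities` is hypothetical.

```python
import math

def effective_amenities(points_km, gamma=16.0):
    """Return I_i = sum_j exp(-gamma * d_ij) for every amenity i.

    points_km: list of (x, y) planar coordinates in kilometers (assumption:
    latitude/longitude were projected beforehand).
    """
    result = []
    for xi, yi in points_km:
        total = 0.0
        for xj, yj in points_km:
            d_ij = math.hypot(xi - xj, yi - yj)  # straight-line distance in km
            total += math.exp(-gamma * d_ij)     # j == i contributes exactly 1
        result.append(total)
    return result

# Two amenities 1/gamma km (62.5 m) apart: each counts itself (1) plus 1/e
I = effective_amenities([(0.0, 0.0), (1.0 / 16.0, 0.0)])
```

The double loop is quadratic in the number of amenities; for a full city a spatial index or a distance cutoff (contributions beyond roughly 500 meters are negligible) would be a natural optimization.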
Figure 3-2 illustrates our clustering algorithm using the city of Boston as an
example. The bottom layer (Fig. 3-2a) is a map of Boston used for spatial reference. The center layer (Fig. 3-2b) shows Boston’s effective number of amenities
(𝐼) for all the locations where an amenity is present. The top layer (Fig. 3-2c)
shows the clusters identified using our algorithm (with different colors).
To identify the amenities belonging to each cluster we begin by identifying
each local peak on the effective number of amenities landscape defined by 𝐼 (Fig.
3-2b) as the center of a potential micro-cluster. We identify these local peaks
by searching for locations that have an effective number of amenities 𝐼 larger
than their 𝑛 nearest neighbors (using a functional heuristic to find the 𝑛 that
works best for each 𝐼—see SM). Then, we assign amenities to a micro-cluster by
using the following greedy algorithm: (i) We initialize clusters by assigning to
each cluster center all amenities that are in close proximity to it (less than 0.5
km). (ii) We calculate the distance between each unassigned amenity and the
amenities that have been assigned to a cluster. (iii) We assign to a cluster only
the amenity that is closest to an amenity that has already been assigned to a
cluster. And (iv), we recalculate the distance between assigned and unassigned
amenities and repeat steps (iii) and (iv) until all amenities have been assigned to
a cluster. An example of the clusters found for the city of Boston is shown in
Figure 3-2c (see SM for more examples).
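The peak search and the greedy steps (i) to (iv) can be sketched as follows. This is a simplified illustration, not the implementation used for the results: the number of neighbors `n_neighbors` is passed in directly (the heuristic for choosing it is described in the SM), amenities within 0.5 km of several centers are assigned to the nearest one (an assumption), coordinates are taken to be planar kilometers, and all function names are hypothetical.

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def find_peaks(points, I, n_neighbors):
    """Indices whose I exceeds that of their n nearest neighbors."""
    peaks = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: dist(p, points[j]))
        nearest = others[:n_neighbors]
        if all(I[i] > I[j] for j in nearest):
            peaks.append(i)
    return peaks

def assign_clusters(points, peaks, radius_km=0.5):
    """Greedy assignment; requires at least one peak."""
    label = {}
    # (i) seed each cluster with the amenities within radius_km of its center
    for i, p in enumerate(points):
        d, c = min((dist(p, points[c]), c) for c in peaks)
        if d < radius_km:
            label[i] = c
    # (ii)-(iv) repeatedly attach the unassigned amenity that is closest
    # to any already-assigned amenity, inheriting that amenity's cluster
    unassigned = set(range(len(points))) - set(label)
    while unassigned:
        d, u, a = min((dist(points[u], points[a]), u, a)
                      for u in unassigned for a in label)
        label[u] = label[a]
        unassigned.remove(u)
    return label

# Toy example: two groups of amenities along a line (coordinates in km)
points = [(0.0, 0.0), (0.1, 0.0), (0.7, 0.0), (3.0, 0.0), (3.1, 0.0)]
I = [3.0, 2.0, 1.0, 3.0, 2.0]            # made-up effective amenity values
peaks = find_peaks(points, I, n_neighbors=2)
labels = assign_clusters(points, peaks)
```

In the toy example the amenity at 0.7 km lies outside the seeding radius of both centers and is attached in the greedy phase to the cluster of its closest assigned neighbor. The brute-force minimum search is written for clarity and would be too slow for a real city.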
Figure 3-2: Clustering algorithm. a Map of Boston. b The number of effective
amenities (𝐼) at each location where an amenity is present in Boston. Peaks
represent locations with a high number of effective amenities and valleys represent
locations with a low number of effective amenities. The black dots represent the
local maxima identified by our clustering algorithm. These points represent the
centers of a micro-cluster (for example, Kendall/MIT or the North End). c
Clusters identified using our clustering algorithm. Each cluster is expressed as
a set of dots of the same color, each dot representing an amenity. The center of
each cluster is marked using a black dot.
Overall, we find that the clusters identified using this algorithm correspond
to well-known centers of urban activity. In the case of Boston these clusters
include Harvard Square and Central Square in Cambridge and The North End
and Coolidge Corner in Boston, among others.
We also note that the distribution of the effective number of amenities in
a city is characterized by some universal properties. Figure 3-3a shows
the distribution of the effective number of amenities ($I$) for every city in our
dataset while Figure 3-3b shows the same distribution after normalizing the
effective number of amenities in a city by that city’s average ($\langle I \rangle = \frac{1}{Y_c} \sum_i I_i$). For
comparison, we also show the same distributions for an ensemble of cities where
the location of each amenity has been randomized. These randomized cities are
characterized by a narrow distribution for their effective number of amenities,
meaning that these random cities lack the high concentrations of amenities that
indicate the presence of micro-clusters in real cities. More importantly, Figure
3-3b shows that once we normalize the effective number of amenities in a city by
that city’s average all cities follow the same lognormal distribution
$$P\left(\frac{I_i}{\langle I \rangle} = x\right) = \ln\mathcal{N}(\mu, \sigma)$$
with $\mu = -0.404$ and $\sigma = 0.89$. The existence of a universal distribution for
the effective number of amenities across all cities in our sample means that all
of these cities have an equal number of peaks and valleys of a given magnitude
when the magnitude of these peaks and valleys is measured in units of that city’s
average.
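As a consistency check on the reported parameters, the mean of a lognormal distribution is $e^{\mu + \sigma^2/2}$, and because each value is divided by its city’s average, this mean should be close to one. The snippet below assumes that $\mu$ and $\sigma$ are the mean and standard deviation of the natural logarithm of the normalized values, which is the usual parameterization but is not stated explicitly in the text.

```python
import math

mu, sigma = -0.404, 0.89               # parameters reported in the text
mean = math.exp(mu + sigma ** 2 / 2)   # mean of a lognormal distribution
median = math.exp(mu)                  # median of a lognormal distribution
print(mean, median)
```

The mean comes out within one percent of one, as expected for values normalized by their average, while the median is roughly two thirds of the average, which reflects the long right tail produced by the densest locations.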
Figure 3-3: Intercity Scaling Relations. a The distribution of the effective
number of amenities in each US city. Blue lines show the distribution observed in
our urban amenities data and orange lines show the distribution observed after
randomizing the location of amenities for each city. b The distribution of the
effective number of amenities in each US city normalized by the average effective
number of amenities in that city. Blue lines show the distribution observed in
the cities data and orange lines show the distribution observed in the same cities
but after randomizing the location of amenities.
3.3 Intra-city scaling
Now that we have identified micro-clusters for all cities in our data we analyze
whether the scaling relationships that hold at the intercity scale also hold at the
scale of micro-clusters (i.e. we test whether the number of amenities of each
type in a cluster scales with the size of that cluster). Figure 3-4a compares the
scaling relationships observed at the intercity scale with the scaling relationships
observed at the intra-city scale for a subset of amenities and two different models
(for all amenities see SM Table A.5). In light colors (light blue and vermillion) we
show the accuracy of models predicting the number of amenities of a given type in
a city or a micro-cluster using only information on that city or cluster’s size. The
dark bars (navy and crimson) show the accuracy of a model using information
on the composition of amenities in a city or micro-cluster (which we will explain
later). The comparison between the size-based models shows that amenities
such as schools, doctors, and shoe stores, which correlate strongly with the total
number of amenities in a city (average intercity scaling 𝑅2 > 70%), do not scale
well with the total number of amenities in a micro-cluster (average intra-city
𝑅2 < 18%). This indicates that the scaling laws observed at the intercity scale
fail to hold, for most amenities, at the intra-city scale.
Next, we try to recover some of the predictability lost at the intra-city scale
by introducing a model based on the composition of a micro-cluster—the types
of amenities present in it.
3.4 Recommender System
We begin the construction of the composition-based model by studying the collocation of pairs of amenities across all clusters. Figure 3-4b shows the network of
correlations between pairs of amenities calculated using Spearman’s rank correlation
across all clusters. We build the skeleton of this network using a Maximum
Spanning Tree algorithm and then add edges between amenities that have a pairwise correlation equal to or larger than 0.3 (see SM for the full correlations matrix)
[22]. The network shows that amenities tend to collocate with other amenities
of similar types. For example, car repair shops collocate with car dealers (Spearman’s 𝜌 = 0.45), religious centers collocate with schools (Spearman’s 𝜌 = 0.46),
and nightclubs collocate with bars (Spearman’s 𝜌 = 0.36). Also, the network
shows that amenities sometimes tend to collocate with amenities from different
categories. For instance, clothing stores collocate with restaurants and beauty
salons (Spearman’s 𝜌 = 0.52 and 𝜌 = 0.45, respectively). What is more important,
however, is that these patterns of collocation suggest that it is possible to create
a parsimonious model to predict the number of amenities of a type in a cluster
using information on the presence of other amenities in it, since the network
indicates that the presence of a set of amenities in a cluster carries information
about the presence of other amenities.
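The construction of the collocation network described above (Spearman correlations, a maximum spanning tree as skeleton, plus all edges at or above 0.3) can be sketched in plain Python. The helper names and the toy counts are hypothetical, the spanning tree is built with Kruskal’s algorithm (the text does not say which algorithm was used), and the sketch assumes that every amenity type has counts that vary across clusters, since a constant column has no defined rank correlation.

```python
def ranks(values):
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def collocation_network(counts, threshold=0.3):
    """counts: {amenity_type: [count in cluster 0, cluster 1, ...]}."""
    types = sorted(counts)
    pairs = [(spearman(counts[a], counts[b]), a, b)
             for i, a in enumerate(types) for b in types[i + 1:]]
    pairs.sort(reverse=True)
    parent = {t: t for t in types}

    def find(t):
        while parent[t] != t:
            t = parent[t]
        return t

    edges = set()
    for rho, a, b in pairs:            # Kruskal: strongest correlations first
        ra, rb = find(a), find(b)
        if ra != rb:                   # edge joins two separate components
            parent[ra] = rb
            edges.add((a, b))
    # add every remaining pair at or above the correlation threshold
    edges |= {(a, b) for rho, a, b in pairs if rho >= threshold}
    return edges

# Made-up counts of three amenity types in five clusters
counts = {"bar": [0, 1, 2, 5, 9],
          "nightclub": [0, 0, 1, 2, 4],
          "car_repair": [5, 3, 2, 1, 0]}
edges = collocation_network(counts)
```

Note that the spanning tree keeps every amenity type connected, so a type that correlates negatively with all others (car repair in the toy data) still receives one edge, to its least negatively correlated partner.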
Finally, we use the collocation of amenities in a cluster to create an algorithm that we can use to predict the number of amenities that should locate in
each micro-cluster and create a recommender system that we can use to identify
micro-clusters where particular amenities are over- or under-supplied. To create
this algorithm we need to go beyond pairwise correlations, as the high clustering
of the network of collocations (Fig. 3-4b) indicates that the information about
the presence of an amenity in a cluster carried by the presence of other amenities is likely to have some redundancy. Going forward, we go beyond pairwise
correlations by using a forward selection algorithm that iteratively adds types of
amenities to a regression until the contribution of the presence of a new amenity
type to the predictive power of the regression is characterized by a p-value of
more than 0.001 (see SM).
Figure 3-4: Micro-Cluster Composition. a Light blue and light red bars,
respectively, correspond to the 𝑅2 of the predictions obtained using the size of
a city (left) and the size of each micro-cluster (right). The dark blue and dark
red bars correspond, respectively, to the 𝑅2 of the predictions obtained using the
composition of cities (left) and the composition of micro-clusters (right). (For
all amenities see SM). b The nodes in the network represent different types of
amenities and the edges connect amenities that are likely to collocate in a micro-cluster (see SM). The width of the edges connecting a pair of nodes is proportional
to the Spearman correlation obtained from the collocation of the two types of
amenities across all micro-clusters. The size of a node is proportional to the
number of times that an amenity is present in our data set. The color of each
node represents the category that the amenity belongs to.
In addition, we validate the models resulting from this forward selection algorithm by using both Akaike’s Information Criterion
(AIC) and the Bayesian Information Criterion (BIC). By using AIC and BIC we ensure that the models that we obtain are not better than the models using size
simply because they include more variables.
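To make the role of the two criteria concrete, the sketch below computes AIC and BIC for a least-squares model under the standard assumption of Gaussian errors. Both criteria reward a lower residual sum of squares and penalize each additional parameter, with BIC applying the heavier penalty once there are more than about eight observations. The helper name, the number of micro-clusters, the parameter counts, and the two $R^2$ values are all illustrative choices, not results taken from the analysis.

```python
import math

def aic_bic(rss, n_obs, n_params):
    """AIC and BIC of a least-squares fit with Gaussian errors.

    Both are given up to an additive constant that is identical for models
    fitted to the same observations, so only differences are meaningful.
    """
    fit_term = n_obs * math.log(rss / n_obs)
    aic = fit_term + 2 * n_params
    bic = fit_term + n_params * math.log(n_obs)
    return aic, bic

# Illustrative numbers: 1,000 micro-clusters, total sum of squares of 100
n, tss = 1000, 100.0
size_model = aic_bic(rss=tss * (1 - 0.17), n_obs=n, n_params=2)
composition_model = aic_bic(rss=tss * (1 - 0.35), n_obs=n, n_params=8)
```

In this illustration the composition model has the lower (better) AIC and BIC despite its six extra parameters, because the improvement in fit outweighs the penalty; adding a variable that does not reduce the residuals would raise both criteria.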
The red bars of Figure 3-4a (vermillion and crimson) compare the 𝑅2 of the
models constructed using the size of micro-clusters with the 𝑅2 of the models
constructed using the composition of micro-clusters. In most cases (66/74 =
89%), the BIC test chooses the regression using the composition of a micro-cluster
over the regression using its size (the exceptions are airports, aquariums, bus
stations, car rentals, casinos, convenience stores, gas stations, and zoos). Also, we
note that these results are not just statistically significant, but characterized by
large effect sizes. On average, for the 66 amenity types in which the composition
model works better, the 𝑅2 of the composition model is twice that of the model
using size only (𝑅2 = 17% on average using size vs. 𝑅2 = 35% on average
using composition), meaning that the increase in predictive power obtained by
considering the composition of amenities in a cluster is not only statistically
significant, but also substantial.
Finally, we use the composition model described above to create a recommender system [22, 24] to suggest amenities that might be missing in an urban
cluster. We predict missing amenities by calculating the difference between the
number of amenities in a cluster predicted by the composition model and the
number of amenities of that type observed in each cluster.
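The recommendation step reduces to ranking the gaps between predicted and observed counts. A minimal sketch, assuming the predictions of the composition model are already available as a dictionary; the function name, the cluster names, and all counts below are made up for illustration.

```python
def recommend(observed, predicted, top=3):
    """Rank (cluster, amenity_type) pairs by predicted minus observed count.

    observed, predicted: {cluster: {amenity_type: count}}.
    Large positive gaps suggest an amenity that may be missing.
    """
    gaps = []
    for cluster, types in predicted.items():
        for amenity_type, expected in types.items():
            gap = expected - observed[cluster].get(amenity_type, 0)
            gaps.append((gap, cluster, amenity_type))
    gaps.sort(reverse=True)
    return [(c, t, round(g, 2)) for g, c, t in gaps[:top] if g > 0]

# Made-up counts for two hypothetical clusters
observed = {"cluster_a": {"hotel": 1, "car_park": 4},
            "cluster_b": {"hotel": 6, "car_park": 2}}
predicted = {"cluster_a": {"hotel": 4.5, "car_park": 3.0},
             "cluster_b": {"hotel": 5.0, "car_park": 6.5}}
suggestions = recommend(observed, predicted)
```

Here the largest gap is for car parks in `cluster_b`, followed by hotels in `cluster_a`; pairs where the observed count exceeds the prediction are not suggested.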
Figure 3-5 compares the number of car parks, hotels, and beauty salons, observed and predicted, for each micro-cluster in Boston. Points above the lines,
such as Harvard Square in car parks (Figure 3-5a), the North End in hotels (Figure 3-5b), and Central Square in Beauty Salons (Figure 3-5c), suggest instances
of unsatisfied demand. Points below the lines such as Boston’s Theatre District
in car parks, Coolidge Corner in hotels, and Winthrop in beauty salons, suggest
instances of excess supply. Of course, these suggestions should not be taken
literally. For instance, a decision to build new parking in Harvard Square
requires considering many aspects of Harvard Square that are not
included in our model, such as the aesthetics of its architecture [25, 26] or the
externalities caused by cars. Nevertheless, this validation shows that our model
automatically captures the under-supply of parking that characterizes Harvard
Square (and that is well known to Cambridge residents). Figure 3-5b, on the other
hand, shows that our model suggests a lack of hotels in the North End, a well-known tourist spot where only a handful of hotels are present. This could mean
that there is great potential for new hotels to locate in Boston’s North End, but
once again, this is a decision that would need to incorporate other factors, such
as the North End’s famously idiosyncratic architecture and active resident community
[4].
Figure 3-5: Prediction of amenities in Boston’s micro-clusters. Observed vs. predicted number of a car parks, b hotels, and c beauty salons for
each micro-cluster in Boston. Points above the lines represent micro-clusters
where the predicted number of amenities is higher than the observed, suggesting
instances of unsatisfied demand (or missing data). Points below the lines represent micro-clusters where the predicted number of amenities is lower than the
observed, suggesting instances of excess supply.
Chapter 4
Discussion
In recent years the quantitative study of cities has focused extensively on
inter-city studies, and in particular, on inter-city scaling laws. These inter-city
studies, however, do not tell us much about the spatial distribution of a city’s
characteristics. In this thesis we extended this literature to the intra-city scale
by focusing on micro-clusters of urban amenities and by showing that the scaling
laws that hold at the inter-city scale need to be replaced by multivariate statistical models that exploit information on the composition of micro-clusters to
predict the number of amenities of each type that are present in each micro-cluster.
Of course, our results and models are not free of biases and limitations. Beyond the data biases described above, our model is limited by its simplicity, which
bounds the total amount of variance in the presence of amenities that we can
explain. Our statistical model predicts the number of amenities that locate in a
micro-cluster using regressions without interaction terms. This means that the
models could potentially be improved by using more complex functional forms,
but also by adding to them information that is not expressed in the presence of
amenities, such as the aesthetic appeal of a neighborhood’s architecture [25, 26],
its foot traffic as captured by mobile phone data [27], or the centrality of the
urban micro-cluster in the context of the city.
Still, the results and methods presented here point to interesting new avenues
of research. For example, time-resolved data sources for both amenities and
streetscapes could be used to explore the interaction between the dynamics of
the amenities that locate in a micro-cluster and the types of buildings being
constructed in it. Also, these results could be used to help inform what types
of business permits need to be given out to help balance the micro-clusters of a
city’s neighborhoods. On the computational side, the information uncovered here
could be used to create new meso-scale city maps that help users understand a
city’s micro-clusters and that deliver the recommendations for each micro-cluster
uncovered by our algorithm or similar algorithms. Together, our results, and the
new avenues of research they open, should help stimulate further quantitative
study of the multivariate statistical laws that characterize cities at the intra-city
scale.
Appendix A
Supplementary Material
A.1 Data
Amenities Data: We collected data from the Google Places API containing
the latitude, longitude and type (cafe, restaurant, library, etc.) of the urban
amenities located in 47 US cities. The original data set contains 95 different
types of amenities but we merged them into 74 categories by aggregating data
on amenities that fulfill similar functions (Table A.1) and excluding amenities
that are unspecific (such as the "store" category) or for which little data is
available. The amenities we exclude are: taxi stand, campground, store, subway
station, RV park, movie rental, and shopping mall. The resulting amenities are
shown in Table A.2.
Population Data: We collected data on the population of each city from
Wikipedia. Table B.1 shows all the administrative units in each city that overlap with our amenities data, and their population as indicated in Wikipedia.
To obtain each city’s population we add up the populations of the administrative units that overlap with our amenities data for that city. The final
population of cities and their total number of amenities are shown in Table A.3.
Original Amenities                                                        New Amenities
Hindu temple, Mosque, Place of worship, Synagogue, Church                 Religious center
Meal delivery, Meal takeaway, Food, Restaurant                            Restaurant
Health, Doctor                                                            Doctor
Finance, Bank                                                             Finance
Roofing Contractor, Electrician, Plumber, Painter, General Contractor     Construction contractor
Table A.1: The left column shows the amenities that were merged into a new
amenity type, shown in the right column.
Amenity
Accounting
Airport
Points
17280
1535
Category
Services
Transportation
Amusement
park
Aquarium
Art gallery
1017
Entertainment
492
5358
Entertainment
Entertainment
ATM
30753
Services
Bakery
Bar
Beauty salon
Bicycle store
Book store
9255
21506
41851
1409
3417
Food & Drinks
Food & Drinks
Services
Shopping
Shopping
Amenity
Gym
Hardware
store
Home goods
store
Hospital
Hotel
and
lodging
Insurance
agency
Jewelry store
Laundry
Lawyer
Library
Liquor store
Points
5934
4595
Category
Health
Shopping
29537
Shopping
7942
11452
Health
Services
27866
Services
6751
14391
37611
3466
7948
Shopping
Services
Services
Education
Shopping
Bowling
alley
366
Entertainment
Government
Services
Services
Services
Entertainment
Other
Government
Shopping
Local Gov- 10081
ernment
Office
Locksmith
2182
Movie The- 1232
ater
Moving
12744
Company
Museum
2161
Night Club
5675
Park
25723
Parking
5527
Pet Store
2270
Pharmacy
15204
Physiotherapist 7929
Bus station
Cafe
110642
9485
Transportation
Food & Drinks
Car dealer
11603
Services
Car rental
Car repair
Car wash
Casino
Cemetery
City hall
Clothing
store
Construction
contractor
Convenience
store
Courthouse
2968
40215
3202
172
2386
140
29806
86044
Services
Police
1613
Government
13818
Shopping
Post Office
2723
Services
717
Government
39484
Services
Dentist
26071
Health
58468
Other
Department
store
Doctor
Electronics
store
Embassy
Finance
Fire station
Florist
3515
Shopping
Real Estate
Agency
Religious
Centers
Restaurant
112430
Food & Drinks
153772
11876
Health
Shopping
School
Shoe Store
46516
8612
Education
Shopping
688
32221
2050
5102
Government
Services
Government
Shopping
2843
1245
5849
1262
Health
Entertainment
Services
Transportation
Funeral
home
Furniture
store
Gas station
2761
Services
7394
Services
12379
Shopping
Spa
Stadium
Storage
Train
Station
Travel
Agency
University
6597
Education
2552
Services
5373
Services
Grocery or
supermarket
15206
Shopping
Veterinary
Care
Zoo
114
Entertainment
Total
1,262,374
Services
Entertainment
Services
Entertainment
Food & Drinks
Other
Transportation
Shopping
Shopping
Health
Table A.2: Total number of amenities of each type in the Google Places data set
for the 47 US cities in our study. The Category column shows the category we
assign each amenity type to when we study the collocation of amenities.
City               Population    Number of Amenities
Atlanta               447,841         19,050
Austin                885,400         22,592
Baltimore             642,587         14,434
Birmingham            389,250         15,066
Boston              1,121,438         19,769
Buffalo               258,959          7,409
Charlotte             850,880         19,954
Chicago             3,618,465         64,531
Cincinnati            453,968         13,818
Cleveland             685,931         18,496
Columbus            1,128,075         27,854
Dallas              2,435,949         44,358
Denver              1,757,830         32,731
Detroit               973,284         21,776
Houston             3,362,560         80,011
Indianapolis        1,468,843         21,296
Jacksonville        1,007,094         20,466
Las Vegas           1,850,966         29,009
Los Angeles         6,428,879        114,002
Louisville            840,601         22,425
Memphis               832,803         21,350
Miami                 800,216         13,403
Milwaukee             822,777         20,590
Naples                 95,796          5,970
Nashville             737,796         21,619
New Orleans           570,943         14,607
New York            8,405,837         75,081
Oklahoma              922,506         21,010
Orlando               493,524         20,559
Philadelphia        1,945,795         40,410
Phoenix             2,046,991         39,354
Pittsburgh            466,879         15,714
Portland              609,456         21,043
Providence            290,459          6,653
Raleigh               582,834         15,884
Richmond              262,944          9,437
Sacramento            767,408         20,372
Salt Lake             210,806          9,444
San Antonio         1,511,307         35,255
San Diego           2,297,970         46,614
San Francisco         837,442         18,984
San Jose            1,472,951         30,868
Seattle               622,155         20,514
St Louis              361,273         12,125
Tampa                 742,583         25,285
Virginia Beach        448,479         10,619
Washington          1,267,943         20,310
Total              60,940,877      1,236,151
Table A.3: Population and total number of amenities of each city.
A.2 Intercity Scaling
We explore the scaling exponent $\beta$ of the scaling relationship $Y_{ck} = Y_0 N_c^{\beta}$ between
the number of amenities of type $k$ in a city $c$, $Y_{ck}$, and the population of that city, $N_c$, finding
that scaling exponents vary greatly across amenity types. Table A.4 shows $Y_0$,
$\beta$, and $R^2$ of the scaling relationship for each type of amenity.
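One common way to estimate the parameters of such a power law is ordinary least squares on the logarithms of both variables. The sketch below assumes that approach; the thesis does not state the fitting procedure, and the function name `fit_scaling` and the synthetic data are illustrative only. Cities with a zero count would have to be excluded before taking logarithms.

```python
import math

def fit_scaling(populations, counts):
    """Fit counts = Y0 * population**beta by least squares on the logarithms.

    Returns (Y0, beta, R2), where R2 is measured in log-log space.
    """
    xs = [math.log(n) for n in populations]
    ys = [math.log(y) for y in counts]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    y0 = math.exp(mean_y - beta * mean_x)
    r2 = sxy ** 2 / (sxx * syy)
    return y0, beta, r2

# Synthetic cities that follow an exact power law with Y0 = 0.01 and beta = 0.8.
populations = [100_000, 500_000, 1_000_000, 5_000_000]
counts = [0.01 * n ** 0.8 for n in populations]

y0, beta, r2 = fit_scaling(populations, counts)
print(round(y0, 4), round(beta, 4), round(r2, 4))
```

On real data the fit would be repeated once per amenity type, giving one row of Table A.4 per type.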
Amenity
Accounting
Airport
Amusement
park
Aquarium
Art gallery
𝑌0
0.014
0.003
0.000
𝛽
0.727
0.655
0.769
𝑅2
0.751
0.362
0.309
𝑌0
0.002
0.009
0.028
𝛽
0.797
0.658
0.710
𝑅2
0.869
0.783
0.709
0.006
0.003
0.725
0.797
0.837
0.712
0.013
0.763
0.582
0.003
0.003
0.536
0.005
0.001
0.094
0.766
0.814
0.520
0.681
0.897
0.553
0.833
0.775
0.627
0.755
0.775
0.780
0.000
0.001
0.009
0.861
0.704
0.734
0.526
0.776
0.544
0.681
0.613
0.601
0.035
0.124
0.285
0.846
0.620
Amenity
Gym
Hardware store
Home
goods
store
Hospital
Hotel and lodging
Insurance
agency
Jewelry store
Laundry
Lawyer
Library
Liquor store
Local Government Office
Locksmith
Movie Theater
Moving Company
Museum
Night Club
Park
Parking
Pet Store
Pharmacy
Physiotherapist
Police
0.000
0.010
0.765
0.653
0.621
0.742
Atm
0.041
0.690
0.695
Bakery
Bar
Beauty salon
Bicycle store
Book store
Bowling alley
0.001
0.016
0.025
0.001
0.002
0.002
0.847
0.728
0.745
0.733
0.735
0.589
0.936
0.799
0.816
0.699
0.829
0.421
Bus station
cafe
Car dealer
18.313
0.004
0.015
0.320
0.767
0.684
0.250
0.756
0.420
Car rental
Car repair
Car wash
Casino
Cemetery
City hall
Clothing store
Construction
contractor
Convenience
store
Courthouse
0.000
0.024
0.001
0.004
0.006
0.004
0.012
0.179
0.839
0.743
0.790
0.461
0.634
0.465
0.772
0.656
0.010
0.003
0.014
0.010
0.000
0.007
0.008
0.001
0.592
0.762
0.751
0.667
0.891
0.763
0.706
0.737
0.692
0.852
0.736
0.765
0.890
0.908
0.830
0.744
0.054
0.611
0.462
Post Office
0.002
0.736
0.888
0.001
0.710
0.717
0.041
0.704
0.710
Dentist
0.002
0.902
0.823
0.156
0.639
0.748
Department
store
Doctor
Electronics
store
Embassy
Finance
Fire station
Florist
Funeral home
0.005
0.680
0.519
Real
Estate
Agency
Religious Centers
Restaurant
0.024
0.817
0.945
0.205
0.003
0.689
0.803
0.818
0.790
School
Shoe Store
0.015
0.002
0.790
0.827
0.925
0.887
0.000
0.024
0.011
0.002
0.018
0.838
0.729
0.583
0.766
0.569
0.153
0.798
0.371
0.843
0.564
Spa
Stadium
Storage
Train Station
Travel Agency
0.000
0.003
0.003
0.005
0.001
0.853
0.643
0.747
0.558
0.856
0.687
0.460
0.454
0.133
0.857
Furniture store
Gas station
Grocery or supermarket
0.011
0.005
0.006
0.716
0.652
0.766
0.796
0.289
0.878
University
Veterinary Care
Zoo
0.149
0.002
0.009
0.482
0.764
0.398
0.225
0.648
0.364
Table A.4: Values of the parameters $Y_0$, $\beta$, and $R^2$ of the scaling relationship between the total number of amenities of each type in a city, $A_{ck}$, and that
city’s population, $N_c$, expressed as $A_{ck} = Y_0 N_c^{\beta}$.
A.3 Clustering
A.3.1 Effective number of amenities
We begin our clustering procedure by calculating the effective number of amenities at each location. The effective number of amenities, $I_i$, in a location $i$
represents the number of amenities that can be reached by walking from that
location. We define $I_i$ as:

$$I_i = \sum_{j=1}^{Y_c} e^{-\gamma d_{ij}} = \sum_{j=1}^{k} e^{-\gamma d_{ij}} + \sum_{j=k+1}^{Y_c} e^{-\gamma d_{ij}} = \sum_{j=1}^{k} e^{-\gamma d_{ij}} + \epsilon$$

where $d_{ij}$ is the distance (in km) between amenity $i$ and amenity $j$, amenities are
indexed in order of increasing distance from $i$, and $Y_c$ is
the total number of amenities in a city $c$. $\gamma$ is a decay parameter that discounts
amenities based on their distance to location $i$. We set $\gamma = 16$ km$^{-1}$, meaning that
the contribution of an amenity to the effective number of amenities at a location
falls by a factor of $e$ every 62.5 meters (that is, it halves roughly every 43 meters)
and becomes negligible at about 500 meters ($e^{-8} \approx 0.0003$).
To simplify the calculation of the effective number of amenities in a location we
use $k$ amenities instead of $Y_c$. Theoretically all of the amenities in a city should
contribute to a location’s effective number of amenities, but since amenities that
are far from a location are discounted by an exponential factor, considering
the contribution of the $k$ closest amenities already gives a good approximation.
In general, we find that the effective number of amenities for a location does
not change after considering the first few hundred amenities, indicating that
$k = 2{,}000$ provides a set that is large enough to provide a good estimate for a
location’s effective number of amenities.
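The calculation can be sketched in a few lines. This is an illustrative brute-force version, not the thesis code: it assumes that coordinates have already been projected to planar kilometers, that an amenity does not count itself, and the function name `effective_amenities` is hypothetical. At city scale a spatial index would be needed to find the nearest amenities efficiently.

```python
import math

def effective_amenities(points, gamma=16.0, k=2000):
    """Effective number of amenities for every point.

    points: list of (x_km, y_km) planar coordinates in kilometers.
    Only the k nearest other amenities contribute to each sum.
    """
    result = []
    for i, (xi, yi) in enumerate(points):
        distances = sorted(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i  # assumption: an amenity does not count itself
        )
        result.append(sum(math.exp(-gamma * d) for d in distances[:k]))
    return result

# Two amenities 62.5 m apart and a third one 10 km away.
points = [(0.0, 0.0), (0.0625, 0.0), (10.0, 0.0)]
intensity = effective_amenities(points)
print([round(value, 4) for value in intensity])
```

With a decay of 16 per kilometer, the neighbor at 62.5 meters contributes exp(-1) to the sum, and the amenity 10 km away contributes essentially nothing.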
A.3.2 Identifying cluster centers
We continue our clustering procedure by identifying the centers of each micro-cluster as the local peaks of the landscape defined by the effective number of amenities. We identify local peaks by searching
for locations that have an effective number of amenities, $I_i$, larger than that of each of their $n_i$
nearest neighbors. We define $n_i$ as $n_i = 3 I_i + 50$, i.e. an increasing function of the effective
number of amenities at location $i$, so that the centers of very dense clusters
are required to have a larger $I_i$ than a large number of neighboring amenities, while
centers of very sparse clusters are required to have a larger $I_i$ than a small number
of neighboring amenities. By letting $n_i$ grow with $I_i$ we avoid assigning
multiple cluster centers to areas with a high density of amenities, and we avoid
assigning no cluster center to areas with a low density of amenities.
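A minimal sketch of this peak search is given below. It is a brute-force illustration rather than the thesis implementation: coordinates are assumed to be planar kilometers, the neighbor count is truncated to an integer (the text does not say how it is rounded), and the names `cluster_centers`, `slope`, and `offset` are hypothetical.

```python
import math

def cluster_centers(points, intensity, slope=3.0, offset=50.0):
    """Indices of points whose intensity exceeds that of their n_i nearest neighbors.

    points: list of (x_km, y_km); intensity: effective number of amenities per point.
    """
    centers = []
    for i, (xi, yi) in enumerate(points):
        n_i = int(slope * intensity[i] + offset)  # assumption: truncate to an integer
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.hypot(xi - points[j][0], yi - points[j][1]),
        )
        if all(intensity[i] > intensity[j] for j in others[:n_i]):
            centers.append(i)
    return centers

# Seven amenities on a line, 1 km apart, with two bumps in intensity.
points = [(float(x), 0.0) for x in range(7)]
intensity = [1.0, 2.0, 5.0, 2.0, 1.0, 4.0, 1.0]

print(cluster_centers(points, intensity))                        # wide neighborhoods
print(cluster_centers(points, intensity, slope=0.0, offset=2.0))  # two neighbors only
```

With the default parameters every point is compared against all others in this tiny example, so only the global maximum survives; shrinking the neighborhood to two neighbors lets the second, smaller bump qualify as a center as well.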
A.3.3 Assigning points to clusters
Finally, we assign points to micro-clusters using the cluster centers we obtained.
First, we remove the 10% of the points in each city with the lowest effective
number of amenities, to eliminate isolated amenities that are not part of a micro-cluster. After that, we assign all amenities that are within a distance of 0.5 km
of a cluster center to that cluster center. Then, we calculate the distance from
each unassigned point to each assigned point, and iterate the following steps:
1. Choose the unassigned point, $u$, which is closest to an assigned point, $a$.
2. Assign point $u$ to the cluster that point $a$ belongs to.
3. Calculate the distance from each unassigned point to the newly assigned
point $u$.
The algorithm terminates once all points have been assigned to a cluster. Figure A-1 shows the effective number of amenities in the cities of Boston, San Francisco,
and New York (left figures), and the corresponding assignments of amenities to
clusters (right figures).
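The assignment procedure above can be sketched as follows. This is an illustrative quadratic-time version under stated assumptions: coordinates are planar kilometers, a point within 0.5 km of several centers goes to the nearest one (the text does not specify this case), and the function name `assign_clusters` and the toy data are hypothetical.

```python
import math

def assign_clusters(points, intensity, centers, radius_km=0.5, drop_frac=0.10):
    """Map each retained point index to the index of its cluster center."""
    n = len(points)

    def dist(i, j):
        return math.hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])

    # Remove the fraction of points with the lowest effective number of amenities.
    dropped = set(sorted(range(n), key=lambda i: intensity[i])[:int(drop_frac * n)])

    # Seed each cluster with the points within radius_km of a center.
    label = {}
    for i in range(n):
        if i in dropped:
            continue
        d, c = min((dist(i, c), c) for c in centers)
        if d <= radius_km:
            label[i] = c

    # Track, for every unassigned point, its nearest assigned point.
    unassigned = {i for i in range(n) if i not in dropped and i not in label}
    nearest = {u: min((dist(u, a), a) for a in label) for u in unassigned}
    while unassigned:
        u = min(unassigned, key=lambda i: nearest[i][0])  # step 1
        label[u] = label[nearest[u][1]]                   # step 2
        unassigned.remove(u)
        for v in unassigned:                              # step 3
            d = dist(v, u)
            if d < nearest[v][0]:
                nearest[v] = (d, u)
    return label

xs = [0.0, 0.3, 1.0, 1.6, 2.1, 8.2, 9.0, 9.7, 10.0, 50.0]
points = [(x, 0.0) for x in xs]
intensity = [9.0, 5.0, 4.0, 3.0, 2.0, 2.0, 4.0, 6.0, 8.0, 0.1]
labels = assign_clusters(points, intensity, centers=[0, 8])
print(labels)
```

In the toy example the isolated point at 50 km is discarded, and the remaining points chain onto the cluster of their nearest already-assigned neighbor, so clusters grow outward from their centers.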
A.4 Collocation of amenities
To study the collocation patterns of amenities, we calculate the Spearman correlation between all pairs of amenities across clusters. We show the resulting
correlations in the form of a network, where nodes represent amenity types and
edges connect amenities that are highly correlated across micro-clusters. To
construct this network we first create a Maximum Spanning Tree (MST) of the
correlation network, which keeps every amenity type connected, and then add further edges only between amenities that have a pairwise correlation equal to or larger than 0.3.
Here, we show the values of all Spearman correlations between amenities
across clusters in the form of a matrix (Figure A-2). We cluster amenities using
Ward linkages.
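The network construction can be illustrated with a small self-contained sketch. It computes Spearman correlations as Pearson correlations of average ranks, builds the maximum spanning tree with Kruskal's algorithm, and then adds every edge at or above the threshold. The function names and the toy counts are hypothetical, an amenity with identical counts in every cluster would need to be excluded (its correlation is undefined), and the Ward clustering used to order the matrix is not reproduced here.

```python
import math
from itertools import combinations

def ranks(values):
    """Average ranks (1-based), with ties sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for t in range(start, end + 1):
            result[order[t]] = (start + end) / 2 + 1
        start = end + 1
    return result

def spearman(x, y):
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

def amenity_network(counts, threshold=0.3):
    """counts: dict mapping an amenity type to its counts per micro-cluster."""
    corr = {(a, b): spearman(counts[a], counts[b])
            for a, b in combinations(sorted(counts), 2)}
    parent = {a: a for a in counts}

    def find(a):
        while parent[a] != a:
            a = parent[a]
        return a

    edges = set()
    # Kruskal on descending weights yields the maximum spanning tree.
    for (a, b), _ in sorted(corr.items(), key=lambda kv: kv[1], reverse=True):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            edges.add((a, b))
    edges |= {pair for pair, w in corr.items() if w >= threshold}
    return corr, edges

counts = {
    "bar": [1, 2, 3, 4, 5],
    "cafe": [2, 3, 5, 7, 11],
    "gas station": [5, 4, 3, 2, 1],
    "bank": [1, 3, 2, 5, 4],
}
corr, edges = amenity_network(counts)
print(sorted(edges))
```

In this example the gas station is negatively correlated with everything else, so it stays in the network only through the single spanning-tree edge that attaches it.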
Figure A-1: Clustering Algorithm: Boston, SF, NY. The figures on the
right show the effective number of amenities at each location in the cities of a
Boston, b San Francisco, and c New York. Red lines correspond to areas with a
high effective number of amenities and blue lines correspond to areas with a low
effective number of amenities. The black dots represent the locations we assign
as cluster centers. The figures on the left show the corresponding assignment of
amenities to micro-clusters. Each dot represents an amenity, and sets of dots of
the same color constitute a micro-cluster.
Figure A-2: Amenities correlations matrix. Matrix showing the Spearman
correlation between each pair of amenities. Amenities are clustered using Ward
linkages.
A.5 Predictions
We construct four regression models to predict the number of amenities of each type at the intercity and intra-city scales using two different metrics: size and composition. At
the intercity scale, we predict the number of each type of amenity in a city using
the total number of amenities in the city (size) and the composition of amenities in the
city. At the intra-city scale, we predict the number of each type of amenity in a
micro-cluster using the size of the micro-cluster, that is, its total number of amenities, and the composition of amenities
in the micro-cluster. To construct the composition models we use a forward selection algorithm
that iteratively adds types of amenities to a regression until the contribution of
the presence of a new amenity type to the predictive power of the regression is
characterized by a p-value of more than 0.001 (below we explain how we use
AIC and BIC to verify our model selection). Table A.5 shows the 𝑅2 obtained
for each of these models.
Given that these four models use a different number of samples and parameters, we calculate the Akaike Information Criterion (AIC) and Bayesian
Information Criterion (BIC) of each of the models. These criteria allow us to
compare the models: the lower the AIC and BIC values, the more desirable
the model (better fit and less overfitting). The AIC and BIC values obtained for
each model are summarized in Table A.6.
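A forward selection loop of this kind can be sketched with the standard library alone. The version below is an illustration under stated assumptions, not the thesis code: each candidate is scored with a partial F-test for one added variable, and the p-value is approximated with a normal tail (reasonable only for large samples), since the thesis does not say which test was used. The function names and the synthetic data are hypothetical, and collinear predictors are not handled.

```python
import math

def ols_rss(columns, y):
    """Residual sum of squares of y regressed on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations (X'X | X'y), solved by Gauss-Jordan elimination.
    A = [[sum(row[r] * row[c] for row in X) for c in range(p)]
         + [sum(row[r] * yi for row, yi in zip(X, y))]
         for r in range(p)]
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

def forward_select(features, y, alpha=0.001):
    """features: dict mapping a name to a list of values. Returns names in selection order."""
    selected, remaining = [], sorted(features)
    rss_current, n = ols_rss([], y), len(y)
    while remaining:
        trials = {name: ols_rss([features[s] for s in selected] + [features[name]], y)
                  for name in remaining}
        best = min(remaining, key=trials.get)
        rss_new = trials[best]
        dof = n - (len(selected) + 2)  # intercept + selected + candidate
        gain = rss_current - rss_new
        if dof <= 0 or gain <= 0:
            break
        f_stat = gain / max(rss_new / dof, 1e-300)
        p_value = math.erfc(math.sqrt(f_stat / 2.0))  # normal approximation
        if p_value > alpha:
            break
        selected.append(best)
        remaining.remove(best)
        rss_current = rss_new
    return selected

# Synthetic micro-clusters: restaurants depend on cafes and bars, not on banks.
n = 40
features = {
    "cafes": [float(i % 7) for i in range(n)],
    "bars": [float((3 * i) % 5) for i in range(n)],
    "banks": [float((5 * i) % 11) for i in range(n)],
}
noise = [0.1 * ((i % 9) - 4) for i in range(n)]
restaurants = [2.0 * c + b + e
               for c, b, e in zip(features["cafes"], features["bars"], noise)]

selected = forward_select(features, restaurants)
print(selected)
```

The strict 0.001 threshold is what keeps weak predictors out of the model; the AIC and BIC comparison then checks that the selected variables are worth their added complexity.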
                            Intercity Scaling        Intra-City Scaling
Amenity                     Size     Composition     Size     Composition
Accounting                  0.946    0.985           0.291    0.448
Airport                     0.575    0.816           0.016    0.114
Amusement Park              0.382    0.724           0.002    0.005
Aquarium                    0.709    0.880           0.014    0.028
Art Gallery                 0.603    0.930           0.114    0.271
ATM                         0.911    0.967           0.320    0.465
Bakery                      0.777    0.980           0.364    0.543
Bar                         0.649    0.966           0.462    0.750
Beauty Salon                0.952    0.989           0.449    0.615
Bicycle Store               0.594    0.919           0.080    0.183
Book Store                  0.878    0.980           0.245    0.344
Bowling Alley               0.478    0.702           0.004    0.014
Bus Station                 0.242    0.431           0.023    0.237
Cafe                        0.649    0.956           0.505    0.670
Car Dealer                  0.608    0.850           0.003    0.231
Car Rental                  0.831    0.942           0.042    0.118
Car Repair                  0.867    0.976           0.016    0.437
Car Wash                    0.828    0.970           0.005    0.071
Casino                      0.016    0.000           0.002    0.008
Cemetery                    0.126    0.585           0.001    0.015
City Hall                   0.379    0.449           0.031    0.151
Clothing Store              0.884    0.993           0.298    0.718
Construction Contractor     0.824    0.978           0.135    0.456
Convenience Store           0.629    0.928           0.042    0.134
Courthouse                  0.676    0.738           0.088    0.446
Dentist                     0.954    0.974           0.262    0.439
Department Store            0.673    0.945           0.016    0.200
Doctor                      0.957    0.986           0.408    0.694
Electronics Store           0.924    0.966           0.224    0.355
Embassy                     0.102    0.419           0.046    0.114
Finance                     0.953    0.983           0.424    0.610
Fire Station                0.490    0.632           0.018    0.058
Florist                     0.889    0.981           0.207    0.259
Funeral Home                0.476    0.787           0.018    0.146
Furniture Store             0.912    0.980           0.173    0.444
Gas Station                 0.443    0.777           0.000    0.028
Grocery or Supermarket      0.791    0.955           0.116    0.377
Gym                         0.911    0.984           0.229    0.339
Hardware Store              0.896    0.953           0.020    0.194
Home Goods Store            0.908    0.986           0.213    0.517
Hospital                    0.958    0.979           0.096    0.546
Hotel and Lodging           0.795    0.824           0.250    0.435
Insurance Agency            0.825    0.981           0.234    0.433
Jewelry Store               0.902    0.978           0.208    0.352
Laundry                     0.933    0.984           0.180    0.354
Lawyer                      0.871    0.894           0.359    0.570
Library                     0.610    0.937           0.180    0.416
Liquor Store                0.753    0.815           0.175    0.301
Local Government Office     0.901    0.937           0.181    0.567
Locksmith                   0.671    0.752           0.033    0.053
Movie Theater               0.780    0.952           0.125    0.190
Moving Company              0.721    0.931           0.012    0.131
Museum                      0.499    0.951           0.221    0.412
Night Club                  0.735    0.957           0.326    0.606
Park                        0.669    0.745           0.149    0.320
Parking                     0.666    0.938           0.374    0.610
Pet Store                   0.812    0.943           0.077    0.192
Pharmacy                    0.878    0.949           0.169    0.371
Physiotherapist             0.863    0.931           0.081    0.260
Police                      0.681    0.866           0.052    0.201
Post Office                 0.859    0.964           0.090    0.130
Real Estate Agency          0.835    0.952           0.381    0.513
Religious Centers           0.744    0.868           0.171    0.430
Restaurant                  0.921    0.995           0.659    0.826
School                      0.948    0.976           0.251    0.438
Shoe Store                  0.916    0.966           0.153    0.648
Spa                         0.784    0.940           0.182    0.297
Stadium                     0.613    0.749           0.010    0.107
Storage                     0.632    0.912           0.010    0.123
Train Station               0.099    0.414           0.047    0.087
Travel Agency               0.813    0.931           0.292    0.402
University                  0.238    0.351           0.020    0.328
Veterinary Care             0.814    0.966           0.020    0.115
Zoo                         0.343    0.680           0.001    0.011
Table A.5: 𝑅2 of the intercity and intra-city models we construct using metrics
of size and composition of cities (in the case of the intercity) and micro-clusters
(in the case of the intra-city).
Accounting
Airport
Amusement
Park
Aquarium
Art Gallery
Atm
Bakery
Bar
Beauty Salon
Bicycle Store
Book Store
Bowling Alley
Bus Station
Cafe
Car Dealer
Car Rental
Car Repair
Car Wash
Casino
Cemetery
City Hall
Clothing Store
Intercity Scale
Size
AIC
BIC
387.1 389.0
283.6 285.4
252.4 254.3
Comp.
AIC
610.2
534.2
492.8
160.8
404.3
458.2
437.6
508.4
471.4
268.8
278.1
132.9
667.5
443.1
461.3
290.1
521.0
293.8
216.3
341.0
76.2
500.4
395.3
600.4
614.1
479.8
724.1
625.0
283.5
510.0
414.8
735.0
461.9
714.4
443.9
731.4
460.0
443.5
586.3
293.2
592.9
162.6
406.2
460.1
439.5
510.2
473.3
270.6
280.0
134.8
669.3
445.0
463.1
291.9
522.8
295.6
218.2
342.9
78.0
502.2
Scale
BIC
615.7
536.0
496.5
Intra-City
Size
AIC
7233.6
-14564.4
-6923.4
BIC
7240.7
-14557.4
-6916.4
Comp.
AIC
5467.9
-14565.1
-6940.3
BIC
5630.0
-14480.5
-6933.2
399.0
605.9
617.8
483.5
731.5
628.7
285.4
517.4
418.5
736.9
465.6
716.2
449.4
738.8
467.4
443.5
590.0
295.0
600.3
-24141.2
15448.2
14260.7
2507.6
20416.0
19820.0
-16203.0
-7584.9
-28639.3
34335.6
4533.6
10190.0
-3146.7
22908.0
-11654.7
-35421.5
-21285.1
-37437.7
31184.7
-24134.2
15455.3
14267.7
2514.7
20423.0
19827.0
-16196.0
-7577.8
-28632.3
34342.7
4540.6
10197.1
-3139.7
22915.1
-11647.6
-35414.5
-21278.0
-37430.7
31191.8
-23744.2
13780.7
12446.0
-349.5
14125.5
16895.3
-17083.1
-8461.1
-28784.8
34766.9
1134.4
8802.5
-3181.3
18230.6
-11747.7
-35127.2
-21423.3
-38618.9
23911.6
-23709.0
13893.4
12664.5
-208.5
14358.1
17113.8
-16970.3
-8313.1
-28756.6
34936.1
1338.8
8873.0
-3110.8
18371.6
-11663.2
-35113.1
-21402.2
-38562.5
24024.4
Construction
Contractor
Convenience
Store
Courthouse
Dentist
Department
Store
Doctor
Electronics
Store
Embassy
Finance
Fire Station
Florist
Funeral Home
Furniture
Store
Gas Station
Grocery or Supermarket
Gym
Hardware
Store
Home Goods
Store
Hospital
Hotel
and
Lodging
Insurance
Agency
Jewelry Store
Laundry
Lawyer
Library
Liquor Store
Local Government Office
Locksmith
Movie Theater
Moving Company
Museum
Night Club
Park
Parking
591.7
593.6
574.4
578.1
23556.0
23563.1
20067.1
20243.3
455.2
457.1
570.9
572.8
1726.9
1733.9
2246.0
2394.0
160.9
436.7
327.4
162.8
438.6
329.2
404.9
714.8
510.3
406.8
718.5
514.0
-13810.3
19519.7
-6448.4
-13803.3
19526.7
-6441.4
-17901.4
17163.9
-7219.3
-17788.6
17311.9
-7064.3
578.7
386.1
580.5
388.0
607.2
587.0
614.6
590.7
42180.3
3688.0
42187.3
3695.0
36517.8
2189.0
36672.9
2322.9
329.3
440.6
293.9
332.2
336.3
389.5
331.1
442.5
295.8
334.1
338.1
391.4
632.8
615.9
600.6
524.5
568.3
441.7
634.7
623.3
602.4
530.0
570.1
447.2
-2578.3
20231.0
-17404.5
-4705.5
-10028.7
10673.1
-2571.2
20238.1
-17397.5
-4698.5
-10021.7
10680.1
-3268.8
16860.1
-17771.8
-5303.1
-10844.2
7599.0
-3205.4
17036.3
-17722.5
-5197.4
-10703.3
7697.7
333.2
467.7
335.0
469.6
543.6
592.6
545.5
596.3
-13926.6
9280.9
-13919.5
9288.0
-11923.4
6500.8
-11867.0
6677.1
321.2
299.0
323.1
300.9
463.7
516.4
469.2
522.0
-2721.9
-8239.8
-2714.9
-8232.8
-4013.8
-9960.1
-3837.6
-9854.4
469.2
471.0
645.5
651.0
17169.9
17177.0
13170.9
13290.7
310.3
413.6
312.1
415.5
428.4
734.4
433.9
736.3
11386.1
12585.9
11393.2
12592.9
5907.4
10292.7
6027.2
10483.0
496.8
498.7
692.3
697.9
14861.7
14868.7
12397.0
12538.0
353.3
400.6
479.9
343.4
405.4
331.1
355.1
402.4
481.8
345.3
407.2
332.9
444.2
566.7
728.9
429.1
632.6
481.5
447.9
570.4
730.8
432.8
634.4
487.0
14860.2
4144.1
38846.6
-5993.1
-1736.3
12849.6
14867.3
4151.2
38853.6
-5986.1
-1729.2
12856.6
13143.0
2146.9
35662.3
-8949.8
-2355.6
7505.0
13269.9
2316.0
35831.4
-8808.9
-2242.8
7638.9
309.2
223.5
443.6
311.0
225.4
445.5
542.0
352.4
628.4
543.9
356.1
632.1
-14495.9
-16822.2
300.1
-14488.8
-16815.1
307.2
-14640.9
-17422.3
-457.0
-14591.5
-17337.7
-372.4
318.3
377.0
504.8
372.0
320.1
378.9
506.7
373.9
385.0
545.3
683.4
596.1
392.4
550.8
685.3
599.8
-4793.4
6321.7
11027.2
5373.6
-4786.3
6328.7
11034.2
5380.6
-6985.0
1774.4
10194.1
1963.1
-6872.2
1922.5
10363.3
2153.4
Pet Store
Pharmacy
Physiotherapist
Police
Post Office
Real
Estate
Agency
Religious Centers
Restaurant
School
Shoe Store
Spa
Stadium
Storage
Train Station
Travel Agency
University
Veterinary
Care
Zoo
285.8
429.4
353.1
252.8
269.4
515.7
287.6
431.2
355.0
254.6
271.3
517.6
452.0
643.7
667.6
349.2
504.4
668.7
455.7
647.4
673.1
352.9
508.1
672.4
-13001.7
7366.7
791.0
-15255.4
-12492.7
20820.6
-12994.6
7373.7
798.0
-15248.4
-12485.7
20827.7
-14005.4
5035.9
-544.4
-16701.6
-12755.2
19273.1
-13885.6
5176.9
-459.9
-16602.9
-12670.6
19449.4
565.0
566.9
729.1
732.8
24793.2
24800.3
21728.3
21883.4
602.3
488.4
363.4
307.5
225.0
394.6
334.7
398.7
403.4
336.8
604.2
490.3
365.2
309.4
226.8
396.4
336.6
400.6
405.3
338.6
745.4
741.3
607.3
456.4
553.0
582.3
545.1
545.0
557.5
619.2
752.8
745.0
611.0
460.1
554.9
586.0
547.0
548.7
559.3
624.8
32182.1
15330.2
18001.5
-8683.6
-13695.6
-6999.5
-10105.8
3926.0
24047.2
-3348.5
32189.1
15337.2
18008.6
-8676.6
-13688.5
-6992.4
-10098.8
3933.1
24054.3
-3341.5
26651.6
13283.1
11416.1
-9852.6
-13931.8
-7697.7
-10424.3
2523.5
21500.1
-3679.7
26912.4
13445.2
11528.9
-9732.7
-13875.4
-7606.1
-10389.0
2671.5
21627.0
-3538.8
70.8
72.7
144.4
148.1
-43402.7
-43395.6
-42907.2
-42879.0
Table A.6: AIC and BIC values of the intercity and intra-city models we construct
using metrics of size and composition of cities (in the case of the intercity) and
micro-clusters (in the case of the intra-city).
Appendix B
Cities Administrative Units and
Populations
City
Administrative District
Population
Atlanta
Atlanta
447,841
Total City
Population
447,841
Austin
Austin
885,400
Baltimore
Baltimore
Arbutus
Halethorpe
622,104
20,483
N/A
Birmingham
Birmingham
Vestavia Hills
Mountain Brook
Homewood
Bessemer
Fultondale
Gardendale
Tarrant
Center Point
Chalkville
Trussville
212,237
34,018
20,359
25,750
27,053
8,752
13,735
6,285
16,864
3,829
20,368
Boston
Boston
Quincy
Milton
645,966
92,271
27,003
885,400
642,587
389,250
Dedham
Brookline
Somerville
Cambridge
Watertown
Chelsea
Belmont
24,729
58,732
75,754
105,162
31,915
35,177
24,729
Buffalo
Buffalo
258,959
Charlotte
Charlotte
Mint Hill
Matthews
Pineville
792,862
23,341
27,198
7,479
Chicago
Chicago
Lincolnwood
Park Ridge
Rosemont
Schiller Park
Norridge
Harwood Heights
Bensenville
Franklin Park
River Grove
Elmwood Park
Northlake
Stone Park
Melrose Park
River Forest
Oak Park
Maywood
Bellwood
Berkeley
Hillside
Forest Park
Broadview
Westchester
North Riverside
Berwyn
Cicero
La Grange Park
Riverside
Brookfield
Lyons
Stickney
2,718,782
12,590
37,480
4,202
11,793
14,572
8,612
18,352
18,333
10,227
24,883
12,323
4,946
25,411
11,172
51,878
24,090
19,071
5,209
8,193
14,167
7,932
16,718
6,672
56,800
84,103
13,579
8,875
18,978
10,729
6,786
1,121,438
258,959
850,880
Forest View
La Grange
Western Springs
Hinsdale
McCook
Summit
Countryside
Indian Head Park
Hodgkins
Burr Ridge
Palos Park
Palos Heights
Crestwood
Willow Springs
Justice
Bedford Park
Bridgeview
Hickory Hills
Palos Hills
Chicago Ridge
Worth
Hometown
Oak Lawn
Evergreen Park
Alsip
Merrionette Park
Robbins
Blue Island
698
15,550
12,975
16,816
228
11,054
5,895
3,809
1,897
10,559
4,847
12,515
10,950
5,524
12,926
580
16,446
14,049
17,484
14,305
10,789
4,349
56,690
19,852
19,277
1,900
5,337
23,706
Cincinnati
Delhi
Covedale
Mack
Bridgetown North
Dent
Cheviot
Monfort Heights
White Oak
North College Hill
Groesbeck
Finneytown
Amberley
Deer Park
Kenwood
Fairfax Mariemont
296,943
29,510
6,447
11,585
12,569
10,497
8,375
11,948
19,167
9,397
6,788
12,741
3,585
5,736
6,981
1,699
3,618,465
Cincinnati
453,968
Cleveland
Cleveland
Cleveland Heights
University Heights
Shaker Heights
Maple Heights
Garfield Heights
Parma
Brook Park
Brooklyn
Rocky River
Fairview Park
396,815
46,121
13,539
28,448
23,138
28,849
81,601
19,212
11,169
20,213
16,826
685,931
Columbus
Columbus
Westerville
Huber Ridge
Worthington
Dublin
Hilliard
Upper Arlington
Marble Cliff
Grandview Heights
Lincoln Village
Urbancrest
Grove City
Obetz
Groveport
Blacklick Estates
Reynoldsburg
Bexley
Whitehall
Gahanna
New Albany
787,033
36,120
4,883
13,575
41,751
28,435
33,771
573
6,536
9,482
960
36,832
4,628
5,540
9,518
36,347
13,057
18,062
33,248
7,724
Dallas
Dallas
Richardson
Garland
Farmers Branch
Carrollton
Irving
University Park
Highland Park
Grand Prairie
Duncanville
Hutchins
Seagoville
Balch Springs
1,197,816
103,297
226,876
28,616
126,700
228,653
23,068
8,564
175,396
38,524
5,338
14,835
23,728
1,128,075
Mesquite
Sunnyvale
Rowlett
Sachse
Addison
139,824
5,130
56,199
20,329
13,056
Denver
Denver
Glendale
Englewood
Sheridan
Cherry Hills Village
Greenwood Village
Littleton
Lakewood
Edgewater
Wheat Ridge
Arvada
Berkley
Twin Lakes
Westminster
Sherrelwood
Welby
Commerce City
Derby
Thornton
Federal Heights
Northglenn
Aurora
649,495
4,184
30,255
5,664
5,987
13,925
41,737
142,980
5,170
30,166
111,707
11,207
171
106,114
18,287
14,846
45,913
7,685
118,772
11,973
35,789
345,803
Detroit
Detroit
Lincoln Park
Dearborn
Melvindale
Dearborn Heights
Highland Park
Hamtramck
Grosse Pointe Woods
Harper Woods
Grosse Pointe Farms
Grosse Pointe
Grosse Pointe Park
681,090
38,144
95,884
10,525
57,774
11,629
22,423
15,838
13,990
9,316
5,326
11,345
Houston
Houston
Seabrook
Kemah
Webster
2,195,914
11,952
3,334
10,400
2,435,949
1,757,830
973,284
Friendswood
Pearland
Fresno
Sugar Land
Greatwood
Rosenberg
Richmond
Pecan Grove
Mission Bend
Cinco Ranch
Katy
Cypress
Jersey Village
Hunters Creek Village
Bellaire
Spring
Aldine
Tomball
Humble
Porter
Atascocita
Huffman
Crosby
Highlands
Channelview
Jacinto City
Galena Park
Deer Park
La Porte
Pasadena
South Houston
Sheldon
Barrett
Cloverleaf
Four Corners
Meadows Place
Missouri City
Fifth Street
Brookside Village
35,805
108,715
19,069
83,860
6,640
31,676
11,081
15,881
36,501
18,274
14,102
122,803
7,620
4,367
16,855
54,298
15,869
10,753
15,133
25,627
65,844
12,116
2,299
7,522
38,289
10,553
10,887
32,010
33,800
149,043
16,983
1,990
3,199
22,942
2,954
4,660
67,358
2,059
1,523
Indianapolis
Lawrence
Beech Grove
Warren
Franklin Township
Perry Township
843,393
46,001
14,192
1,239
54,594
108,972
3,362,560
Indianapolis
Decatur
Speedway
Wayne
Camby
Pike Township
Washington Township
9,362
11,930
136,828
32,388
77,895
132,049
Jacksonville
Lakeside
Orange Park
Oakleaf Plantation
Bellair-Meadowbrook Terrace
Atlantic Beach
Neptune Beach
Jacksonville Beach
Ponte Vedra Beach
Sawgrass
Palm Valley
Baldwin
Nassau Village-Ratliff
Callahan
821,784
30,943
8,412
20,315
13,343
Las Vegas
North Las Vegas
Whitney
Winchester
Paradise
Henderson
Spring Valley
Summerlin South
Enterprise
Nellis AFB
Sunrise Manor
Blue Diamond
583,736
216,961
38,585
27,978
223,167
257,729
178,395
24,085
108,481
2,187
189,372
290
1,468,843
Jacksonville
12,895
7,124
21,823
37,924
4,942
19,860
1,430
5,337
962
1,007,094
Las Vegas
1,850,966
Los Angeles
Los Angeles
Santa Monica
Marina del Rey
Beverly Hills
Culver City
Inglewood
Burbank
La Crescenta-Montrose
La Canada Flintridge
Glendale
57
3,884,307
89,736
8,866
34,290
38,883
109,673
103,340
19,653
20,246
196,021
Pasadena: 137,122
East Los Angeles: 126,496
South Pasadena: 25,619
San Marino: 13,147
Vernon: 112
Huntington Park: 58,114
Bell: 35,477
Bell Gardens: 42,072
Florence-Graham: 63,387
South Gate: 94,396
Lynwood: 69,772
Compton: 96,455
Willowbrook: 35,983
Long Beach: 462,257
Carson: 91,714
West Carson: 21,699
View Park-Windsor Hills: 11,075
Westmont: 31,853
Lennox: 22,753
Hawthorne: 84,293
Gardena: 58,829
El Segundo: 16,654
Manhattan Beach: 35,135
Redondo Beach: 66,748
Torrance: 147,478
Lomita: 20,256
Rolling Hills: 1,860
Palos Verdes Peninsula: (no population listed)
Rancho Palos Verdes: 41,643
Signal Hill: 11,465
Total: 6,428,879

Louisville
Louisville: 609,893
New Albany: 36,372
Clarksville: 21,724
Jeffersonville: 44,953
Oak Park: 5,379
Buckner: 4,000
Crestwood: 1,999
Mt Washington: 9,117
Hillview: 8,172
Brooks: 2,401
Shepherdsville: 11,222
Shively: 15,157
St Matthews: 15,852
Lyndon: 11,002
Northfield: 970
Rolling Hills: 907
Anchorage: 2,264
Middletown: 7,218
Hurstbourne: 3,884

Memphis
Memphis: 653,450
West Memphis: 26,245
Bartlett: 55,055
Lakeland: 12,430
Germantown: 39,161
Collierville: 46,462
Total: 832,803

Miami
Miami: 419,777
Coral Gables: 49,631
Coral Terrace: 24,380
West Miami: 5,965
Miami Springs: 13,809
Gladeview: 14,468
Hialeah: 224,669
West Little River: 34,699
El Portal: 2,325
Miami Shores: 10,493
Total: 800,216

Milwaukee
Milwaukee: 599,164
Shorewood: 13,162
Whitefish Bay: 14,137
Glendale: 12,872
Brown Deer: 12,088
Bayside: 4,411
Wauwatosa: 47,068
West Allis: 60,732
Greenfield: 37,072
Greendale: 14,325
Hales Corners: 7,746
Total: 822,777

Naples
Naples: 19,537
Vineyards: 3,375
Golden Gate: 23,961
Lely: 3,451
Naples Manor: 5,562
Lely Resort: 4,646
Pelican Bay: 6,346
Naples Park: 5,967
East Naples: 22,951
Total: 95,796

Nashville
Nashville: 626,681
Ashland City: 4,541
Millersville: 7,471
Goodlettsville: 16,813
Hendersonville: 54,068
Mt Juliet: 28,222
Total: 737,796

New Orleans
New Orleans: 378,715
Marrero: 33,141
Harvey: 20,348
Gretna: 17,736
Terrytown: 23,319
Timberlane: 10,243
Arabi: 8,093
Chalmette: 17,119
Meraux: 10,192
Violet: 8,555
St Bernard: 43,482
Total: 570,943

New York
New York: 8,405,837
Total: 8,405,837

Oklahoma
Oklahoma: 610,613
Mustang: 17,395
Yukon: 22,709
Bethany: 19,563
Piedmont: 5,720
Edmond: 81,405
The Village Nichols Hills: 3,710
Moore: 55,081
Del City: 21,332
Midwest City: 54,371
Spencer: 3,746
Jones: 2,517
Choctaw: 15,205
Harrah: 5,095
McLoud: 4,044
Total: 922,506

Orlando
Orlando: 244,483
Clarcona: 2,990
Pine Hills: 60,076
Orlovista: 6,123
Doctor Phillips: 10,981
Williamsburg: 7,646
Hunters Creek: 14,321
Oak Ridge: 22,685
Pine Castle: 10,805
Conway: 13,467
Belle Isle: 5,988
Taft: 2,205
Meadow Woods: 25,558
Azalea Park: 12,556
Winter Park: 29,203
Goldenrod: 12,039
Eatonville: 2,159
Fairview Shores: 10,239
Total: 493,524

Philadelphia
Philadelphia: 1,553,165
Westville: 4,288
Gloucester City: 11,402
Mt Ephraim: 4,676
Bellmawr: 11,540
Barrington: 6,983
Haddonfield: 11,507
Collingswood: 13,850
Camden: 76,903
Cherry Hill: 71,722
Pennsauken Township: 35,830
Maple Shade Township: 19,043
Riverton: 2,779
Cinnaminson: 16,763
Cheltenham: 4,810
Glenside: 8,384
Abington: 55,234
Wyncote: 3,044
Jenkintown: 4,422
Rockledge: 2,550
Flourtown: 4,538
Wyndmoor: 5,498
Plymouth Meeting: 6,177
Darby: 10,687
Total: 1,945,795

Phoenix
Phoenix: 1,445,632
Tolleson: 6,756
Glendale: 226,721
Peoria: 162,592
Sun City: 37,499
Tempe: 161,719
Guadalupe: 6,072
Total: 2,046,991

Pittsburgh
Pittsburgh: 305,841
Homestead: 3,165
Whitaker: 1,271
Munhall: 11,380
Brentwood: 9,643
Whitehall: 13,938
Castle Shannon: 8,316
Mt Oliver: 3,403
Dormont: 8,593
Scott Township: 17,024
Green Tree: 4,431
Carnegie: 7,972
Ingram: 3,330
Crafton: 5,951
Rosslyn Farms: 427
McKees Rocks: 6,104
Stowe Township: 6,362
Avalon: 4,705
Bellevue: 8,370
Reserve Township: 3,333
Millvale: 3,744
Sharpsburg: 3,446
Aspinwall: 2,801
Wilkinsburg: 15,930
Edgewood: 3,118
Rankin: 2,122
Braddock: 2,159
Total: 466,879

Portland
Portland: 609,456
Total: 609,456

Providence
Providence: 177,994
North Providence: 32,078
Cranston: 80,387
Total: 290,459

Raleigh
Raleigh: 431,746
Cary: 151,088
Total: 582,834

Richmond
Richmond: 214,114
Bon Air: 16,366
Bensley: 5,819
East Highland Park: 14,796
Lakeside: 11,849
Total: 262,944

Sacramento
Sacramento: 475,122
Rio Linda: 15,106
North Highlands: 42,694
Arden-Arcade: 92,186
La Riviera: 10,802
Rosemont: 22,681
Parkway-South Sacramento: 36,468
Florin: 47,513
Vineyard: 24,836
Total: 767,408

Salt Lake
Salt Lake: 186,440
South Salt Lake: 24,366
Total: 210,806

San Antonio
San Antonio: 1,409,019
Somerset: 1,550
Macdona: 559
Helotes: 7,341
Leon Valley: 10,151
Terrell Hills: 4,878
Castle Hills: 4,116
Kirby: 8,673
Shavano Park: 3,035
Windcrest: 5,364
Converse: 18,198
Live Oak: 9,156
Universal City: N/A
Adkins: N/A
Cibolo: 19,580
Northcliff: 1,819
Garden Ridge: 1,882
Fair Oaks Ranch: 5,986
Total: 1,511,307

San Diego
San Diego: 1,345,895
Chula Vista: 243,916
National City: 58,582
Bonita: 12,538
La Presa: 34,126
Coronado: 24,697
Spring Valley: 28,205
La Mesa: 57,065
Rancho San Diego: 21,208
El Cajon: 99,478
Santee: 53,413
Granite Hills: 3,035
Winter Gardens: 20,631
Lakeside: 20,648
Poway: 47,811
Fairbanks Ranch: 3,148
Rancho Santa Fe: 3,117
Encinitas: 59,518
Solana Beach: 12,867
Del Mar: 4,161
Escondido: 143,911
Total: 2,297,970

San Francisco
San Francisco: 837,442
Total: 837,442

San Jose
San Jose: 1,000,536
Sunnyvale: 140,081
Santa Clara: 116,468
Fruitdale: 935
Campbell: 39,349
Saratoga: 29,926
Los Gatos: 29,413
Morgan Hill: 37,882
East Foothills: 8,269
Milpitas: 70,092
Total: 1,472,951

Seattle
Seattle: 608,660
White Center: 13,495
Total: 622,155

St Louis
St Louis: 319,294
Castle Point: 3,962
Bellefontaine Neighbors: 10,828
Jennings: 14,712
Normandy: 5,008
Northwoods: 4,208
Pine Lawn: 3,261
Total: 361,273

Tampa
Tampa: 347,645
Town ’N’ Country: 78,442
Egypt Lake-Leto: 35,282
Greater Carrollwood: N/A
Lake Magdalene: 28,509
Cheval: 10,702
Greater Northdale: 22,079
Lutz: 19,344
Thonotosassa: 13,014
Temple Terrace: 24,541
Del Rio: N/A
Mango: 11,313
Seffner: 7,579
Brandon: 103,483
Palm River-Clair Mel: 21,024
Progress Village: 5,392
Gibsonton: 14,234
Total: 742,583

Virginia Beach
Virginia Beach: 448,479
Greenbrier East: N/A
Total: 448,479

Washington
Washington: 658,893
Bethesda: 63,374
Silver Spring: 76,716
Friendship Village: 4,512
Takoma Park: 16,715
Hyattsville: 17,865
Coral Hills: 9,895
Suitland-Silver Hill: 33,515
Hillcrest Heights: 16,469
Marlow Heights: 5,618
Temple Hills: 7,852
Alexandria: 148,892
Arlington: 207,627
Total: 1,267,943

Table B.1: Administrative units that overlap with our amenities data and their
respective populations, taken from Wikipedia.
Bibliography
[1] Bettencourt, Luís MA, et al. "Growth, innovation, scaling, and the pace of
life in cities." Proceedings of the National Academy of Sciences 104.17 (2007):
7301-7306.
[2] Bettencourt, Luis, and Geoffrey West. "A unified theory of urban living."
Nature 467.7318 (2010): 912-913.
[3] Ortman, S. G., et al. "Settlement Scaling and Increasing Returns in an Ancient Society." (2014).
[4] Jacobs, Jane. The death and life of great American cities. Vintage (New
York), 1961.
[5] Jacobs, Jane. Cities and the Wealth of Nations. Harmondsworth, UK: Penguin, 1986.
[6] Alexander, Christopher, S. Ishikawa, and M. Silverstein. "Pattern languages."
Center for Environmental Structure 2 (1977).
[7] Mumford, Lewis. "The city in history: its origins, its transformations, and its
prospects." (1961).
[8] Glaeser, Edward. "Triumph of the City: How Our Greatest Invention Makes Us Richer, Smarter, Greener, Healthier, and Happier." (2011).
[9] B. Thau (2014). "How big data helps chains like Starbucks pick store locations". Retrieved from:
http://www.forbes.com/sites/barbarathau/2014/04/24/how-big-datahelps-retailers-like-starbucks-pick-store-locations-an-unsung-key-to-retailsuccess/2/
[10] Giuliano, Genevieve, and Kenneth A. Small. "Subcenters in the Los Angeles
region." Regional science and urban economics 21.2 (1991): 163-182.
[11] Cladera, Josep Roca, Carlos R. Marmolejo Duarte, and Montserrat Moix.
"Urban structure and polycentrism: towards a redefinition of the sub-centre
concept." Urban Studies 46.13 (2009): 2841-2868.
[12] Hollenstein, Livia, and Ross Purves. "Exploring place through user-generated content: Using Flickr tags to describe city cores." Journal of Spatial
Information Science 1 (2015): 21-48.
[13] Toole, Jameson L., et al. "Inferring land use from mobile phone activity."
Proceedings of the ACM SIGKDD international workshop on urban computing. ACM, 2012.
[14] Pei, Tao, et al. "A new insight into land use classification based on aggregated mobile phone data." International Journal of Geographical Information
Science 28.9 (2014): 1988-2007.
[15] Krueger, Samuel Glendening. "Delimiting the Postmodern Urban Center:
An analysis of urban amenity clusters in Los Angeles." University of Southern
California, 2012.
[16] Glaeser, Edward L., and Joshua D. Gottlieb. "The wealth of cities: Agglomeration economies and spatial equilibrium in the United States." No. w14806.
National Bureau of Economic Research, 2009.
[17] Fujita, Masahisa, and Paul Krugman. "The new economic geography: Past,
present and the future." Papers in regional science 83.1 (2004): 139-164.
[18] Hidalgo, César A., and Ricardo Hausmann. "The building blocks of economic complexity." Proceedings of the National Academy of Sciences 106.26
(2009): 10570-10575.
[19] Bettencourt, Luís MA, et al. "Urban scaling and its deviations: Revealing
the structure of wealth, innovation and crime across cities." PloS one 5.11
(2010): e13541.
[20] Gomez-Lievano, Andres, HyeJin Youn, and Luis MA Bettencourt. "The
statistics of urban scaling and their connection to Zipf's law." PLoS One 7.7
(2012): e40393.
[21] Bettencourt, Luís MA. "The origins of scaling in cities." Science 340.6139
(2013): 1438-1441.
[22] Hidalgo, César A., et al. "The product space conditions the development of
nations." Science 317.5837 (2007): 482-487.
[23] Maes, Pattie. "Agents that reduce work and information overload." Communications of the ACM 37.7 (1994): 30-40.
[24] Resnick, Paul, and Hal R. Varian. "Recommender systems." Communications of the ACM 40.3 (1997): 56-58.
[25] Salesses, Philip, Katja Schechtner, and César A. Hidalgo. "The collaborative
image of the city: mapping the inequality of urban perception." PloS one 8.7
(2013): e68400.
[26] Naik, Nikhil, et al. "Streetscore–Predicting the Perceived Safety of One
Million Streetscapes." Computer Vision and Pattern Recognition Workshops
(CVPRW), 2014 IEEE Conference on. IEEE, 2014.
[27] Gonzalez, Marta C., Cesar A. Hidalgo, and Albert-Laszlo Barabasi. "Understanding individual human mobility patterns." Nature 453.7196 (2008):
779-782.