Kiley_GIS_Final

Tom Kiley
Intro to GIS
Final Paper
December 14, 2012
PROJECT DESCRIPTION
The goal of this project was to visualize population change in Springfield, Massachusetts, and
surrounding communities over the last 60 years. Springfield’s peak decennial Census population (in
1960) was approximately 174,000. Since then, the city has lost more than 21,000 inhabitants; its
population in 2010 was approximately 153,000. I wanted to see if population was simply shifting
outward from Springfield and, if so, how densely it was settled in the surrounding towns.
There are different methods for mapping population density that are intended to overcome limitations
associated with simply dividing population by the area of the corresponding city or town. (The
limitations of this approach become apparent when you look at the city of Chicopee, Springfield’s
northern neighbor, which is relatively densely settled on the west side but has an airport on the east
side and, therefore, few inhabitants.) One method is dasymetric mapping, which ESRI describes this
way:
A technique in which attribute data that is organized by a large or arbitrary area unit is more
accurately distributed within that unit by the overlay of geographic boundaries that exclude,
restrict, or confine the attribute in question. For example, a population attribute organized by
census tract might be more accurately distributed by the overlay of water bodies, vacant land,
and other land-use boundaries within which it is reasonable to infer that people do not live.[1]
I had briefly considered attempting a dasymetric analysis, but such an analysis is not necessary if
population data are available at a fine-grained geographic level, such as Census blocks. Census block
data are available from both MassGIS and NHGIS for 1990, 2000, and 2010, so as my thinking evolved, I
decided to limit my analysis to these years using blocks, along with land use data from MassGIS to show
residential development by lot size.
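A minimal sketch of the dasymetric idea quoted above, under stated assumptions: the zone names, acreages, and population below are hypothetical, and the rule used (spread the population evenly over habitable land only) is the simplest possible variant rather than any specific published method.

```python
def simple_density(population, zones):
    """Persons per acre over the whole unit, habitable or not."""
    total_acres = sum(acres for _, acres, _ in zones)
    return population / total_acres


def dasymetric_density(population, zones):
    """Persons per acre over habitable land only.

    zones is a list of (name, acres, habitable) tuples.
    """
    habitable_acres = sum(acres for _, acres, habitable in zones if habitable)
    if habitable_acres == 0:
        raise ValueError("no habitable land to assign population to")
    return population / habitable_acres


# Hypothetical town: a settled west side and an airport on the east side.
town_zones = [
    ("settled west side", 6000, True),
    ("airport", 4000, False),
]
town_population = 50000  # illustrative value only

whole_town = simple_density(town_population, town_zones)
habitable_only = dasymetric_density(town_population, town_zones)
print(round(whole_town, 2), round(habitable_only, 2))
```

Excluding the uninhabited acreage raises the computed density, which is the effect the Chicopee example describes.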
DATA SOURCES
To begin, I downloaded selected population and housing (100%) data for 1990, 2000, and 2010 from
NHGIS at the Census block level, selecting Massachusetts as my geographic extent. The screenshot on
the next page shows my selections. This request produces three CSV files, one for each year, along
with metadata in a text document (.txt). The first column of each CSV file is the join column, labeled
GISJOIN. I later downloaded Census block boundary files from NHGIS and joined them to the NHGIS
tables in ArcMap.
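The join itself was done in ArcMap, but the logic of a key-based join can be sketched in a few lines. This is an illustration only: the key values (BLOCK_A and so on) and the POP column name are hypothetical placeholders, not real NHGIS identifiers or field names. Reporting the keys that fail to match is one way to check whether a join has silently dropped records.

```python
import csv
import io


def join_on_gisjoin(table_rows, boundary_keys):
    """Match table rows to boundary records by GISJOIN.

    Returns the matched rows plus the keys that failed to match
    on either side of the join.
    """
    table = {row["GISJOIN"]: row for row in table_rows}
    boundary_set = set(boundary_keys)
    matched = {key: table[key] for key in boundary_keys if key in table}
    missing_in_table = [key for key in boundary_keys if key not in table]
    unused_rows = [key for key in table if key not in boundary_set]
    return matched, missing_in_table, unused_rows


# Hypothetical stand-in for one of the CSV files.
sample_csv = io.StringIO("GISJOIN,POP\nBLOCK_A,120\nBLOCK_B,0\nBLOCK_C,45\n")
rows = list(csv.DictReader(sample_csv))

# Hypothetical stand-in for the keys found in the boundary file.
boundary_keys = ["BLOCK_A", "BLOCK_B", "BLOCK_D"]

matched, missing_in_table, unused_rows = join_on_gisjoin(rows, boundary_keys)
print(sorted(matched), missing_in_table, unused_rows)
```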
[1] http://support.esri.com/en/knowledgebase/GISDictionary/term/dasymetric%20mapping
I later discovered that MassGIS has already produced shapefiles for Census blocks from 1990,[2] 2000,[3]
and 2010,[4] each with an attribute table that already includes population. Because these shapefiles took up
less space, were easier to work with, and were already projected in the Massachusetts state plane
system, I ultimately used them instead of the NHGIS data. For a user working on an analysis of another
state, however, the NHGIS data are extremely useful.
After trying various options, I thought a dot density map offered the best representation of the way
population is distributed in Springfield and surrounding communities. I ultimately set each dot equal to
20 people.
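The arithmetic behind a dot density map can be sketched as follows. ArcMap handles this internally, so two choices here are assumptions made for illustration: partial dots are rounded down, and dots are scattered inside a rectangular bounding box rather than inside the true block polygon. The block population and coordinates are hypothetical.

```python
import random

PEOPLE_PER_DOT = 20  # the dot value chosen for the map


def dot_count(population, people_per_dot=PEOPLE_PER_DOT):
    """Number of whole dots for a block (assumption: round down)."""
    return population // people_per_dot


def scatter_dots(population, bbox, rng):
    """Place dots at random positions inside a bounding box.

    bbox is (xmin, ymin, xmax, ymax).
    """
    xmin, ymin, xmax, ymax = bbox
    return [
        (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        for _ in range(dot_count(population))
    ]


rng = random.Random(0)            # fixed seed so the layout is repeatable
block_bbox = (0.0, 0.0, 100.0, 50.0)  # hypothetical block extent
dots = scatter_dots(153, block_bbox, rng)
print(len(dots))
```

A block of 153 people yields 7 dots under the round-down assumption; the remaining 13 people are not drawn, which is one reason the dot value matters in sparsely settled blocks.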
For land use, MassGIS has a datalayer (LANDUSE_POLY) that classifies land uses into either 21 or 37
categories based on aerial photography.[5] Because the 21-category classification system was available for
both 1971 (the earliest year of data) and 1999 (the latest year), I relied exclusively on that. Of those 21
categories, four are residential:
10  Multi-family
11  Smaller than 1/4 acre lots
12  1/4 – 1/2 acre lots
13  Larger than 1/2 acre lots
I began by creating a new field in the LANDUSE_POLY layer, labeled “Change.” Any polygon whose land
use in 1999 had changed since 1971 was assigned a -1; no change was assigned a 0. (The syntax was
simply CHANGE=“LU21_1999” <> “LU21_1971”).
I then selected all polygons assigned a -1 and created a clipped datalayer. From this clipped datalayer, I
then displayed only residential land uses (10-13). The result was a map showing new residential
development between 1971 and 1999 by lot size.
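The change flag and the residential filter described above can be sketched outside ArcMap as follows. The field names LU21_1971 and LU21_1999 and the residential codes 10-13 come from the text; the polygon records, their ids, and the use of code 3 as a stand-in for "some non-residential category" are hypothetical.

```python
RESIDENTIAL_CODES = {
    10: "Multi-family",
    11: "Smaller than 1/4 acre lots",
    12: "1/4 - 1/2 acre lots",
    13: "Larger than 1/2 acre lots",
}


def flag_change(polygon):
    """Return -1 if land use changed between 1971 and 1999, else 0."""
    return -1 if polygon["LU21_1999"] != polygon["LU21_1971"] else 0


def new_residential(polygons):
    """Keep polygons that changed and were residential in 1999."""
    return [
        p for p in polygons
        if flag_change(p) == -1 and p["LU21_1999"] in RESIDENTIAL_CODES
    ]


# Hypothetical records; code 3 stands in for any non-residential use.
polygons = [
    {"id": 1, "LU21_1971": 3, "LU21_1999": 13},   # became large-lot residential
    {"id": 2, "LU21_1971": 11, "LU21_1999": 11},  # unchanged
    {"id": 3, "LU21_1971": 13, "LU21_1999": 3},   # changed, not residential in 1999
    {"id": 4, "LU21_1971": 3, "LU21_1999": 10},   # became multi-family
]

for p in new_residential(polygons):
    print(p["id"], RESIDENTIAL_CODES[p["LU21_1999"]])
```

Note that a polygon that was already residential in 1971 but moved to a different residential category by 1999 would also pass this filter, so "new residential development" here includes changes between lot-size classes.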
For the base map, I used the MassGIS datalayer TOWNS_POLY, which I clipped to include only
Springfield and the eight cities and towns that are contiguous with it (Agawam, Chicopee, East
Longmeadow, Hampden, Longmeadow, Ludlow, West Springfield, and Wilbraham). I also used MassGIS
datalayers for hydrography and roads. I used the clipped TOWNS_POLY to clip all of my other datalayers
to just my study area.
Finally, I felt it was critically important to show population change from 1960 to 2010, not just from
1990 to 2010. That is because Springfield’s most dramatic population loss occurred in those first 20
years, and moderated thereafter. NHGIS provides Census tract boundary files back to 1960, and Social
Explorer provides historical population data by Census tract. I had serious concerns about these data,
however, which I will discuss later.
[2] 1990 metadata: http://www.mass.gov/anf/research-and-tech/it-serv-and-support/application-serv/office-of-geographic-information-massgis/datalayers/census1990.html
[3] 2000 metadata: http://www.mass.gov/anf/research-and-tech/it-serv-and-support/application-serv/office-of-geographic-information-massgis/datalayers/census2000.html
[4] 2010 metadata: http://www.mass.gov/anf/research-and-tech/it-serv-and-support/application-serv/office-of-geographic-information-massgis/datalayers/census2010.html
[5] Land use metadata are available from http://www.mass.gov/anf/research-and-tech/it-serv-and-support/application-serv/office-of-geographic-information-massgis/datalayers/lus.html.
As a result, I simply downloaded decennial Census reports (in PDF format) from the Census website and
then created an Excel spreadsheet where I entered the population counts for Springfield and the eight
surrounding cities and towns for 1960, 1970, and 2010 (1980, 1990, and 2000 population counts are
already included in TOWNS_POLY). Ultimately, I only needed 1960 and 2010, since I only looked at
change over that entire time period. Either way, however, it was a small file to create manually. I then
joined my Excel file to my clipped towns datalayer and created a new field, Acres, in order to show
population density by town (persons per acre) and population change by town between 1960 and 2010. I
then laid my dot density map (described above) over the population change map. The result, I hope, is a
clear representation of the fact that the fastest growth since 1960 has occurred in the lowest-density
places.
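The two town-level measures can be sketched as simple arithmetic. The Springfield populations below are the approximate figures quoted in the project description (174,000 in 1960 and 153,000 in 2010); the acreage is a hypothetical placeholder, not a value taken from TOWNS_POLY.

```python
def persons_per_acre(population, acres):
    """Population density for a town."""
    return population / acres


def population_change(pop_start, pop_end):
    """Absolute and percent change between two Census counts."""
    absolute = pop_end - pop_start
    percent = absolute / pop_start * 100
    return absolute, percent


springfield_1960 = 174000  # approximate, from the project description
springfield_2010 = 153000  # approximate, from the project description
springfield_acres = 21000  # hypothetical placeholder for the Acres field

absolute, percent = population_change(springfield_1960, springfield_2010)
density_2010 = persons_per_acre(springfield_2010, springfield_acres)
print(absolute, round(percent, 1), round(density_2010, 2))
```

With these rounded inputs the change works out to a loss of 21,000 people, or roughly 12 percent of the 1960 population.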
DIFFICULTIES ENCOUNTERED
I had significant concerns about data during this assignment, but the truth is that I do not know if there
were issues with data quality or issues with my manipulation of the data (or both), particularly when
joining the data with shapefiles. My concerns are:


• As I’ve mentioned, both NHGIS and MassGIS make Census block data available for 1990, 2000,
and 2010, but the data appear to me to differ substantially depending on which source you use. I
don’t know how to explain this. I noticed that the NHGIS data are projected in a different system
(Albers Conical) than the MassGIS data, but while I understand that this could affect how the
data appear on a map, I don’t see why it would change the values that appear in an attribute
table. After I downloaded the NHGIS data, I deleted all the data in all three CSV files that were
not from Hampden County (where Springfield is located), in order to make the files more manageable. But I
also don’t see why this should have affected the data for Hampden County. I don’t have an
explanation for the discrepancy, but as I’ve said, I ultimately chose to use the MassGIS files for
my analysis. Another benefit of doing this is that I trust MassGIS far more than I trust myself!
• I also had concerns with the data that resulted when I joined NHGIS Census tract boundary files
with Social Explorer population data from 1960 and 1980. When I did this, a number of tracts in
the most densely settled part of Springfield appeared as having no population whatsoever –
zero people – in both years, but 1980 was worse. I think it’s simply because the fields were
missing the values. In the map on the next page of 1980 Census tracts, the lightest shade
indicates zero population – obviously not possible.
NOTE ABOUT PROJECTION
Because all of the data I used ultimately came from MassGIS (with the exception of data I created myself
in Excel and then joined to MassGIS shapefiles), all of the data were projected in NAD 1983 State Plane
Massachusetts Mainland FIPS.
CONCLUDING THOUGHTS
My biggest frustration was that Census block-level data were not available before 1990, but the most
dramatic change occurred between 1960 and 1980. I could have remedied this somewhat with Census
tract data, but was unable in my timeframe to overcome serious concerns about the data I was able to
find from NHGIS and Social Explorer. With more time, I would investigate the tract data issues and also
explore applications based on some of the readings I summarize below.
Nevertheless, I think the land use maps and dot density map make a persuasive case that Springfield has
suffered powerfully from suburbanization, and that virtually all residential growth over the last several
generations has occurred in the low-density areas surrounding Springfield. This has serious implications.
As Peter Calthorpe and others have pointed out, carbon emissions associated with households in low-density, single-use districts are much higher than carbon emissions associated with households in high-density, mixed-use districts. If we are going to satisfactorily address climate change, then we must alter
the residential land use patterns that have predominated all over the country in the last 60 to 75 years.
ANNOTATED CITATIONS
1. Eidlin, Eric. “What Density Doesn’t Tell Us About Sprawl.” Access, Number 37, fall 2010, pp. 2-9.
2. Ewing, Reid, Rolf Pendall, and Don Chen. “Measuring Sprawl and its Impact, Volume 1.” Smart Growth
America report, date unknown. Available for download from
http://www.smartgrowthamerica.org/documents/MeasuringSprawlTechnical.pdf.
I made a decision fairly early on not to do a sprawl analysis. A sprawl analysis should go beyond
density to also look at “neighborhood mix of homes, jobs, and services;” “strength of centers,
such as business districts,” and “accessibility via the street network,” according to Ewing, Pendall,
and Chen. I think a full sprawl analysis would be an important way to extend my analysis of
population density and distribution.
In his paper, however, Eric Eidlin does describe methods for dealing with limitations in assessing
population density. These methods include measuring variation in the distribution of population
of a city or town using a Gini coefficient (a measure from zero to one, where one indicates that
100% of the population resides in a single Census tract). Standard measures of density indicate
that the Los Angeles area is denser than the New York City area, because the LA suburbs
are denser than the New York suburbs. But the Gini coefficient for LA is 0.65, while it’s 0.77 for
New York, reflecting the high density of New York City itself. Given my small analysis area,
however, and the availability of block-level data, calculating Gini coefficients is not necessary
here.
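For reference, a Gini coefficient of the kind described above can be sketched as follows. This uses the standard rank-based formula applied to raw tract populations; Eidlin's exact procedure (for example, whether tracts are weighted by area or density) is not reproduced here and may differ. The tract populations are hypothetical.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly even)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one tract with population")
    # Weight each value by its rank in ascending order.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


even_city = [10, 10, 10, 10]       # hypothetical: population spread evenly
concentrated_city = [0, 0, 0, 100]  # hypothetical: everyone in one tract

print(gini(even_city), gini(concentrated_city))
```

With this formula a finite set of n tracts tops out at (n - 1)/n when everyone lives in a single tract, so the value approaches one only as the number of tracts grows.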
3. Holt, James, C.P. Lo, and Thomas Hodler. “Dasymetric estimation of population density and areal
interpolation of census data.” Cartography and Geographic Information Science, Vol. 31, No. 2, 2004, pp.
103-121.
Using the Atlanta metropolitan area as an example, this analysis demonstrates how dasymetric
estimation of population density can be used to overcome the limitations of choroplethic
estimation – namely, the Modifiable Areal Unit Problem (MAUP) that we discussed in class.
4. Wu, Shuo-sheng, Xiaomin Qiu, and Le Wang. “Population Estimation Methods in GIS and Remote
Sensing: A Review.” GIScience and Remote Sensing, Vol. 42, No. 1, 2005, pp. 58-74.
This paper assesses two broad sets of methods for mapping population. In the first set, areal
interpolation, Census data are available. Here, “source zone” data (where the source zone might
be Census tracts, for example) are disaggregated to “finer-scale” raster data and then aggregated
for “target zones” – that is, the spatial unit the researcher wishes to analyze. I think one of my
biggest disappointments was that I ran out of time before I was able to figure out how to
successfully rasterize my data. This is something I’d like to return to.
The second set of methods the authors discuss comprises statistical methods used to model
population when Census data are not available (say, in years between decennial Censuses). I’m
not sure this would have had applications for my particular analysis.
5. Wu, Shuo-sheng, Le Wang, and Xiaomin Qiu. “Incorporating GIS Building Data and Census Housing
Statistics for Sub-Block-Level Population Estimation.” The Professional Geographer, Vol. 60, No. 1, 2008,
pp. 121-135.
This article notes that the sizes of Census blocks – “the most fine-grained census population
data available to the public” – can vary greatly. This is absolutely true; in my analysis, in low-density areas Census blocks were sometimes not much smaller than Census tracts. The authors
describe a model for using GIS building data and Census housing data to estimate populations at
sub-block levels. Their formula looks like this:
Population = BuildingVolume / HousingUnitSpace * OccupancyRate * HouseholdSize.
I would be interested in gaining better familiarity with this approach and applying it in future
analyses.
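As a worked illustration of the formula as summarized above, the sketch below evaluates it for a single building. All input values are hypothetical, and the units (cubic meters of building volume, cubic meters per housing unit) are an assumption made only so the arithmetic is concrete; the article itself should be consulted for how each term is actually measured.

```python
def estimate_population(building_volume, housing_unit_space,
                        occupancy_rate, household_size):
    """Population = BuildingVolume / HousingUnitSpace * OccupancyRate * HouseholdSize."""
    if housing_unit_space <= 0:
        raise ValueError("housing unit space must be positive")
    housing_units = building_volume / housing_unit_space
    occupied_units = housing_units * occupancy_rate
    return occupied_units * household_size


# Hypothetical apartment building.
estimate = estimate_population(
    building_volume=30000,   # assumed units: cubic meters
    housing_unit_space=300,  # assumed units: cubic meters per housing unit
    occupancy_rate=0.9,      # share of units occupied
    household_size=2.5,      # persons per occupied unit
)
print(round(estimate, 1))
```

Here 30,000 / 300 gives 100 housing units, 90 of which are occupied, for an estimate of about 225 people.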