Methodological Issues in GIS-Based Environmental Justice
Department of Geography, The Pennsylvania State University,
302 Walker Building, University Park, PA 16802
Abstract. Research in environmental justice investigates whether certain
unempowered segments of the population, typically minorities and/or the poor,
bear a disproportionate burden of environmental risk. Geographic information
systems (GIS) have been used to carry out ‘conventional’ statistical approaches
to environmental justice research whereby the socioeconomic character of
communities that host environmentally hazardous facilities is compared to that of non-host
communities. However, methodological issues associated with the
conventional approach, such as scale of analysis, continue to make GIS-based
statistical assessments of evidence of environmental injustice problematic. GIS
has the potential to mitigate many of these methodological problems through
mapping/visualization, improved modeling of environmental risk, multi-scale
analysis, and raster surface-based representations of population. The case
study presented in this paper, concerning environmental injustice in the
Philadelphia, Pennsylvania region, demonstrates how raster GIS can facilitate
the investigation of the relationship between the distribution of demographic
character and the location of hazardous facilities across a variety of scales of
analysis. This study finds that in the Philadelphia region there is a clear and
predictable relationship between socioeconomic status and proximity to
environmentally hazardous facilities that can be interpreted as evidence of
environmental injustice. By using GIS to create improved representations of
population character and environmental risk, environmental justice research can
move beyond the simple statistical comparison of groups of areal units to the
exploration of demographic patterns and their spatial relationship with the
distribution of environmental risk.
Research in environmental justice investigates whether certain
unempowered segments of the population, typically minorities and/or the poor, bear a
disproportionate burden of environmental risk. The recent attention on environmental justice
can be traced to the release of studies by the U.S. General Accounting Office (GAO, 1983)
and the United Church of Christ’s Commission for Racial Justice (CRJ, 1987) which reported
evidence of racially-based discrimination in the locational distribution of environmentally
hazardous sites, such as waste treatment, storage, and disposal facilities. Subsequently, the
U.S. Environmental Protection Agency (EPA) has recognized the need to mitigate racial and
economic discrimination in the siting of environmentally hazardous facilities (EPA, 1992).
Since these two ‘early’ studies, much statistically-based research has investigated the
issue of environmental justice at local, regional, and national scales. Some of these studies
have challenged the claims of the landmark CRJ (1987) study as inaccurate and misleading
due to the choice of data and methodology, the Anderton et al. (1994) article being perhaps
the most prominent. However, all spatial statistical analyses of environmental justice
necessarily make some assumptions concerning the data and methodology used in the
study. These data and methodology issues broadly concern two representational themes: 1)
the definition and measurement of environmental risk and 2) the definition and spatial
delineation of ‘community.’
Many studies, including those on both ‘sides’ of the environmental justice debate,
have used geographic information systems (GIS) to manage and structure environmental
justice analyses. The benefits of using GIS for environmental justice research are relatively
straightforward: Environmental justice is an inherently spatial (and temporal) issue (i.e. what
is the spatial relationship between the distribution of people and environmental risk) and GIS
provides an efficient environment for the management, analysis, and display of spatial
environmental justice data. However, GIS software and GIS data also adhere to particular
models of the real world that impose representational and methodological constraints and
assumptions on the way environmental justice is understood and therefore analyzed. Many
of these methodological issues lie at the foundation of the dispute over the interpretation of
statistical evidence of environmental injustice.
Unfortunately, the methodological choices made by environmental justice researchers
often go unacknowledged in the interpretation of evidence of environmental injustice. The
purpose of this paper is to describe the methodological issues associated with using GIS in
environmental justice research so that these issues may be brought to the forefront of the
environmental justice debate. While there are no methodological ‘solutions’ that would
create a completely accurate and objective assessment of environmental justice, there is
certainly value in incorporating the impacts of the methodological assumptions and
constraints into the interpretation of study results. The remainder of this paper reviews
GIS-based environmental justice research, highlights primary methodological issues, and
proposes a novel environmental justice GIS analysis method that is applied to the
Philadelphia, Pennsylvania region as a case study.
GIS and the ‘conventional’ approach to environmental justice research
In nearly all statistically oriented environmental justice studies, justice is defined
according to whether the environmentally hazardous facilities in a particular region are
spatially distributed in a socioeconomically equitable versus inequitable manner. This
environmental equity approach to measuring environmental justice generally entails
identifying those communities that host environmentally hazardous facilities (however
‘community’ may be defined), tallying the racial and economic character of those host
communities, and comparing that socioeconomic character to those communities in the
region that do not host environmentally hazardous facilities (or to the character of the region
at large). Evidence of injustice is then said to exist when communities that host
environmentally hazardous facilities have significantly higher rates of minority and/or poor
persons than non-host communities.
This type of analysis is easily implemented in a GIS using U.S. Bureau of the Census
demographic and boundary data, hazardous facility data derived from publicly available U.S.
Environmental Protection Agency (EPA) databases, and basic statistical functions found in
most commercial GIS packages. For example, Glickman et al. (1995) use GIS to examine
evidence of environmental injustice in Allegheny county, Pennsylvania, which includes the
city of Pittsburgh. These authors investigate the spatial relationship between statistically
derived socioeconomic status and proximity to Toxic Release Inventory (TRI) facilities listed
in the EPA TRI database as the indicator of environmental injustice. The TRI database is
composed of manufacturers that are required by law to report to the EPA certain toxic
chemicals that they release to the environment. While the TRI database is certainly not a
comprehensive source of information for a region’s environmental risk, as Glickman et al.
(1995) note, it is easily obtainable and is often used in environmental justice investigations.
Glickman et al. (1995) define community using five different spatial delineations:
census block group, census tract, municipality, and half-mile and one-mile distance ‘buffers’
around each TRI facility. Averages of census socioeconomic variables for each of these
community zone schemes were calculated, including percent minority, percent living below
the poverty line, percent unemployed, percent over the age of 65, percent under the age of 5,
and other census variables that indicate socioeconomic status or at risk populations.
Glickman et al. (1995) report mixed, sometimes contrary results concerning evidence of
injustice. For instance, when communities are defined by census block groups or tracts, the
percentage of minorities in TRI host communities is not significantly different from that in
non-host communities. However, when municipalities, a generally larger areal unit than block
groups or census tracts, form the basis for defining community, TRI-host communities have
significantly higher proportions of minorities than non-TRI-host communities.
These results mirror those found in other studies and indicate one of the primary
methodological issues in environmental justice research, the spatial delineation of community
and scale of analysis. The CRJ (1987) study was criticized by Anderton et al. (1994) for
using zip codes as the areal unit of analysis because these authors felt that zip codes are too
large to capture the spatial relationship between socioeconomic status and proximity to
hazardous facilities. Instead, Anderton et al. (1994) use census tracts and find that
minorities and the poor are not more likely than non-minorities and the non-poor to live in a
census tract that hosts a hazardous facility. They therefore conclude that their study finds no
evidence of environmental injustice. Significantly, however, these authors did find a positive
relationship between disadvantaged socioeconomic status and proximity to hazardous
facilities within a 2.5 mile radius of hazardous facilities.
Goldman and Fitton (1994), in a follow-up to the original CRJ (1987) study, note that
although the CRJ (1987) and Anderton et al. (1994) studies reach opposite conclusions
about the evidence of environmental injustice, their statistical results suggest a similar, and
somewhat startling, demographic pattern: ‘bands’ of socioeconomically disadvantaged
persons surrounding a ‘core’ of non-socioeconomically disadvantaged persons concentrated
around environmentally hazardous facilities. It is open to debate whether this pattern
represents a typical demographic scenario, or even if it does, whether it is, in fact, evidence
of environmental injustice. However, it is worth noting that the political motivation to
‘objectively’ demonstrate the existence or non-existence of environmental equity often
subsumes and sabotages the analysis itself by biasing the interpretation of analytical results
(Pulido, 1996). Anderton et al. (1994), for example, were funded by Waste Management
Incorporated, a waste industry organization.
GIS innovations in environmental justice research
Other GIS-based environmental justice studies attempt to expand the conventional
approach to the statistical analysis of environmental justice by using the analytical and
display capabilities of GIS. Burke (1993) investigates environmental equity in Los Angeles
by using various mapping and visualization schemes to expose the subtle relationships
between race, class, population density, and the location of TRI facilities. This author finds
evidence that “at a given income level, Hispanics and African-Americans are more likely to
be living in close proximity to TRI facilities than whites or Asians” (Burke, 1993: 50).
Typically, environmental justice studies do not attempt to explicitly define the spatial
distribution of environmental risk as it is an extremely complex task which differs according to
type of facility, type of toxic release, and a host of environmental variables that control the
dispersion of the toxic material through the environment. Instead, most studies simply
consider the people in the ‘community’ (whether defined by census-based areal unit or
distance buffer) that hosts the hazardous facility to be at risk. Chakraborty and Armstrong
(1997) use GIS to improve on the definition of at risk population by delineating the areas
surrounding each toxic facility that are most likely affected by toxic releases based on a
numerical model of toxic dispersion. This model generates a ‘plume’ footprint that defines an
at risk area within which socioeconomic variables may be tallied.
Chakraborty and Armstrong (1997) also explore the impact of using different
representations of population data in environmental justice analyses. Usually, population
data are represented by assignment to polygonal areal units. For distance buffer
approaches to defining community or at risk population, polygonal population data in certain
GIS packages are considered within the distance buffer if any portion of the areal unit
overlaps with the buffer. Chakraborty and Armstrong (1997) refer to this method as the
polygon containment method. This method may lead to misleading calculation of within-
buffer population character since the people living within the overlapping areal unit may in
actuality be concentrated in a particular portion of the areal unit that is not actually within the
distance buffer.
Zimmerman (1994) notes that GIS methods can be developed to partition the
population data assigned to an areal unit that is only partially within a distance buffer into
inside-the-buffer and outside-the-buffer portions based on the percentage of the areal
unit that lies within and without the distance buffer, respectively. Chakraborty and Armstrong
(1997) refer to this method as the buffer containment method. However, this approach
assumes a homogeneous distribution of population throughout the areal unit. An alternative
is to represent population data as assigned to an areal unit centroid point, called the centroid
containment method (Chakraborty and Armstrong, 1997). If the centroid falls within the
distance buffer, the population data for the entire areal unit represented by that centroid is
considered within the buffer. Again, however, error may occur if the centroid falls within the
buffer but the actual population is concentrated in a portion of the areal unit outside the
buffer.
I performed a brief comparative test of each of these population representation
methods in an analysis of the relationship between percent minority and distance to TRI
facility in Delaware county, Pennsylvania. TRI sites in Delaware county are concentrated in
industrial and urban waterfront areas, many of which have high concentrations of minority
populations. Significantly, I found that the polygon containment and centroid containment
methods tended to under-represent the percentage of minorities living in very close proximity
to TRI sites as compared to the buffer containment method. These two former methods
were less sensitive to variation in demographic character at close proximities to TRI facilities
because they tended to incorporate more distantly located, non-minority populations than the
buffer containment method using the same distance buffer. In other words, percent minority
calculations at close proximities were diluted by the inclusion of a larger area with lower
concentration of minority population. Chakraborty and Armstrong (1997) reported similar
findings in their comparison of population representation methods.
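
The three population representation methods compared above can be sketched in a few lines of
Python. This is a minimal illustration, not the procedure used in the test described here: the
rectangular areal units, the facility location, the population counts, and the function names
are all hypothetical, and the share of each unit inside the buffer is approximated by sampling
a grid of points rather than by exact polygon overlay.

```python
import math

def in_buffer(x, y, site, radius):
    """True if point (x, y) lies within `radius` of the facility at `site`."""
    return math.hypot(x - site[0], y - site[1]) <= radius

def share_inside(bounds, site, radius, n=100):
    """Approximate the fraction of a rectangle's area inside the buffer by
    testing the midpoints of an n-by-n grid of sub-cells (an approximation)."""
    xmin, ymin, xmax, ymax = bounds
    dx, dy = (xmax - xmin) / n, (ymax - ymin) / n
    hits = sum(
        in_buffer(xmin + (i + 0.5) * dx, ymin + (j + 0.5) * dy, site, radius)
        for i in range(n) for j in range(n)
    )
    return hits / (n * n)

def percent_minority(units, site, radius, method):
    """Percent minority within the buffer under one containment method."""
    total = minority = 0.0
    for bounds, pop, minority_pop in units:
        if method == "polygon":      # any overlap counts the whole unit
            weight = 1.0 if share_inside(bounds, site, radius) > 0 else 0.0
        elif method == "centroid":   # whole unit counts if its centroid is inside
            cx, cy = (bounds[0] + bounds[2]) / 2, (bounds[1] + bounds[3]) / 2
            weight = 1.0 if in_buffer(cx, cy, site, radius) else 0.0
        elif method == "buffer":     # unit counts in proportion to area inside
            weight = share_inside(bounds, site, radius)
        else:
            raise ValueError(f"unknown method: {method}")
        total += weight * pop
        minority += weight * minority_pop
    return 100.0 * minority / total if total else float("nan")

# Hypothetical data: (bounds in meters, total population, minority population).
SITE, RADIUS = (0.0, 0.0), 500.0
UNITS = [
    ((-250, -250, 250, 250), 1000, 800),   # small unit surrounding the facility
    ((250, -250, 1750, 250), 2000, 200),   # large unit, mostly outside the buffer
    ((-650, -250, -250, 250), 1000, 100),  # unit with its centroid just inside
]
results = {m: percent_minority(UNITS, SITE, RADIUS, m)
           for m in ("polygon", "centroid", "buffer")}
```

With these invented numbers, polygon containment yields 27.5 percent minority, centroid
containment 45 percent, and buffer containment roughly 47 percent, because the first two
methods pull in more distant population with a lower minority share.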
Another very prominent issue in environmental justice research, related to the issue
of defining community, is that of scale of analysis. Scale of analysis concerns both the scope
of analysis, the region that the study covers, and the resolution of analysis, which generally
refers to the choice of areal unit at which demographic data is represented and tallied. For
instance, the CRJ (1987) study was done at the zip code resolution. However, this definition
of resolution is problematic because zip codes (and nearly all census-, or other organization-,
based zonation schemes) vary widely in their areal extent; they are typically much smaller in
urban areas than in rural areas.
This issue of choice of resolution in spatial analysis is associated with what is called
the modifiable areal unit problem (MAUP) in the geographic literature (Openshaw, 1983).
The MAUP refers to the fact that different aggregation and/or zonation schemes for spatial
data may result in vastly different spatial analysis results. The detrimental impact of the
MAUP on the analysis of census data is well established (Fotheringham and Wong, 1991;
Openshaw, 1984). The difference in results between the CRJ (1987) and Anderton et al.
(1994) studies may be attributed in part to the MAUP.
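
A toy example may help make the MAUP concrete. The sketch below aggregates the same
block-level counts under two different zonation schemes and compares percent minority in the
zone that hosts a facility with that in the remaining zones. The eight blocks, their counts,
the facility location, and the function name are invented for illustration and are not drawn
from any of the studies cited here.

```python
def host_vs_nonhost(blocks, zones, facility_block):
    """Return (percent minority in host zones, percent minority in non-host zones).

    blocks: list of (total population, minority population) per block
    zones:  list of lists of block indices; each inner list is one areal unit
    """
    host_total = host_minority = other_total = other_minority = 0
    for zone in zones:
        total = sum(blocks[i][0] for i in zone)
        minority = sum(blocks[i][1] for i in zone)
        if facility_block in zone:
            host_total += total
            host_minority += minority
        else:
            other_total += total
            other_minority += minority
    return (100.0 * host_minority / host_total,
            100.0 * other_minority / other_total)

# Hypothetical strip of eight blocks, 100 people each; the facility is in block 3.
BLOCKS = [(100, 60), (100, 60), (100, 10), (100, 10),
          (100, 10), (100, 10), (100, 10), (100, 10)]
FINE_ZONES = [[0, 1], [2, 3], [4, 5], [6, 7]]      # small areal units
COARSE_ZONES = [[0, 1, 2, 3], [4, 5, 6, 7]]        # large areal units

fine = host_vs_nonhost(BLOCKS, FINE_ZONES, facility_block=3)
coarse = host_vs_nonhost(BLOCKS, COARSE_ZONES, facility_block=3)
```

Under the fine zonation the host unit is 10 percent minority against about 26.7 percent
elsewhere, while under the coarse zonation the host unit is 35 percent minority against 10
percent elsewhere. The underlying population has not changed; only the areal units have.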
A number of authors (Anderton et al., 1994; Glickman et al., 1995) argue that there
exists an ‘appropriate’ areal unit of analysis, or that evidence of environmental equity must
not vary with the scale of analysis in order to be regarded as valid. However, simply
assuming that there is such a thing as an ‘appropriate’ unit of analysis for environmental
justice research immediately violates the principles of the MAUP. Sui (1999) notes that an
environmental justice study done at any one scale or based on one particular areal unit
cannot, by definition, produce a reliable indication of environmental justice or injustice; there
is no such thing as the single ‘best’ or most ‘appropriate’ scale of analysis in environmental
justice research.
A number of authors have suggested that GIS be used to support multi-scale
environmental justice analysis (McMaster et al., 1997; Sui, 1999). I argue that the purpose of
multiscale analysis is not to find the ‘best’ scale of analysis but to investigate how
demographic character and its spatial relationship with environmentally hazardous facilities
varies across scales. This information may indicate the subtle and complex demographic
patterns that lie at the root of the environmental justice debate. As Been (1995) notes,
environmental justice is infinitely more complex than disproportionate numbers of hazardous
facilities being sited in census tracts (or block groups, municipalities, etc.) with a high
percentage of minorities. Rather, environmental injustice should be viewed as a complex
intertwining of various socioeconomic characteristics distributed in certain spatial patterns. It
should be the goal of environmental justice studies to ‘uncover’ these often ‘hidden’ patterns
that are embedded in the social and environmental data that is available.
A case study: environmental injustice in the Philadelphia, Pennsylvania region
Nearly all GIS approaches to environmental justice research have been vector- (as
opposed to raster-) based because most commercial GIS are vector-based (although there
are a growing number of GIS packages offering raster data handling). In addition, most
population and hazardous facility data are also vector-based. However, raster modeling of
population offers many advantages. Principally, the raster-based approach to representing
population allows for data aggregation to nearly any areal unit, facilitates the exploration of
how demographic character varies across scales, and provides the means to create more
informative visualizations of the distribution of demographic character (Bracken, 1993; Martin
and Bracken, 1991).
Here, I describe a combined vector-raster analysis of environmental justice in
southeast Pennsylvania which encompasses the city of Philadelphia (which is identical to
Philadelphia county) and its four closest counties in Pennsylvania: Bucks, Delaware,
Chester, and Montgomery. The goal of this study is to understand the distribution of
socioeconomic character and its spatial relationship to environmentally hazardous facilities. I
hypothesize that socioeconomic character has a strong relationship with proximity to
hazardous facilities; in other words, the socioeconomic character of a location can be
predicted as a function of distance to a hazardous facility. I test this hypothesis by modeling
population as a raster surface. This allows for demographic variables that indicate
population character to be tallied within a series of distance buffers generated from the
hazardous facility locations. Regression is then used to test the strength of the relationship
between socioeconomic character and distance to hazardous facility.
Three demographic variables that are often used in environmental justice analyses
are used to indicate socioeconomic status in this study: number of minorities, number of
people living below the poverty line, and number of people over the age of 25 with a
bachelors or graduate degree. These population data were acquired from the U.S. Bureau of
the Census at the block group level. Data on facilities that store or release toxic materials in
the Philadelphia region were acquired from EPA databases including sites listed in the TRI
database as well as treatment, storage, and disposal (TSD) facility sites listed in the Biennial
Reporting System (BRS) database. Procedures for improving the locational accuracy of
these hazardous facility sites and eliminating redundant database listings were followed
according to Scott et al. (1997).
A variety of procedures for generating population surfaces from areal unit
demographic data have been proposed including areal weighting (Flowerdew et al., 1991),
interpolation from areal unit centroids (Bracken and Martin, 1989), and the use of remote
sensing imagery and dasymetric mapping (Langford and Unwin, 1994). Dasymetric mapping
is a technique that uses ancillary data to redistribute mapped thematic data in a more
accurate and logical way. It is used here to improve upon the methods of population data
representation that are typically used in environmental justice research. The dasymetric
mapping/raster surface generation method described here is a variation on the method
described by Langford and Unwin (1994) and uses urban density classification data derived
from satellite remote sensing to redistribute population within the original block group data
boundaries. This procedure was carried out using ArcView GIS by Environmental Systems
Research Institute (ESRI), Inc.
Urban density data for Pennsylvania were acquired from the Environmental
Resources Research Institute (ERRI) at the Pennsylvania State University. These data were
photointerpreted from Landsat Thematic Mapper (TM) imagery overlaid with a road network
to produce a polygon coverage that partitions the state into areas of high density urban, low
density urban, and non-urban. Note that ‘density’ in this case refers to the degree of
urbanization (i.e. development), not population density. While degree of urbanization is by
no means a perfect proxy for population distribution (Forster, 1985), its utility in modeling
population has been demonstrated in a variety of contexts (Langford et al., 1991; Mesev,
The urban density classification data were converted from vector to raster format with
a grid cell size of 100 meters. This resolution was chosen because it meets the analytical
requirements and yet is not so fine that it interferes unduly with processing time. Each grid
cell was assigned a population value according to three factors: the population of its host
block group, the population density of its urban density classification (derived from empirical
measurement), and the percentage of the area of the host block group occupied by its urban
density classification. This procedure preserves what Tobler (1979) referred to as the
pycnophylactic property: summing the population for all the grid cells within any block group
produces the same population figure as that originally assigned to that block group. The
raster surface generation calculations were carried out primarily in the ArcView GIS Tables
module and can be described mathematically as:
$$PGC_{u,c,b} = \frac{PCT_{u,c,b} \times PBG_{b}}{GC_{u,b}}$$
where
$PGC_{u,c,b}$ = population assigned to one grid cell with urban density classification $u$,
in county $c$, and in block group $b$;
$PCT_{u,c,b}$ = percent of population assigned to urban density classification $u$, in
county $c$, and in block group $b$;
$GC_{u,b}$ = number of grid cells (area in 10,000 sq. meter units) of urban density
classification $u$ in block group $b$;
$PBG_{b}$ = population of block group $b$.
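
The allocation expressed by this equation can be sketched for a single block group as
follows. This is an illustration under stated assumptions rather than the ArcView procedure
itself: the function name, the example counts, and the relative density values are
hypothetical, and the share of population given to each urban density class is assumed here
to be proportional to that class's relative density multiplied by the number of grid cells it
occupies in the block group, expressed as a fraction rather than a percent.

```python
def allocate_block_group(pbg, cell_counts, rel_density):
    """Population per grid cell (PGC) for each urban density class in one block group.

    pbg:         population of the block group (PBG)
    cell_counts: number of grid cells per class in the block group (GC)
    rel_density: relative population density per class (assumed, empirically derived)
    """
    # Assumed weighting: density of the class times the area it occupies.
    weights = {u: rel_density[u] * n for u, n in cell_counts.items() if n > 0}
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("no populated urban density class in this block group")
    per_cell = {}
    for u, w in weights.items():
        pct = w / weight_sum                       # PCT, as a fraction of PBG
        per_cell[u] = pct * pbg / cell_counts[u]   # PGC = PCT * PBG / GC
    return per_cell

# Hypothetical block group of 1200 people covering 100 grid cells.
cell_counts = {"high": 10, "low": 30, "non-urban": 60}
rel_density = {"high": 6.0, "low": 3.0, "non-urban": 0.5}
per_cell = allocate_block_group(1200, cell_counts, rel_density)

# Pycnophylactic check: summing the cells restores the block group population.
restored = sum(per_cell[u] * cell_counts[u] for u in per_cell)
```

Because each class's population is divided evenly among its own cells, summing over all cells
in the block group returns the original block group total, which is the pycnophylactic
property noted above.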
Each demographic variable was distributed homogeneously according to the
distribution of the total population for each block group. Surfaces of percent minority, percent
living below the poverty line, and percent over the age of 25 with a bachelors or graduate
degree were created by dividing the ‘count’ grids for each of these variables by the grid of
total population. For a more thorough description of this areal interpolation technique see
Mennis (forthcoming).
Distance buffers around each hazardous facility were created that described the area
within 500 meters of a hazardous facility, within 1000 meters, and so on up to 10,000 meters,
which encompasses 99.9% of the total population. Percent minority, percent living below the
poverty line, and percent over the age of 25 with a bachelors or graduate degree were then
tallied within each of these distance buffers. Cumulative tallies determine these variables
within 500 meters of a hazardous facility, within 1000 meters, within 1500 meters, etc. while
zone tallies determine these variables within 500 to 1000 meters of a hazardous facility,
within 1000 to 1500 meters, and so on.
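
The distinction between cumulative and zone tallies can be illustrated with a short sketch.
It assumes that each grid cell already carries its distance to the nearest hazardous
facility together with total and minority population counts; the cell values and the function
name are hypothetical.

```python
def tally(cells, step=500, max_dist=2000):
    """Cumulative and zone percent minority by distance band.

    cells: iterable of (distance in meters, total population, minority population)
    Returns a list of (outer distance, cumulative percent, zone percent).
    A cell at exactly a band boundary is assigned to the outer band.
    """
    n_bands = max_dist // step
    zone_total = [0.0] * n_bands
    zone_minority = [0.0] * n_bands
    for dist, total, minority in cells:
        band = int(dist // step)
        if band < n_bands:
            zone_total[band] += total
            zone_minority[band] += minority

    rows = []
    cum_total = cum_minority = 0.0
    for band in range(n_bands):
        cum_total += zone_total[band]
        cum_minority += zone_minority[band]
        zone_pct = (100.0 * zone_minority[band] / zone_total[band]
                    if zone_total[band] else None)
        cum_pct = 100.0 * cum_minority / cum_total if cum_total else None
        rows.append(((band + 1) * step, cum_pct, zone_pct))
    return rows

# Hypothetical grid cells: (distance, total population, minority population).
CELLS = [(100, 50, 40), (400, 50, 30), (700, 100, 40),
         (1200, 100, 20), (1800, 200, 10)]
rows = tally(CELLS)
```

The zone tally for the 500 to 1000 meter band uses only the cells in that ring, whereas the
cumulative tally for 1000 meters pools every cell from the facility outward, so cumulative
values change more smoothly with distance than zone values do.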
The relationship between presence of minorities and distance to hazardous facilities
is presented in figure 1. As distance to hazardous facilities increases, percent minority
decreases, percent living below the poverty line decreases (not shown), and percent over
age 25 with a bachelors or graduate degree increases (not shown). The break in slope at
approximately 5000 meters, evident in the graphs of all the variables, is related to the fact
that 92.0% of the total population and 98.1% of all minorities live within 5000 meters of a
hazardous facility.
Regression tests that predicted percent minority, percent living below the poverty line,
and percent over the age of 25 with a bachelors or graduate degree based on cumulative
distance to hazardous site up to 5000 meters yielded R² values of 0.886, 0.907, and 0.926,
respectively. R² values for these same variables, but predicted by zone distance to
hazardous site up to 5000 meters, yielded values of 0.688, 0.886, and 0.979, respectively.
All results were significant at the 0.001 level. Multiple stepwise regression with cumulative
distance to hazardous facility up to 5000 meters as the dependent variable and the three
demographic variables as independent variables excluded percent minority and percent living
below the poverty line and included percent over the age of 25 with a bachelors or graduate
degree to yield an R² of 0.926 (significant at the 0.001 level). A similar test that predicted
zone distance to hazardous site up to 5000 meters excluded percent living below the poverty
line and included percent minority and percent over the age of 25 with a bachelors or
graduate degree to yield an R² of 0.987 (significant at the 0.001 level). Clearly, the poor,
minorities, and the lesser educated tend to live in closer proximity to hazardous facilities
than the non-poor, non-minorities, and the educated.
Figure 1. The relationship between percent minority and cumulative distance to hazardous
site (top) and zone distance to hazardous site. Panel titles: ‘Percent Minority by Cumulative
Distance to Toxic Site’ (percent minority against cumulative distance to toxic site, km) and
‘Percent Minority by Distance to Toxic Site Zone’ (percent minority against zone distance to
toxic site, km).
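
The simple regressions reported above relate a single demographic percentage to distance.
The sketch below shows how such a fit and its R² might be computed, assuming an ordinary
least squares line; the functional form used in the study is not stated, and the function
name and the example values are hypothetical rather than the study's data.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or ss_total == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_residual = sum((yi - (intercept + slope * xi)) ** 2
                      for xi, yi in zip(x, y))
    return slope, intercept, 1.0 - ss_residual / ss_total

# Made-up values for illustration only: buffer distance (km) and percent minority.
distance_km = [0.5, 1.0, 1.5, 2.0, 2.5]
pct_minority = [41.0, 37.5, 35.0, 31.0, 29.5]
slope, intercept, r_squared = fit_line(distance_km, pct_minority)
```

A negative slope with a high R² in such a fit would indicate that percent minority declines
steadily with distance, but, as the maps discussed next show, it says nothing about where in
the region that decline takes place.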
These results conjure an image in which each hazardous facility is surrounded by
poor, uneducated minorities and that gradually this pattern gives way to wealthier, educated
non-minorities as distance to the hazardous facility increases. However, maps that depict
the distribution of these variables overlaid with the locations of hazardous facilities
demonstrate that this is not at all the case (e.g. figure 2 which shows areas with percent
minority greater than the regional mean of 26%). There are, rather, various ‘clusters’ of
hazardous facilities that appear to correspond to a variety of interrelated historic, cultural,
and infrastructure factors. For instance, many hazardous facilities stretch along the
Delaware and Schuylkill Rivers while others are clustered around population centers.
This apparent, but in fact false, discrepancy between statistical and visual summation can
be attributed to the difference between the measurement of percent and density of
demographic character. For example, while there are areas outside Philadelphia with high
percent minority, nearly all minorities in the Philadelphia region are clustered within certain
neighborhoods of Philadelphia (figure 3). While non-minorities are also clustered around
Philadelphia, they are much less concentrated in specific areas. The same is true with
percent and density of people living below the poverty line. Concerning education, it appears
that while higher education attainment is concentrated in suburban areas, hazardous facilities
are concentrated primarily in urban areas and secondarily in rural areas.
So while hazardous facilities are not necessarily concentrated in poor, uneducated, and
minority portions of the greater Philadelphia region, these portions of the population are
concentrated in one particular area, the city of Philadelphia. Because the city of Philadelphia
is home to one of many clusters of hazardous facilities, nearly all those of unempowered
socioeconomic status are in relatively close proximity to hazardous facilities compared to
other persons of the Philadelphia region. However, there are many non-poor, non-minorities,
and educated persons who are also in relatively close proximity to hazardous facilities.
Figure 2. The location of hazardous facilities relative to percent minority.
Figure 3. Density of minorities and non-minorities in the Philadelphia region.
Further statistical analysis and mapping/visualization may reveal other
demographic/hazardous facility patterns. For example, spatial autocorrelation measures
may indicate the degree of socioeconomic regionalization at a variety of scales. Cluster
analysis and point pattern analysis of the hazardous facility data may show a statistical
relationship between demographic character and spatial clusters of facilities. Choropleth and
bivariate mapping schemes, as well as cartograms, can be used to further visually
investigate the demographic patterns embedded in the data.
This paper is intended as both a caution and an encouragement for the use of GIS in
environmental justice research. On the caution side, the data representations that are
embedded within GIS present potential pitfalls to researchers who do not explicitly
acknowledge how GIS data and methods of analysis can control analytical results. While the
issue of making explicit an investigation’s analytical assumptions exists for nearly any
analysis, whether using GIS or not, the ease of use of many GIS often serves to hide this
issue from the casual user. On the encouragement side, however, GIS provides an
environment for creating new and innovative ways of investigating environmental justice.
The use of raster representations of population and environmental risk and the use of
advanced spatial statistical and visualization techniques hold particular promise in moving
environmental justice research forward towards a more exploratory, pattern recognition
