GIS Data Quality Assessment

10/17/2013
Introduction to Geographic Information Systems (UEP 0232-01)
Barbara Parmenter
Fall 2013
GIS Data Quality Assessment: Expansion of the Green Line –
What to consider when implementing a new station at College
Avenue/ Boston Avenue, Medford?
(Assignment 4)
Description of the project
With the extension of the Green Line to College Avenue/Boston Avenue in Medford, questions
arise about how the project can be made to work. The new stations will improve public
transport in areas that are so far underserved; one of the beneficiaries would be Tufts
University. It is less clear, however, what needs to be done in the affected area to make the
station work. This concerns, for instance, current land uses, existing infrastructure, and
buildings around the future station, all of which probably need to be revised to fit the new
station. To give some examples: once the station opens, a higher share of pedestrians and
cyclists can be expected, for whom appropriate infrastructure needs to be built. A more vibrant
street life around the new station will also likely require more commercial use than there
currently is, so the existing buildings and their current purposes need to be reviewed. The new
station may pose additional difficulties because it is located very close to the boundary
between Somerville and Medford. Because responsibility could be blurred here, I pay special
attention to this issue. In the following I give an overview of the layers I considered
potentially helpful and assess whether they are really useful for the purpose of my project.
Potential Questions
Before the area can be revised for future changes, several things need to be sorted out regarding
the cities' boundaries, existing rail lines, existing buildings, existing road centerlines, and
current land uses. This is necessary because GIS layers differ in quality, e.g. in resolution,
accuracy, and currency. The following questions could arise from analyzing these features:
1. Where do different layers locate the existing road centerlines? Are any differences
visible when map layers from different clearinghouses are used? This could be essential when
roads have to be removed, restructured, or simply narrowed in order to fit
in the new station.
2. Where are hydrographical features located in the area surrounding the future station?
Can any hazards be identified that could seriously affect the station? In case of
heavy rain, for instance, flooding could destabilize the area around
the river. Thus, we need to make sure that the new station will be located at a
considerable distance from any hydrographical features.
3. Where is the political boundary between Medford and Somerville? Are there any areas
where the two municipalities overlap, or areas that belong to
neither municipality? This could be important when changes have to be made,
because it has to be clear which municipality would be responsible for undertaking them.
4. Where are the existing rail tracks located? This will be essential in order to implement
the new rail tracks needed for the Green Line. Since the two new tracks are going to be built
in addition to the old ones, we need a certain accuracy in order to know how much extra space
is needed. This holds especially true for the area of the new station, since extra space will
be needed to place the station between the two new tracks (see picture).
5. Where does ArcGIS locate existing buildings? In order to know how much space is available,
the planners need to know exactly where the existing buildings are (see picture). It is also
important to know which buildings exist in the area surrounding the new station in order to,
potentially, remove them. Thus, questions such as the sizes and heights of these buildings
could be of interest.
6. What are the current land uses in the area around the new station? In order to make the
area more lively (compare Davis Square), there is probably a need for more commercial
use. Currently, however, many buildings are used by Tufts University or have a
residential purpose. Thus, it would be interesting to figure out where more
commercial use could be implemented.
Discussion of Different Data Layers
1.) Street Centerlines
For the street centerlines I used layers from two different clearinghouses, one based on
TIGER 2000 and one from the cities of Medford and Somerville.
As can be seen, the layers created by the cities of Medford and Somerville fit the orthophoto
provided by MassGIS perfectly. The centerlines are located where the actual streets are. What
can be seen, though, is that the lines do not match exactly where they should meet (where
Somerville and Medford abut). For our project this is not too serious, because the station will
not be built at the border of the two municipalities. If any restructuring of College Avenue
needs to be done, though, this error should be taken into consideration.
Also, both data sets are considerably old: the one for Medford dates from 1990-1999 (“The
exact date of this dataset is not certain, but its ground condition is from 1990 – 1999.”) and the
one for Somerville from 2002. With regard to the positional accuracy, the City of Somerville
mentions: “The center of physical roadway pavement may or may not represent the center of
the road right of way. Road right of ways may taper or change width.” In general the two datasets
seem quite suitable for the purpose of my project, although little information is available on
positional accuracy or other constraints. Another, rather negative, aspect is how little
additional information comes with this layer. The only attribute that could be of use is the
length of the street; there is nothing else useful in this layer (e.g. street types that could
help identify the actual use and throughput).
If we compare this to the layer based on Census 2000 (TIGER), we see that the positional
accuracy of the TIGER data is not very good. The lines are quite far away from the actual
streets and in some sections overlap with buildings or green spaces. This cannot be explained
by the year the data was collected (2000), as if the streets had been located differently at
that time; it is solely due to the poor positional accuracy. Using this data layer could lead
to seriously problematic outcomes, because the lines are not reliable for any sort of
micro-level planning.
With regard to the positional accuracy the metadata says: “The positional accuracy varies with
the source materials used, but generally the information is no better than the established
national map Accuracy standards for 1:100,000-scale maps from the U.S. Geological Survey
(USGS). […] The level of positional accuracy in the StreetMap files is not suitable for high-precision
measurement applications such as engineering problems, property transfers, or other uses that
might require highly accurate measurements of the earth's surface.” I assume that my project
falls under the latter category, which makes it difficult to get any use out of this layer.
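To make the 1:100,000 figure concrete, the following minimal Python sketch converts a map scale into the ground tolerance implied by the U.S. National Map Accuracy Standards (1/30 inch at map scale for scales larger than 1:20,000, 1/50 inch otherwise). The function names are my own, and the sketch assumes that the metadata statement refers to this horizontal rule; it is an illustration, not part of the TIGER metadata.

```python
INCH_IN_M = 0.0254   # exact definition of the international inch
FOOT_IN_M = 0.3048   # exact definition of the international foot


def nmas_horizontal_tolerance_m(scale_denominator: int) -> float:
    """Ground distance (metres) of the NMAS horizontal tolerance.

    NMAS allows 1/30 inch at publication scale for scales larger than
    1:20,000 and 1/50 inch for scales of 1:20,000 or smaller.
    """
    map_inches = 1 / 30 if scale_denominator < 20_000 else 1 / 50
    return map_inches * scale_denominator * INCH_IN_M


def metres_to_feet(metres: float) -> float:
    """Convert metres to international feet."""
    return metres / FOOT_IN_M


if __name__ == "__main__":
    # 1:100,000 is the TIGER reference scale quoted above; the other
    # scales are the zoom levels used elsewhere in this report.
    for denom in (100_000, 6_000, 1_200, 800):
        tol_m = nmas_horizontal_tolerance_m(denom)
        print(f"1:{denom:,} -> {tol_m:.1f} m ({metres_to_feet(tol_m):.1f} ft)")
```

Under these assumptions a 1:100,000 source corresponds to a tolerance of roughly 50 m on the ground, which is wider than most of the streets in question and supports the conclusion that the layer is unsuitable for micro-level planning.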
What has to be mentioned, though, is that the TIGER metadata provides much more detailed
information, e.g. on how the data was processed and where it comes from. The layer also provides
attribute information on the street type (e.g. highways, major roads, local roads, minor roads,
etc.). Hence, one could make assumptions about the actual throughput and importance of a street
if any changes were considered. However, since the positional accuracy of this layer is, as
shown, poor, we should not use it for any planning on a micro-level.
2.) Hydrography
For the hydrography I used layers from MassGIS and the National Atlas (ESRI Data Maps). The
first layer, created by MassGIS, covers the hydrography surrounding the new station quite well.
However, some places where, according to the basemap, there should be river are not covered.
Regarding the positional accuracy, the metadata says: “Areas within many surface water supply
watersheds have been enhanced by using higher resolution streams and lakes from the MassDEP
Wetlands datalayer, many areas have also been field verified.” It also covers additional
information such as how the data was collected and processed. The layer was created in 2010 and
is thus quite current. All in all, the metadata provides good and current information, and the
layer shows good positional accuracy, covering most of the actual water areas. Any buffer areas
that need to be calculated would thus in most cases be quite precise. The only negative aspect
is that additional attribute information is missing: we do not know how much water the river
carries, how likely flooding is, etc. This would be useful information for building the station
in a safe environment.
The other layer shows the National Atlas Water Feature Lines (ESRI Data Maps). As can be seen,
the layer is quite inconsistent, changing from polygon to line quite randomly. More or less
from the point where the highway crosses the Charles River, the river does not continue as a
polygon but merely as a line. Because of this, the distance from the location of the new station
to the river would be calculated quite imprecisely, and buffer zones would not represent the
real distances. With respect to the positional accuracy, the metadata says: “The geospatial part
of this data set was originally extracted from the individual 1:2,000,000-scale State boundary
Digital Line Graph (DLG) files produced by the U.S. Geological Survey which have a positional
accuracy of 1,720 meters. It was updated several times using various sources whose horizontal
positional accuracies are unknown. […] Largest scale when displaying the data is 1:1,000,000”.
In addition to this positional inaccuracy, the data is dated: it was collected from 1995-2002.
On the other hand, the differentiation of water types (swamps, rivers, channels, ponds, etc.)
would be useful. However, in our area only the Charles River seems to be in closer proximity to
the future station.
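The effect of a line-versus-polygon representation on distance measurements can be illustrated with a minimal Python sketch. The coordinates, the 80 m river width, and the function names are hypothetical and are not taken from either layer; the point is only that a distance measured to a centerline exceeds the distance to the nearest bank by about half the river width.

```python
import math

Point = tuple[float, float]


def point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the line, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_polyline(p: Point, line: list[Point]) -> float:
    """Shortest distance from p to any segment of a polyline."""
    return min(point_segment_distance(p, a, b) for a, b in zip(line, line[1:]))


if __name__ == "__main__":
    station = (0.0, 500.0)                        # hypothetical station location (m)
    centerline = [(-1000.0, 0.0), (1000.0, 0.0)]  # river drawn as a line
    near_bank = [(-1000.0, 40.0), (1000.0, 40.0)] # edge of an 80 m wide polygon

    d_line = distance_to_polyline(station, centerline)
    d_bank = distance_to_polyline(station, near_bank)
    print(f"to centerline: {d_line:.0f} m, to bank: {d_bank:.0f} m")
```

In this made-up case the line representation overstates the distance to the water by 40 m, so a buffer drawn around the line would be correspondingly too small.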
In conclusion, the MassGIS layer provides more current and more accurate data than the one from
the National Atlas. However, it does not allow us to distinguish between different
kinds of hydrographical features. Whether that distinction would be useful for our project is
another question. In any case, further investigation would be needed in order to make reliable
predictions about flooding probabilities.
3.) Political Boundaries
For the political boundaries I used the data provided by both the City of Medford and the City
of Somerville. As can be seen on the picture to the right, there are several shortcomings when
it comes to the positional accuracy of the layers. First of all, there is a mismatch concerning
an area west of Tufts: approximately four houses are not covered by either municipality. It
would thus be unclear which city is responsible if any changes needed to be made in this area
(admittedly, this area is quite far away from the project site). When we look closer at the
outermost area of the cities, we also see quite a long stretch that is covered by both
municipalities, i.e. an overlap of the political boundaries. If any changes needed to be made
because of the new station, it would not be clear who is in charge. These areas are located
quite close to the new station, which makes it very likely that changes in land use or zoning,
for instance, will be needed there. Regarding the quality of the data, zooming in to a 1:6000
scale (about 20 feet +/- accuracy) still shows some overlaps between the two cities. Thus, the
data quality is not the highest (no further information on the positional accuracy is provided
in the metadata).
In terms of currency, the data for Medford was collected between 1990 and
1999 (“The exact date of this dataset is not certain, but its ground condition is from 1990 - 1999.”),
whereas the data for Somerville is from 2005. Some errors, or rather mismatches,
in positional accuracy may also stem from this time lag. I can only speculate, but it seems
quite unlikely to me that the boundaries themselves changed during the last
decades. More current data layers might, however, be of higher quality and thus give a
better indication of where the actual boundaries are.
With respect to the attribute accuracy, the layers provide very little extra
information. While both cities provide data on the size of their area, only
Medford gives information on its population. Yet it would be useful for me to know how
many people live in the two cities if I were to consider how many people would be
affected by the new station. Thus, additional data and clearinghouses would need to be
consulted.
4.) Existing Rail Tracks
To show where the existing rail tracks are, I used the layer provided by MassGIS. In order to
implement the new rail tracks for the Green Line we need to know exactly where the old tracks
are located. This holds especially true for the area of the new station, because extra space is
needed here in order to place the station between the two new tracks. As can be seen on the
picture to the right, the overall positional accuracy is very good. The illustrated tracks
represent the existing ones very well. In order to make more detailed statements, however, we
need to zoom in.
On the picture to the right, we see that even at a scale of 1:800 (approximately 2 feet +/-
accuracy) the existing tracks are perfectly represented. Taking also into consideration that the
data layer is quite new (from 2008, according to the metadata), we can make good use of this
MassGIS layer. For instance, we could calculate the actual width that is available and the width
needed for implementing the station and the two new tracks. All this is clearly visible when
zooming further into the map.
The layer also provides further information that could be useful when building the new
tracks. The metadata says: “The layer includes active passenger, freight, and MBTA Commuter
Rail and Rapid Transit railways, along with abandoned rail lines. In many instances there is more
than one track per rail line, and rail yards and spurs are included. […] CTPS added several
attributes pertaining to type of service, MBTA Commuter Rail status, rail line ownership, and
freight and passenger operation.” Thus, if we wanted to know what kind of lines are running here
(type of service) and to whom they belong (ownership), we could look this up in the layer’s
attributes. This could be useful when consulting the operator in order to sort out what the future
use of the area will look like. Assuming that the current rail lines will be affected in some way
by the implementation of the Green Line, it is useful to know that the line is run by the MBTA
and used by the MBTA and Amtrak. There are apparently two tracks in use and one that is abandoned
or no longer in use. Further investigation would be needed in order to find out what this third
track’s purpose is.
5.) Existing Buildings
To show where the existing buildings are, I again used a data layer from MassGIS. With this
layer I wanted to find out how much space is available for the new development of this place.
Looking at the picture to the right, we see that from a zoomed-out perspective the buildings
seem to be covered quite well. Although two buildings at the bottom are not represented that
well (both of them belonging to Tufts University), these are not of much concern, since they are
too far away from any restructuring. Zooming further into the map (picture to the right, scale
of 1:800, approximately 2 feet +/- accuracy), we see that some buildings of concern are not
represented very well. Thus, suggestions for the planned Burget Neighborhood Path could be based
on erroneous measurements. Also, on the right of the picture we see something that is not easily
identifiable. A quick look at Google Maps suggests, however, that there is a building at this
location. Implementing the new Green Line tracks could have an effect here, since the MBTA
probably needs some space for expansion (be it for the tracks, the station, or some other
services). Thus, having the building on the map would be useful for future planning.
As for the currency, the data was collected in 2002. It is thus possible that some building
extensions were made after that time (for instance, on the Tufts buildings that are not
represented precisely). However, the data seems mostly up to date and can thus be used
for a general overview, keeping in mind the misrepresented buildings noted above.
What other information can be deduced from the metadata? There is a fairly exact description of how
the data was acquired and processed (installation of a camera in an aircraft, flights and reflights,
calibration, edge matching, etc.). The metadata says: “For additional accuracy verification, static survey
points were collected, using static benchmarks where available. Thirty-four survey points within the
project boundary were selected to allow a statistical absolute elevation verification of the data. This data
set was then statistically compared to the project LIDAR DEM data after the combination of flight lines to
verify accuracy both horizontal and vertical. The RMSE (Root Mean Square Error) of the LIDAR DEM was
calculated using the ground GPS data to ensure that the vertical error was less than 0.15 m.” Thus, we
find very good background information on how the data was produced and could potentially detect
where errors were made.
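The RMSE check described in the metadata can be sketched in a few lines of Python. The elevation values below are invented for illustration (the metadata does not list the thirty-four survey points), and the function name is my own; only the 0.15 m threshold comes from the quoted text.

```python
import math


def rmse(measured: list[float], reference: list[float]) -> float:
    """Root mean square error between measured and reference values."""
    if not measured or len(measured) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - r) ** 2 for m, r in zip(measured, reference)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical elevations in metres: LIDAR DEM vs. ground GPS survey.
    lidar_dem = [10.10, 12.05, 8.90, 15.20]
    ground_gps = [10.00, 12.00, 9.00, 15.10]

    error = rmse(lidar_dem, ground_gps)
    print(f"vertical RMSE: {error:.3f} m, within 0.15 m: {error < 0.15}")
```

Because the RMSE squares each difference before averaging, a few large errors weigh more heavily than many small ones, which is why it is commonly used as a single summary of vertical accuracy.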
However, the layer contains very little additional information that could help answer our questions about the new
station. The only extra attributes provided are the shape size and the shape length. With these we can at least
say how much space is actually covered by the buildings (naturally, only if the measurements are
accurate). To give just one example, in order to implement the Burget Neighborhood Path we would
need to know precisely where the buildings are, how much space they require, and how much space can
be used for things like trees, greenery and, maybe, bike racks. Additional information, like the height of
the buildings, is unfortunately missing. If available, it could have been used to plan the Path more
carefully and make it fit better into its surroundings.
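How a footprint's size and outline length follow from its vertex coordinates can be shown with a minimal Python sketch using the shoelace formula. It assumes that the layer's two shape attributes correspond to footprint area and perimeter, which the metadata excerpt does not confirm; the rectangle and the function name are hypothetical.

```python
import math

Point = tuple[float, float]


def footprint_area_and_perimeter(vertices: list[Point]) -> tuple[float, float]:
    """Planar area (shoelace formula) and perimeter of a simple polygon.

    Vertices are given in order around the outline, without repeating
    the first vertex at the end; units follow the input coordinates.
    """
    n = len(vertices)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]   # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(twice_area) / 2.0, perimeter


if __name__ == "__main__":
    # Hypothetical 20 m x 10 m building footprint in projected coordinates.
    building = [(0.0, 0.0), (20.0, 0.0), (20.0, 10.0), (0.0, 10.0)]
    area, length = footprint_area_and_perimeter(building)
    print(f"area: {area:.0f} m^2, perimeter: {length:.0f} m")
```

Any positional error in the vertices carries over directly into both values, so the derived areas are only as reliable as the digitized outlines.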
6.) Current Land Uses
In the last section I wanted to look more closely at the current land uses, again using a MassGIS
data layer. I wanted to find out which land uses are most common
around the station and whether it is possible to implement more commercial use. My
assumption was that many buildings are already used by Tufts University and
that others have a residential purpose, but that more commercial use will be needed
once the new station is introduced.
I mapped the area around the new station according to the different land uses. The metadata
says: “The MassGIS Land Use datalayer has 37 land use classifications interpreted from 1:25,000
aerial photography. The minimum mapping unit used was one acre.” I reduced the categories to
the relevant ones that can be found in this area. As can be seen on the picture to the right,
most of the area surrounding the new station is defined as residential (yellow). In closer
proximity, naturally, most buildings are institutional, because they are owned by Tufts
University (blue). The recreational areas (green) in the immediate surroundings also form part
of Tufts. As a central axis along Boston Avenue/College Avenue there is a strip of commercial
use (red) that looks rather marginal on the map. In addition, it stretches out as a long strip
instead of forming a cluster (see, as one example of the latter, Davis Square). Further
investigation would be needed in order to figure out what the piece of open land (beige) behind
the sports facilities is and whether any use could be made of it.
When we zoom in a little further, to a scale of 1:1,200 (approximately 3.5 feet +/- accuracy),
the red strip on the picture gives the impression of quite a big area. However, the red strip
mostly covers only the area where the tracks run. Apart from that, there is actually not much
space left. Also, only very few buildings seem to be inside the commercial use zone. In closest
proximity to the new station there are only the three buildings below it. Although we get an
idea of what the land is dedicated to, these maps do not show any more precise information on
different types of commerce. If we wanted to know what kinds of shops, grocery stores, or
restaurants we could find in this area, the MassGIS layer would not help us, because it does not
allow any further differentiation within the category of commercial use (unlike, for instance,
residential use). Thus, the land use layer is too coarse to draw any deeper conclusions on how
many shops there are, what sizes they have, and which chains are located here. To do so, we
would need further information, e.g. business data obtained through geocoding, in order to make
some assumptions about the existing and potential commercial activity in this area.
Considering the positional accuracy, the different zones are quite imprecise. At the top
of the latter picture we see that the boundary between two land uses, institutional and commercial, cuts through the
buildings owned by Tufts. Therefore, the map does not represent the actual status of these buildings. Also,
in the lower part of the picture we see that the red strip makes a curve into an area where there is no
commercial use at all, only the Tufts sports facilities. Thus, at this scale we encounter some errors.
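One possible contributor to these imprecise zone boundaries is the one-acre minimum mapping unit stated in the metadata. The minimal Python sketch below converts that unit into metric terms and checks a made-up building footprint against it; the 600 m² footprint and the function names are hypothetical, and the sketch does not claim to reproduce how MassGIS actually delineated the zones.

```python
import math

ACRE_IN_M2 = 4046.8564224   # exact size of one international acre


def mmu_square_side_m(mmu_acres: float = 1.0) -> float:
    """Side length (metres) of a square with the area of the mapping unit."""
    return math.sqrt(mmu_acres * ACRE_IN_M2)


def below_mmu(area_m2: float, mmu_acres: float = 1.0) -> bool:
    """True if a feature is smaller than the minimum mapping unit."""
    return area_m2 < mmu_acres * ACRE_IN_M2


if __name__ == "__main__":
    side = mmu_square_side_m()
    print(f"one acre is a square of about {side:.1f} m per side")

    footprint_m2 = 600.0   # hypothetical single building footprint
    print(f"600 m^2 building below mapping unit: {below_mmu(footprint_m2)}")
```

A single building is typically much smaller than a square of roughly 64 m per side, so a layer generalized to this unit cannot be expected to follow individual building outlines.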
With respect to the currency, the data was last revised in 1999. Although
most of the uses have probably not changed over time (e.g. institutional, recreational, and residential), it
would naturally be better to find more current data. One feature that could be interesting for a
comparison over time is that one can see how the land uses have changed since 1971. This could be especially
interesting if land uses are going to be changed again in order to implement the new station in 2020.