Assignment Five - Data Quality

Karen Shilo
October 14, 2008
GIS Assignment Five
Barbara Parmenter
GIS Data Quality Assessment
Table of Data Layers:
No. | Category | Layer | Time Period | Source
1 | Roads | Census2000 Tiger Arc Data Set | Publication date: January 2003 | http://www.mass.gov/mgis/cen2000_tiger.htm
2 | Roads | StreetMap USA | "Enhanced" from the 2000 census Tiger roads data | M: drive under \Country\USA\ESRIDataMaps906\streetmap_usa
3 | Roads | Road Centerlines (Mass GIS) | ca. 2005 | M:\state\ma\MassGIS\Infrastructre\EOTMAJROADS_ARC.shp
4 | Hydrography | Hydrography from the Census TIGER data set | not listed on MassGIS website | M:\state\ma\MassGIS\Census\CENSUS2000TIGERHYDRO_POLY.shp
5 | Hydrography | MassGIS Hydro 25k water bodies | ca. 2005 | M:\state\ma\MassGIS\Physical_Resources/HYDRO25K_Poly.shp
6 | Land Use | MassGIS land use | ca. 1999 | S:\classes\UEP_ENV\MassGIS_landuse_layer_files\21Landuse_Solid_Color.lyr
7 | Land Cover | National Land Cover Data (NLCD) | 2001 | S:\classes\UEP_ENV\NLCD landcover layer file\landcover.lyr
8 | Subway Nodes | MassGIS Subway Nodes | not listed on MassGIS website | M:\state\ma\MassGIS\Infrastructre\MBTA_Node.shp
9 | Police Stations | MassGIS Police Stations | Updated February 2007 | M:\state\ma\MassGIS\Infrastructre\PoliceStations_PT_MEMA.shp
10 | Orthophoto | Cambridge Orthophoto Imagery | 2003 | M:\city\cambridge\cambridgeGIS\Imagery\orthophotos2003\cambridge_sid\band_1
1. Provide one-paragraph description of the project you are using as a benchmark to assess the
data and what positional accuracy it will require (or what is good enough - think about how far
off the position could be and still work for the project needs).
The impetus behind my project is to learn more about the urban transit system in Cambridge. I am
using the following data layers: roads, hydrography, land use, land cover, subway nodes, and
police stations, and I assess the quality of each against the Cambridge orthophoto imagery. The
roads and hydrography should be positionally accurate to within 5 meters, as the roads provide
the framework upon which the other data layers are built. Land cover and land use should also be
accurate to within approximately 5 meters, as this distance could make the difference between
pavement and grass, or determine whether a bench can be built next to a subway stop. I feel that
the subway nodes (stations) should be 100% positionally accurate, because all distances will be
measured relative to these Cambridge subway stations. The police stations should also be accurate
to within about 5 meters. Determining how far the subway stops should be from the police
stations, on the other hand, would be a separate analysis involving response time, distance
between subway stops, and possibly other variables.
2. Briefly discuss the three different road centerline data sets in terms of their positional relation
to each other (look at how far apart they are at different points using the measure tool in
ArcGIS, and if there is consistency in the differences). Include some graphic examples to
illustrate your points. Which data set would be best for your project?
This GIS map, centered on the Kendall/MIT area, shows the relationships between the three road
centerline data sets: Census 2000 Tiger Roads (in red), Enhanced Census StreetMap USA roads (in
pink), and MassGIS road data (in blue). The blue MassGIS linework follows the roads in the
Cambridge orthophoto quite accurately. The red and pink lines, interestingly, overlap each other
and are both quite far off. For example, the distance between the MassGIS road and the Census
Tiger road was measured in ArcMap to be 58 ft. One wonders, therefore, why the Enhanced Census
StreetMap USA layer is titled 'enhanced' if it appears to be the same as the Census 2000 Tiger
Roads layer. For these reasons, the MassGIS road data would be the best fit for the project.
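To compare the 58 ft road offset with the 5-meter requirement stated in question 1, both figures need to be in the same unit. The snippet below is a minimal sketch of that conversion, assuming the measure tool reports international feet (0.3048 m exactly); the names `feet_to_meters` and `TOLERANCE_M` are illustrative and not part of ArcGIS.

```python
FEET_TO_METERS = 0.3048  # exact length of one international foot in meters (assumed unit)
TOLERANCE_M = 5.0        # positional tolerance for roads stated in question 1


def feet_to_meters(feet: float) -> float:
    """Convert a distance in feet to meters."""
    return feet * FEET_TO_METERS


# Offset between the MassGIS road and the Census Tiger road, as measured in ArcMap.
offset_m = feet_to_meters(58.0)
within_tolerance = offset_m <= TOLERANCE_M

print(f"Offset: {offset_m:.1f} m, within {TOLERANCE_M} m tolerance: {within_tolerance}")
```

Expressed in meters, the offset is roughly 17.7 m, well outside the 5-meter requirement, which supports the choice of the MassGIS layer.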
3. Do the same as above for the two hydrography layers.
Below is the Kendall/MIT area along with the Census Tiger Hydrography Layer.
The Census Tiger Hydrography is off by approximately 72 meters where Memorial Drive passes under
the Longfellow Bridge; the underpass is not visible at all in the Census Tiger Hydrography data
set. Broadway Ave., which continues onto the Longfellow Bridge, is also cut off by the Charles
River.
The data discrepancy can also be seen in the following map – an orthophoto showing MassGIS
road data underneath a 30 percent transparent Census Tiger Hydrography Layer.
One can compare the Census Tiger Hydrography Layer to the MassGIS Hydro 25k Layer shown
below.
In the MassGIS 25K Hydro Layer above, the same data discrepancy appears. Both the Cambridge
Parkway Connector and Main Street are completely covered by the MassGIS 25K Hydro Layer. However,
the MassGIS 25K Hydro Layer is more accurate in the sense that it does show the Broad Canal, the
fingerlike water body extending into Cambridge. When measured in ArcMap, the Broad Canal in this
data layer is within approximately 5 meters of its position in the orthophoto.
4. Can you provide a quantitative assessment of positional accuracy for each of your data layers
(e.g., +/- 20 feet)? Why or why not?
In the data layers explored above, I have measured the following positional accuracy:
No. | Category | Layer | Approximate Positional Accuracy Relative to Orthophoto
1 | Roads | Census2000 Tiger Arc Data Set | 58 feet
2 | Roads | StreetMap USA | 58 feet
3 | Roads | Road Centerlines (Mass GIS) | 0 feet (accurate alignment)
4 | Hydrography | Hydrography from the Census TIGER data set | 72 meters
5 | Hydrography | MassGIS Hydro 25k water bodies | 5 meters
Beyond my own measurements, it would likely be difficult to find published positional accuracy
figures for these data layers, perhaps because publishing them would raise questions about
accuracy that the organizations could not fully answer. This makes it all the more important to
compare data layers against one another for inaccuracies in any project.
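One way to turn several measure-tool readings into a single figure of the "+/- 20 feet" kind is a root-mean-square error (RMSE) over a set of check points. The sketch below only illustrates the arithmetic. The function name `rmse` and the sample offsets are hypothetical and are not measurements from this assignment, which took a single reading per layer.

```python
import math


def rmse(offsets):
    """Root-mean-square of measured offsets, all given in the same unit."""
    if not offsets:
        raise ValueError("need at least one measured offset")
    return math.sqrt(sum(d * d for d in offsets) / len(offsets))


# Hypothetical offsets in meters at three check points (illustrative values only).
sample_offsets_m = [3.0, 4.0, 5.0]

print(f"RMSE: {rmse(sample_offsets_m):.2f} m")
```

With more check points per layer, a figure like this would describe the typical offset rather than the offset at one location.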
5. Give a qualitative assessment of positional accuracy of each of the four optional layers
relative to the other layers (e.g., do streets run through buildings? are schools in the correct
location along a road?).
The following is a map featuring MassGIS Land Use along with the Cambridge Orthophoto
Imagery and MassGIS Major Roads.
Notice that the Commercial Land Use area (red) overlaps both Main and Ames streets. For this
reason it appears that the Land Use data is slightly shifted to the southeast. Furthermore, the
Industrial Land Use area overlaps with the Broad Canal by approximately 11 meters.
The following map features the National Land Cover Data Layer from 2001 along with the
Cambridge Orthophoto Imagery and MassGIS Major Roads.
Medium Intensity Development is shown in light purple, while High Intensity Development is
shown in dark purple. Notice that there are great discrepancies between the National Land Cover
Data Layer from 2001 and the Cambridge Orthophoto Imagery.
Box 1: High Intensity Development cuts midway through a building.
Box 2: High Intensity Development cuts a circular plaza.
Box 3: High Intensity Development cuts through a parking lot.
Box 4: High Intensity Development cuts through major roads.
The National Land Cover Data Layer from 2001 appears to be off by as much as 20 to 60 meters.
The following map features MassGIS MBTA Subway Nodes along with the Cambridge
Orthophoto Imagery and MassGIS Major Roads.
All three subway stations have good positional accuracy as one can see that the nodes are located
in their correct street locations.
One can see this again below in a closer view of the Kendall/MIT subway station:
The Kendall/MIT node is in fact almost perfectly placed in this data layer: it is not directly in
the middle of Main Street but rather on the sidewalk, next to where cars are parked, where the
station is actually entered. This is exactly how the subway is reached in reality (and it is the
station I use for work)!
The following is a map showing police stations and their proximity to the MassGIS MBTA
Subway Nodes.
It is interesting to see that the only two police stations in the area are located 286 and 153
meters from the Lechmere and Central subway stops, respectively. It should also be noted that
Kendall/MIT does not have a police station in close proximity.
The positional accuracy of these police stations can be further viewed on the following page:
The police station location in the MassGIS Police Stations Layer appears to be right on target,
with the marker hovering directly over the building.
6. Are these optional layers appropriate for your project in terms of their positional accuracy?
Below is a table of the layers explored above.
Category | Layer | Approximate Positional Accuracy Relative to Orthophoto
Land Use | MassGIS land use | ~11 meters
Land Cover | National Land Cover Data 2001 | ~20 to 60 meters
Subway Nodes | MassGIS Subway Nodes | Positionally Accurate
Police Stations | MassGIS Police Stations | Positionally Accurate
Given the approximate positional accuracies listed above, the land use, subway node, and police
station layers would be suitable for a project seeking to improve the urban transportation spaces
in Cambridge. Since the National Land Cover Data Set can be between 20 and 60 meters off, I may
want to use a different land cover data set if a more accurate one exists. Another option would
be to augment this data with land surveys or other data published by the City of Cambridge. If
these resources were unavailable, the National Land Cover Data Set could perhaps still be used
for this project, provided the discrepancies in its positional accuracy were explained.
7. Completeness: Is each data set complete? (Does it cover the area in question, are all relevant
features present, and is the attribute information complete for all features?)
Opening up the attribute table for each of the following layers, I found the following:
No. | Category | Attribute Information
1 | Roads | All Present
2 | Roads | All Present
3 | Roads | All Present
4 | Hydrography | All Present
5 | Hydrography | All Present
6 | Land Use | All Present
7 | Land Cover | All Present
8 | Subway Nodes | All Present
9 | Police Stations | All Present
10 | Orthophoto | All Present
All elements were complete for each of the attribute tables in these data sets.
8. Currency: Are the data up to date? How do you know the answer to this?
I retrieved the dates in the table below largely from MassGIS and other data layer websites. The
dates, ranging from about 1999 to 2007, give me the sense that the data sets are not updated very
often, and it is rare to find a data set from the current year. The oldest data sets, from 1999
to 2001, could certainly raise questions about data currency.
No. | Category | Time Period
1 | Roads | Publication date: January 2003
2 | Roads | "Enhanced" from the 2000 census Tiger roads data
3 | Roads | ca. 2005
4 | Hydrography | not listed
5 | Hydrography | ca. 2005
6 | Land Use | ca. 1999
7 | Land Cover | 2001
8 | Subway Nodes | not listed
9 | Police Stations | Updated February 2007
10 | Orthophoto | 2003
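The currency of each layer can be made explicit by computing its age relative to the year of this assessment (2008). The sketch below uses the years from the table above and is only a rough illustration. Treating layer 2 as 2000 (the census year of its base data) and treating the "ca." dates as exact years are assumptions, and the names `layer_years` and `age_in_years` are illustrative.

```python
ASSESSMENT_YEAR = 2008  # year of this assignment

# Year per layer number, from the time-period table; None means "not listed".
# Layer 2 is assumed to date from 2000, the census year of its base data.
layer_years = {
    1: 2003, 2: 2000, 3: 2005, 4: None, 5: 2005,
    6: 1999, 7: 2001, 8: None, 9: 2007, 10: 2003,
}


def age_in_years(year, reference=ASSESSMENT_YEAR):
    """Age of a data set in years, or None when no date is listed."""
    return None if year is None else reference - year


ages = {layer: age_in_years(year) for layer, year in layer_years.items()}
dated = {layer: age for layer, age in ages.items() if age is not None}
oldest_layer = max(dated, key=dated.get)

print(ages)
print(f"Oldest dated layer: {oldest_layer} ({dated[oldest_layer]} years old)")
```

Layers with no listed date stay as `None` rather than being given a guessed year, so they are flagged as unknown instead of appearing current.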
9. Attribute accuracy: provide a qualitative assessment of attribute accuracy for critical attribute
items (e.g., land use codes, street names and address ranges, school names, etc). How adequate
is the attribute information for your project needs?
The critical attributes were present in all layers. In the above data quality assessment, the issue
was not a lack of attributes, but rather the inadequate positional accuracy of several of the data
sets. When developing a project or analysis, it is therefore important to keep in mind that
positional discrepancies between data sets are quite common and should be noted.