Karen Shilo
October 14, 2008
GIS Assignment Five
Barbara Parmenter

GIS Data Quality Assessment

Table of Data Layers:

| No. | Category | Layer | Time Period | Source |
|---|---|---|---|---|
| 1 | Roads | Census2000 Tiger Arc Data Set | Publication date: January 2003 | http://www.mass.gov/mgis/cen2000_tiger.htm |
| 2 | Roads | StreetMap USA | "Enhanced" from the 2000 census Tiger roads data | M: drive under \Country\USA\ESRIDataMaps906\streetmap_usa |
| 3 | Roads | Road Centerlines (MassGIS) | ca. 2005 | M:\state\ma\MassGIS\Infrastructre\EOTMAJROADS_ARC.shp |
| 4 | Hydrography | Hydrography from the Census TIGER data set | not listed on MassGIS website | M:\state\ma\MassGIS\Census\CENSUS2000TIGERHYDRO_POLY.shp |
| 5 | Hydrography | MassGIS Hydro 25k water bodies | ca. 2005 | M:\state\ma\MassGIS\Physical_Resources/HYDRO25K_Poly.shp |
| 6 | Land Use | MassGIS land use | ca. 1999 | S:\classes\UEP_ENV\MassGIS_landuse_layer_files\21Landuse_Solid_Color.lyr |
| 7 | Land Cover | National Land Cover Data (NLCD) | 2001 | S:\classes\UEP_ENV\NLCD landcover layer file\landcover.lyr |
| 8 | Subway Nodes | MassGIS Subway Nodes | not listed on MassGIS website | M:\state\ma\MassGIS\Infrastructre\MBTA_Node.shp |
| 9 | Police Stations | MassGIS Police Stations | Updated February 2007 | M:\state\ma\MassGIS\Infrastructre\PoliceStations_PT_MEMA.shp |
| 10 | Orthophoto | Cambridge Orthophoto Imagery | 2003 | M:\city\cambridge\cambridgeGIS\Imagery\orthophotos2003\cambridge_sid\band_1 |

1. Provide one-paragraph description of the project you are using as a benchmark to assess the data and what positional accuracy it will require (or what is good enough - think about how far off the position could be and still work for the project needs).

The impetus behind my project is to learn more about the urban transit system in Cambridge. I am using the following data layers: roads, hydrography, land use, land cover, subway nodes, and police stations (for their proximity to subway stops). The quality of each layer is assessed against the Cambridge orthophoto imagery, which serves as the benchmark.
The roads and hydrography should be positionally accurate to within 5 meters, since the roads provide the framework upon which the other data layers are built. Land cover and land use should also be accurate to within approximately 5 meters, as this distance could make the difference between pavement and grass, or decide whether a bench can be built next to a subway stop. The subway nodes (stations) need the highest positional accuracy, ideally an exact match with the orthophoto, because all distances will be measured relative to these Cambridge subway stations. The police stations should be accurate to within about 5 meters as well. Determining how far subway stops should be from police stations, on the other hand, would be a separate analysis involving response time, distance between subway stops, and possibly other variables.

2. Briefly discuss the three different road centerline data sets in terms of their positional relation to each other (look at how far apart they are at different points using the measure tool in ArcGIS, and if there is consistency in the differences). Include some graphic examples to illustrate your points. Which data set would be best for your project?

This GIS map, concentrated on the Kendall/MIT area, shows the relationships between the road centerline data sets. Three data sets are shown: Census 2000 Tiger Roads (in red), Enhanced Census Streetmap USA roads (in pink), and MassGIS road data (in blue). The blue MassGIS linework follows the roads in the Cambridge orthophoto quite accurately. The red and pink lines, interestingly, overlap each other and are both quite far off. For example, the distance between the MassGIS road and the Census Tiger road was measured in ArcMap to be 58 ft. One wonders, therefore, why the Enhanced Census Streetmap USA layer is titled "enhanced" if it appears to be the same as the Census 2000 Tiger Roads data layer.
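Because the road offset was measured in feet while the project tolerance is stated in meters, a small conversion makes the comparison explicit. The following is only an illustrative sketch: the function names are hypothetical, and it assumes the ArcMap measurement is in international feet.

```python
FEET_TO_METERS = 0.3048  # exact definition of the international foot


def to_meters(value, unit):
    """Convert a measured offset to meters; unit is 'ft' or 'm'."""
    if unit == "ft":
        return value * FEET_TO_METERS
    if unit == "m":
        return float(value)
    raise ValueError(f"unsupported unit: {unit!r}")


def within_tolerance(value, unit, tolerance_m=5.0):
    """Return True if the offset does not exceed the tolerance in meters."""
    return to_meters(value, unit) <= tolerance_m


# Offset between the MassGIS road and the Census Tiger road (58 ft).
tiger_offset_m = to_meters(58, "ft")
print(round(tiger_offset_m, 2))    # 17.68
print(within_tolerance(58, "ft"))  # False
```

At roughly 17.7 meters, the measured offset is more than three times the 5-meter tolerance set for roads.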
For these reasons, the MassGIS road data is the best fit for the project.

3. Do the same as above for the two hydrography layers.

Below is the Kendall/MIT area along with the Census Tiger Hydrography Layer. The Census Tiger Hydrography is off by approximately 72 meters where Memorial Drive passes under the Longfellow Bridge; the underpass is not visible at all in the Census Tiger Hydrography data set. Broadway Ave., which continues onto the Longfellow Bridge, is also cut off by the Charles River. The discrepancy can also be seen in the following map: an orthophoto showing MassGIS road data underneath a 30 percent transparent Census Tiger Hydrography Layer.

One can compare the Census Tiger Hydrography Layer to the MassGIS Hydro 25k Layer shown below.

In the MassGIS 25K Hydro Layer above, the same discrepancy appears. Both the Cambridge Parkway Connector and Main Street are completely covered by the MassGIS 25K Hydro Layer. However, the MassGIS 25K Hydro Layer is more accurate in that it does show the Broad Canal, the fingerlike water body extending into Cambridge. When measured in ArcMap, the Broad Canal in this data layer lies within approximately 5 meters of its position in the orthophoto.

4. Can you provide a quantitative assessment of positional accuracy for each of your data layers (e.g., +/- 20 feet)? Why or why not?
For the data layers explored above, I measured the following positional accuracy:

| No. | Category | Layer | Approximate Positional Accuracy Relative to Orthophoto |
|---|---|---|---|
| 1 | Roads | Census2000 Tiger Arc Data Set | 58 feet |
| 2 | Roads | StreetMap USA | 58 feet |
| 3 | Roads | Road Centerlines (MassGIS) | 0 feet (accurate alignment) |
| 4 | Hydrography | Hydrography from the Census TIGER data set | 72 meters |
| 5 | Hydrography | MassGIS Hydro 25k water bodies | 5 meters |

Beyond my own measurements, published positional accuracy figures for these data layers would likely be difficult to find, perhaps because publishing them would raise questions about accuracy that the organizations cannot fully answer. This makes it all the more important to compare data layers for inaccuracies oneself in any project.

5. Give a qualitative assessment of positional accuracy of each of the four optional layers relative to the other layers (e.g., do streets run through buildings? are schools in the correct location along a road?).

The following is a map featuring MassGIS Land Use along with the Cambridge Orthophoto Imagery and MassGIS Major Roads. Notice that the Commercial Land Use area (red) overlaps both Main and Ames streets. For this reason, the Land Use data appears to be shifted slightly to the southeast. Furthermore, the Industrial Land Use area overlaps the Broad Canal by approximately 11 meters.

The following map features the National Land Cover Data Layer from 2001 along with the Cambridge Orthophoto Imagery and MassGIS Major Roads. Medium Intensity Development is shown in light purple, while High Intensity Development is shown in dark purple. Notice that there are large discrepancies between the National Land Cover Data Layer from 2001 and the Cambridge Orthophoto Imagery.

Box 1: High Intensity Development cuts midway through a building.
Box 2: High Intensity Development cuts through a circular plaza.
Box 3: High Intensity Development cuts through a parking lot.
Box 4: High Intensity Development cuts through major roads.

The National Land Cover Data Layer from 2001 appears to be off by roughly 20 to 60 meters.

The following map features MassGIS MBTA Subway Nodes along with the Cambridge Orthophoto Imagery and MassGIS Major Roads. All three subway stations have good positional accuracy: the nodes are located at their correct street locations. One can see this again below in a closer view of the Kendall/MIT subway station:

The Kendall/MIT node is in fact almost perfectly placed in this data layer. It is not in the middle of Main Street but on the sidewalk where the station is accessed, next to where cars are parked. This is exactly how the subway is reached in reality (and it is the station I use for work)!

The following is a map showing police stations and their proximity to the MassGIS MBTA Subway Nodes. It is interesting to see that the only two police stations in the area are located 286 and 153 meters from the Lechmere and Central subway stops, respectively. It should also be noted that Kendall/MIT does not have a police station in close proximity. The positional accuracy of these police stations can be viewed more closely on the following page:

The police station location in the MassGIS Police Stations Layer appears to be right on target, with the marker hovering directly over the building.

6. Are these optional layers appropriate for your project in terms of their positional accuracy?

Below is a table of the layers explored above.
| Category | Layer | Approximate Positional Accuracy Relative to Orthophoto |
|---|---|---|
| Land Use | MassGIS land use | ~11 meters |
| Land Cover | National Land Cover Data 2001 | ~20 to 60 meters |
| Subway Nodes | MassGIS Subway Nodes | Positionally accurate |
| Police Stations | MassGIS Police Stations | Positionally accurate |

Given the approximate positional accuracies listed above, the land use, subway node, and police station layers would be suitable for a project seeking to improve the urban transportation spaces in Cambridge. Since the National Land Cover Data layer can be between 20 and 60 meters off, I may want to use a different land cover data set if a more accurate one exists. Another option would be to augment this data with land surveys or other data published by the City of Cambridge. If these resources were unavailable, the National Land Cover Data layer could perhaps still be used for this project, provided the discrepancies in its positional accuracy were explained.

7. Completeness: Is each data set complete? (Does it cover the area in question, are all relevant features present, and is the attribute information complete for all features?)

Opening the attribute table for each of the following layers, I found the following:

| No. | Category | Attribute Information |
|---|---|---|
| 1 | Roads | All Present |
| 2 | Roads | All Present |
| 3 | Roads | All Present |
| 4 | Hydrography | All Present |
| 5 | Hydrography | All Present |
| 6 | Land Use | All Present |
| 7 | Land Cover | All Present |
| 8 | Subway Nodes | All Present |
| 9 | Police Stations | All Present |
| 10 | Orthophoto | All Present |

All elements were complete in the attribute tables of these data sets.

8. Currency: Are the data up to date? How do you know the answer to this?

I retrieved the publication dates in the table below largely from MassGIS and other data layer websites. The dates, ranging from about 1999 to 2007, give me the sense that the data sets are not updated very often, and it is rare to find a data set from the current year.
The oldest data sets, from about 1999 and 2001, could certainly raise questions in terms of data currency.

| No. | Category | Time Period |
|---|---|---|
| 1 | Roads | Publication date: January 2003 |
| 2 | Roads | "Enhanced" from the 2000 census Tiger roads data |
| 3 | Roads | ca. 2005 |
| 4 | Hydrography | not listed |
| 5 | Hydrography | ca. 2005 |
| 6 | Land Use | ca. 1999 |
| 7 | Land Cover | 2001 |
| 8 | Subway Nodes | not listed |
| 9 | Police Stations | Updated February 2007 |
| 10 | Orthophoto | 2003 |

9. Attribute accuracy: provide a qualitative assessment of attribute accuracy for critical attribute items (e.g., land use codes, street names and address ranges, school names, etc). How adequate is the attribute information for your project needs?

The critical attributes were present in all layers. In the above data quality assessment, the issue was not a lack of attributes but the inadequate positional accuracy of several of the data sets. When developing a project or analysis, it is therefore important to keep in mind that positional discrepancies between data sets are quite common and should be noted.
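As a rough illustration of the currency question, the years stated in the time-period table can be turned into approximate ages relative to the date of this assessment (October 2008). This is only a sketch: the dictionary keys and the `age_in_years` helper are names chosen here for illustration, "ca." years are treated as exact, and layers without a stated publication year (including StreetMap USA, which is described only by its 2000 census basis) are left undated.

```python
REPORT_YEAR = 2008  # this assessment is dated October 14, 2008

# Years as stated in the time-period table ("ca." values are approximate).
# None marks layers with no stated publication year.
layer_years = {
    "1 Roads (Census 2000 Tiger)": 2003,
    "2 Roads (StreetMap USA)": None,  # only described as enhanced from the 2000 census
    "3 Roads (MassGIS)": 2005,
    "4 Hydrography (Census Tiger)": None,
    "5 Hydrography (MassGIS 25k)": 2005,
    "6 Land Use": 1999,
    "7 Land Cover (NLCD)": 2001,
    "8 Subway Nodes": None,
    "9 Police Stations": 2007,
    "10 Orthophoto": 2003,
}


def age_in_years(year, reference_year=REPORT_YEAR):
    """Approximate age of a layer in whole years, or None if undated."""
    return None if year is None else reference_year - year


ages = {name: age_in_years(year) for name, year in layer_years.items()}

for name, age in ages.items():
    label = "no date listed" if age is None else f"approx. age in years: {age}"
    print(f"{name}: {label}")
```

Under these assumptions the land use layer is the oldest dated layer at about nine years, and the police stations layer is the most recent at about one year.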