Betsy McDonald
UEP 232 – Intro to GIS
Assignment 4 – GIS data quality assessment
October 18, 2013

Introduction:

For this assignment, I will be evaluating data sets for Watertown, MA in light of growing concern over the future impacts of climate change. In particular, city officials want to assess Watertown’s emergency preparedness plans for an extreme weather event (flood, hurricane, tornado, extreme heat, etc.). On the assumption that an event could happen at any time (and that such events will happen more frequently in a changing climate) and that the existing plans have not been revisited recently, the City has decided to reevaluate and update its response plans. As a result, my firm has been hired to help the City of Watertown reexamine its emergency preparedness. In particular, the following factors will be considered: street centerlines, city hydrography, open space, public and private schools, fire and police stations, and grocery store locations. The analysis below assesses the accuracy of data sets for those factors that would help in assessing emergency preparedness plans. For this particular analysis, street centerlines, hydrography, and the locations of schools and fire and police stations need the most accuracy, while open space and grocery locations can tolerate a greater degree of inaccuracy. All of the figures included in the analysis use the Imagery data set as a basemap.

Required Data Layer: Roads

Roads and street centerlines are critical to providing prompt and accurate emergency response (and, if need be, evacuation) routes. The first data set used in this analysis is the Census2010 TIGER roads data set available via the MassGIS folder on the Tufts GIS data server. The TIGER centerlines are displayed in the images below as dark red lines. The second data set used is ESRI Streetmaps (the streets file from ESRIDataMap10), also available via the Tufts GIS data server. The ESRI centerlines are displayed as green lines in the images below.
The metadata did not list specific values for positional accuracy, and therefore I relied on visual comparisons against the underlying orthophoto. At first glance, the two sets of street centerlines appear to be fairly consistent in many places. Figure 1 illustrates that both centerlines are almost exactly lined up along Mount Auburn Street (the green overlays the red so closely that the red line almost isn’t visible). This image also displays good street connectivity, another important factor for storm and extreme weather preparedness and response planning.

Figure 1. ESRI streets in green, TIGER streets in red

While many of the streets in the two datasets align very closely with each other and with the centerlines visible in the aerial image, there are points that were not so consistently aligned. As seen in Figure 2 below, there are at least two distinct locations for which both datasets draw roads on top of buildings. As well, Figure 2 demonstrates where the TIGER data drew a road where there was not supposed to be one.

Figure 2. ESRI streets in green, TIGER streets in red

The centerlines had a range of consistencies in some of the more complicated intersections, including Watertown Square. As seen in Figure 3 (Watertown Square), the TIGER streets meet at a central point and break off from that one location (circled in red), whereas the ESRI streets intersect/connect in various places throughout the square (circled in green). Following both data sets from Watertown Square down Charles River Rd, the centerlines vary by as much as 6 meters (approximately 20 feet). To the north and east of Charles River Road, along North Beacon Street, Arsenal Street, and Riverside Street, the centerlines align very closely. As well, while the TIGER visualization of Charles River appears to match the aerial centerline more closely, the ESRI visualization of Mount Auburn Street is closer to the aerial centerline.
This snapshot of a complicated intersection demonstrates a range of variability in accuracy between the data sets.

Figure 3. ESRI streets in green, TIGER streets in red

The TIGER street data has a greater degree of detail/completeness (it contains driveways and some marginal streets), whereas the ESRI dataset does not contain these details. Both data sets have similar currency, both have degrees of positional inaccuracy in complicated street intersections but are otherwise generally consistent, and both seem to have critical attributes included. However, the higher level of detail in the TIGER data would lead me to choose the TIGER data set over the ESRI data set. The TIGER street data placed one street where it does not really exist (as noted above), and this inaccuracy should definitely be noted for purposes of this project, but it should not outweigh the depth of detail provided by the dataset.

Required Data Layer: Hydrography

Should there be an extreme weather event that impacts water levels in the town, it would be important to know where major water bodies are located in order to assess potential flood risk. To that end, I am using two different data sets for hydrography and comparing them to each other, as well as to the base orthophoto imagery, to determine suitability for this analysis.

Part of the hydrography analysis is based on data from the MassGIS hydro25Kpoly, which was last updated in March of 2010. The Hydro25KPoly polygons are represented in Figures 4, 5, and 6 with a dark blue outline. Per the metadata available from MassGIS, the layer is an enhanced version of the 1:25,000 USGS Hydrography data layer (indicating accuracy of approximately +/- 40 feet). Among other hydrographic features, the data layer contains information on surface water, wetlands, bogs, flats, rivers, and streams.
The type/description of the water source is an important attribute in terms of preparedness planning, but positional accuracy of the data would perhaps be more important. While the data set generally identifies most of the water bodies in the city, they are rough outlines and often appear to be located slightly to the northwest of where the water body appears on the orthophoto. In Figure 4 below, the hydro25Kpoly data has identified a pond, but the southern boundary is approximately 16 meters (or approximately 52 feet) shy of the water boundary as it is portrayed in the orthophoto. The northwest portion of the hydro25Kpoly shape also extends onto what appears to be land by about the same distance.

Figure 4. Dark blue outline is from MassGIS, light blue fill is from the City of Watertown

The second data source is the water_parcels datalayer from the City of Watertown (accessed on the Tufts GIS data server). When this datalayer was pulled into the frame, I received an error message indicating that the data source was missing spatial reference information and could not be projected, which is important to note in terms of accuracy for this project. For the project, I would need to reach out to the GIS team at the city level to determine spatial reference information. Absent this information, and also without access to metadata for this layer, the best assessment that I can make on this data is a visual assessment comparing it to both the orthophoto image and the data provided by MassGIS.

Many of the smaller water bodies visible in the orthophoto are not identified by the City of Watertown data. As an example, Figure 5 highlights one such area, Sawins Pond. The MassGIS Hydro25K data set roughly outlined this area, but the City of Watertown water_parcels data did not identify it, rendering the water_parcels data less complete.

Figure 5.
Dark blue outline is from MassGIS, light blue fill is from the City of Watertown

Figure 6 shows different visualizations of the Charles River. While the hydro25k poly seems to place some water on what actually appears to be land, it also extends much more broadly than the Watertown water_parcels data.

Figure 6. Dark blue outline is from MassGIS, light blue fill is from the City of Watertown

Overall, while there are definitely inaccuracies with the Hydro25K poly data, I think it would still be a better fit for this particular project. Without metadata for the water_parcels datalayer for the City of Watertown, I can’t assess currency or positional accuracy. I feel more comfortable moving forward with the MassGIS hydro25k data set: it at least identifies the smaller water bodies to some degree, and although its attribute data is somewhat limited, it appears to have more data and to be a more complete source of hydrography data than the Watertown water_parcels data set.

Optional Data Layer 1: Open Space

To assess the adequacy of open space data in Watertown, I used the MassGIS OpenSpace poly. The metadata indicates that the data layer’s metadata was last updated in 2012. Positional accuracy metadata was not available. The data layer description provided by MassGIS notes that while the OpenSpace layer is generally useful for statewide and regional planning purposes, it does not include attribute information such as legal records of ownership, and the parcel images are not actually based on property surveys. Rather, the metadata indicates that data is collected by volunteers, and that accuracy of data can be variable. It does contain an attribute labeled “fee_owner”, which may be a somewhat reliable designation of who owns the property, but might not be entirely accurate per the note in the item description stating that the data layer does not identify or verify legal ownership.
The open space layer does, however, include information about whether or not the parcels are accessible to the public. This would be an important attribute in identifying how the land could be used in an emergency.

Based on a visual comparison to the orthophoto basemap and my working knowledge of the area as a former resident, the OpenSpace data layer appears to be fairly complete. Large parcels, including parks, golf courses, and open fields, are generally covered by the data layer. That being said, it incorrectly identifies several buildings as open space. In Figure 7a, which is just the aerial view without the OpenSpace overlay, you can see several buildings enclosed in a red box. In Figure 7b, the OpenSpace layer designates that land area as open space, when clearly it is not. A similar visual is presented in Figures 8a and 8b.

Figure 7a. Figure 7b.
Red box in Figure 7a highlights some of the physical structures lumped into the “open space” designation in Figure 7b.

Figure 8a. Figure 8b.
Red boxes in Figure 8a highlight some of the physical structures lumped into the “open space” designation in Figure 8b.

Depending on the planned use for open space, the incorrect inclusion of a limited number of buildings in open space data will not likely be a huge issue, but the analysis could benefit from a city-level verification of open space parcels.

Optional Data Layer 2: Schools

School locations would also be important in terms of emergency preparedness. Responders (and residents/family members) will need to be able to locate where students and teachers are at the time of an extreme weather event. Properly identified schools could also possibly serve as safe shelter locations or other public meeting spaces if property elsewhere in town has been destroyed or is without electricity.

To identify the location of schools in Watertown, I used the MassGIS Schools_Pt data layer.
According to the metadata, the point data layer was originally established by the MA Department of Environmental Protection’s GIS program with data from the MA Department of Education. As described on the MassGIS website, the DEP’s horizontal accuracy goal for point locations is +/-16.4 feet (although the website emphasizes that this is a goal and not a standard, and that most DEP point locations are within +/-100 feet). In general, either of these standards would be sufficient for purposes of this project – we care that the school exists in the general location (and that if someone was in front of it, they could spot it visually). The data layer was created in 2009 and last updated in 2010.

The data layer includes relevant attribute information for emergency planning purposes, including whether or not the school is public, private, charter, collaborative, or special education. It also includes mailing address, contact phone and fax numbers, as well as the name of the principal for the school. While some of the school principals’ names may have changed since 2010 (and probably should be verified anyway for emergency planning purposes), this information seems generally appropriate in terms of currency (addresses appear to be up to date). The data for public schools appears to be complete based on my own working knowledge of the Watertown school system and according to the Watertown Public Schools website.

For purposes of this project, the data layer seems to be sufficient in terms of positional accuracy with one exception: the Perkins School for the Blind is identified on the map as a single point, while there are multiple buildings on this campus that would need to be accounted for in an emergency situation. As seen in Figure 9 below, one building on the school’s campus is identified in the data layer with an orange triangle. The red arrows indicate other buildings on campus that are not explicitly identified by the data layer.
For emergency planning purposes, we would need to know which buildings belong to the school.

Figure 9.

Optional Data Layer 3: Fire and Police Stations

Fire and police organizations will play a critical role in responding to any emergency or extreme weather situation. For this part of the assessment, I used the POLICESTATIONS_PT_MEMA and the FIRESTATIONS_PT_MEMA datalayers from MassGIS. The data was created by the Massachusetts Emergency Management Agency (MEMA) GIS program, with the help of regional planning agencies and individual communities. The data layer’s metadata claims positional accuracy to the building at 1:5000 or better. However, the metadata also indicates that both police and fire station locations were last updated in MassGIS in 2007. The datalayer seems to be complete in that the total number of stations listed on the Watertown Police and Fire Department websites appears in the data. However, the positional accuracy is not entirely correct.

Figure 10.

While the Watertown Fire Department stations (three stations, with the primary location indicated in blue on the map above) are shown in the correct locations, the Watertown Police Department moved to a new location around 2010 and is no longer located where the data displays it. The station moved 0.8 miles northwest of the old location on Main Street. For emergency response planning purposes, this would be unacceptable: the data is neither current nor accurate, as it does not reflect the change in location. Additional attributes (the name of the Police Chief, for example) could also make the attribute table, and the overall value of the data set, more informative.
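The check described above, comparing a recorded station location against a verified one, can be sketched in a few lines of Python. This is a minimal illustration and not part of the original analysis: it assumes both points are already in a projected coordinate system measured in meters, and the function names, the coordinates, and the 100-foot tolerance are hypothetical choices made for the example.

```python
import math

METERS_PER_FOOT = 0.3048     # international foot
METERS_PER_MILE = 1609.344   # 5,280 international feet


def displacement_m(recorded_xy, verified_xy):
    """Straight-line distance in meters between two projected (x, y) points."""
    return math.hypot(verified_xy[0] - recorded_xy[0],
                      verified_xy[1] - recorded_xy[1])


def within_tolerance(recorded_xy, verified_xy, tolerance_ft):
    """True if the recorded point lies within tolerance_ft of the verified point."""
    return displacement_m(recorded_xy, verified_xy) <= tolerance_ft * METERS_PER_FOOT


# Hypothetical coordinates in meters, NOT taken from the MassGIS layers.
recorded = (0.0, 0.0)
verified = (0.8 * METERS_PER_MILE, 0.0)  # a point 0.8 miles from the recorded one

print(round(displacement_m(recorded, verified)))
print(within_tolerance(recorded, verified, 100))
```

A point that has moved 0.8 miles is roughly 1,287 meters from its recorded position, so it fails a 100-foot tolerance by a wide margin. The same two functions could be applied to any point layer once verified locations have been collected.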
Optional Data Layer 4: Grocery Stores and Convenience Stores

Finally, grocery and convenience stores could be an additional factor in emergency preparedness and response, and the correct identification of these facilities could play an important role in making sure that citizens have access to the food resources that they need in advance of or during an extreme weather event. To determine the locations of grocery and convenience stores within city limits, I pulled in X and Y coordinates from Reference USA. The data has not been geocoded against address points.

The data set contained relevant attribute information including address, primary name, description, longitude and latitude. It also noted whether the record type had been verified. However, I could not find information about currency for this data set, so some of the information may be out of date.

Based on a visual analysis, most of the major grocery stores were generally identified (some data points were in the street in front of the stores, but this level of accuracy is fine for this project, as a grocery/convenience store could be identified from the street). Although not exclusively a grocery store, a Target store was included in the data set. This particular Target does have a grocery section, and could be an important resource in terms of preparing for extreme weather situations and providing resources to people in need. The Reference USA data point for this Target seemed to be one of the least accurate placements. In Figure 11 below, the green triangle is where the Reference USA data point placed the store. The building surrounded by a red box is Target’s actual physical location. However, for purposes of this project, it seems reasonable that as long as the shopping plaza itself was identified, this level of inaccuracy would be tolerable (although not ideal).
A next step of the analysis would include geocoding against address points, since it would be interesting to see whether the address points were more accurate.

Figure 11.

In addition to these small positional inaccuracies, the data set also appears to be incomplete. One store in particular that I had expected to see was Russo’s Market, a well-known produce market in the city. It was not included in the Reference USA data. Figure 12 is a screenshot from Google Maps which verifies that Russo’s Market currently exists even though it was not included in the Reference USA data.

Figure 12. Russo’s Market is not included in the Reference USA data set.

With some of these inconsistencies in mind, a local survey of grocery resources would also be important to supplement the Reference USA data and would help make the analysis of food/grocery resources for emergency situations more complete.
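The completeness check described above, in which stores known from local knowledge are compared against the purchased data set, could be sketched as follows. This is only an illustrative outline under stated assumptions: the function names are hypothetical, the two lists are made-up stand-ins for a local survey and a data extract, and the matching is a simple name comparison rather than anything Reference USA provides.

```python
def normalize(name):
    """Lowercase a name and keep only letters and digits for loose matching."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def missing_places(expected_names, dataset_names):
    """Return the expected names that have no match in the data set."""
    present = {normalize(n) for n in dataset_names}
    return [n for n in expected_names if normalize(n) not in present]


# Hypothetical inputs: a hand-built survey list and names from a data extract.
survey = ["Russo's Market", "Target"]
extract = ["TARGET", "Example Grocery"]

print(missing_places(survey, extract))  # prints ["Russo's Market"]
```

Matching on normalized names ignores case, spacing, and punctuation, but it would still miss variants such as a store listed under a parent company name, so any name reported as missing should be confirmed by hand.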