Nnenna Okoye Data Accuracy

Nnenna Okoye
A Visitor’s Guide to Getting to the 44th Head of the Charles Regatta High School Alumni
1. I have been invited to attend my high school reunion taking place at the Charles
Regatta, the world’s largest rowing event. Since 1965, rowers have come from around the
world to the banks of the Charles River to compete for the gold and/or the experience.
The high school reunion will take place at a large tent called “Reunion Village” that offers
breakfast and lunch concessions, a dining tent and a big screen TV to watch live racing
footage. Here are the reasons why I think accuracy of data is important:
a. Supporting spectators need to know the location of the river’s viewing sites and
how far away each viewing site is from an MBTA stop.
b. It would be very important for an event coordinator to distribute the most accurate
maps showing how visitors can navigate their way from an MBTA stop to
Reunion Village.
Although positional accuracy is not a life-threatening issue, clear directions to river sites
and reunion tents are important to ensure a great, relaxing event for Bostonian and foreign
visitors. (Nobody likes getting lost…) For many visitors, this is one of the few chances, or the
only chance, they will have to support their friends or loved ones participating in the Regatta,
so it’s important that visitors are equipped with the best directions and don’t
needlessly develop anxiety over bad directions.
2. When compared to ArcGIS Online imagery, the street dataset that aligns most accurately
with the ortho imagery is Cambridge’s database. (Please see image below.)
Judging from the accuracy of the alignment to the ortho imagery, it appears that the
Cambridge GIS road centerlines data (the orange lines) was created according to the
ortho imagery. The Census data (the pale blue lines) is about 69 feet off from the
centerlines of the ortho streets, so its centerlines actually intersect the middle of the
buildings. The Census data also makes it erroneously appear as if there are street lines
located within the Charles River, and I don’t think they expect people to walk on water.
There were also some instances in which Census data led directly through buildings, and
much of the Streetmap USA data (in red) was almost perfectly aligned to the Census data.
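As a rough sketch of how an offset like the one described above could be measured rather than eyeballed, the Python below computes the shortest distance from each vertex of a test line to a reference centerline. It assumes both layers are already in the same projected coordinate system with units of feet. The coordinates are made-up illustration values (not taken from the actual layers), and the function names are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # Degenerate segment: a and b are the same point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def offsets_to_reference(test_vertices, reference_line):
    """Distance from each test vertex to the nearest reference segment."""
    segments = list(zip(reference_line, reference_line[1:]))
    return [
        min(point_segment_distance(v, a, b) for a, b in segments)
        for v in test_vertices
    ]

# Hypothetical coordinates in feet (illustration only).
reference_centerline = [(0.0, 0.0), (500.0, 0.0), (1000.0, 0.0)]
census_vertices = [(0.0, 62.0), (500.0, 69.0), (1000.0, 82.0)]

offsets = offsets_to_reference(census_vertices, reference_centerline)
print("min offset (ft):", min(offsets))
print("max offset (ft):", max(offsets))
```

Reporting the minimum and maximum offsets gives a range for the misalignment instead of a single estimate.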
3.) When compared to ArcGIS Online imagery, the most accurate data source is Boston’s rather
than Cambridge’s. I concluded this because the Boston hydrography shapefile extends farther
than the Cambridge shapefile. So, if percentage of coverage were the only criterion for a
shapefile, the Boston hydrography layer would likely win. But in my particular scenario, this
should not be the only criterion because greater coverage is not really necessary for my project:
the Charles Regatta covers only a short river distance that is covered in both the Boston and
Cambridge data.
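To illustrate the coverage argument above, here is a minimal sketch that asks how much of a study area each layer covers, instead of asking which layer extends farther. It compares bounding boxes only, which is a simplification, and all extents are hypothetical numbers rather than the real layer extents.

```python
def overlap_fraction(study, layer):
    """Fraction of the study box covered by the layer box.

    Boxes are (xmin, ymin, xmax, ymax) in the same planar units.
    """
    sx0, sy0, sx1, sy1 = study
    lx0, ly0, lx1, ly1 = layer
    width = min(sx1, lx1) - max(sx0, lx0)
    height = min(sy1, ly1) - max(sy0, ly0)
    if width <= 0 or height <= 0:
        return 0.0  # The boxes do not overlap.
    return (width * height) / ((sx1 - sx0) * (sy1 - sy0))

# Hypothetical extents (illustration only).
regatta_course = (0.0, 0.0, 10.0, 4.0)
boston_hydro = (-20.0, -5.0, 30.0, 10.0)    # larger extent
cambridge_hydro = (-2.0, -1.0, 12.0, 5.0)   # smaller extent

boston_cover = overlap_fraction(regatta_course, boston_hydro)
cambridge_cover = overlap_fraction(regatta_course, cambridge_hydro)
print(boston_cover, cambridge_cover)
```

With these made-up extents both layers cover the whole study area, so the larger layer has no advantage for this particular project.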
In the picture above, I tried to provide a visual assessment of my relevant data layers. I decided
to focus on the street layers because the parcel layers that I used didn’t really relate to my
Regatta map. As the yellow lines showed, the lines that are supposed to represent road
centerlines do not align with each other. As in the previous picture, the thick, pale blue lines
represent the Census roads, the thin green lines represent a subset of the Census layer that
delineates the major roads in Cambridge, and the orange line is the “RoadCenterLines” layer from
the City of Cambridge. The Census lines were about 62-82 feet away from the road centerlines
visible in the aerial photo. Defining the positional accuracy of the MBTA station is tricky because
the station covers such a large space: the actual main entrance of the MBTA station is the yellow
dot, although the GIS MBTA layer placed the station at the red dot nearby.
5.) The Census street lines (pale blue) run through many buildings. As an example,
the pink box shows a line that runs through the side of a building. The MBTA stations
appear to be in relatively correct positions. I chose a City Boundary Layer, and a Boston or
Cambridge parcel layer for my other three additional layers that all showed the outlines of both
6.) The MBTA layer and the hydrography layers are actually very relevant to my project since it
pertains to visitors getting from the MBTA station to the river bank. I don’t think the City
Boundary Layers were useful layers. I originally included them because I have used them for all
the other map projects we did, and I formed the assumption that I would use them for every
project. But this assignment made me realize that I should think more carefully about which
layers I select to add to my map, because even “staple layers” such as parcel information are not
always relevant.
7.) The attribute table for the MBTA stations is an example of a table that is complete enough
for this project but might not offer sufficient information for other projects. See the table.
For this specific project, this map showed all the information I needed: the name of the station
and which line it was on.
The Census layer was interesting because at first it appeared that its attribute table was
incomplete, as shown in the figure below. But then I recalled that the
“CENSUS2000TIGER_ARC” layer is actually subdivided into four layers.
Thus, one of the subsets of this main Census layer, called “Road Source” (shown in orange in
previous pictures), detailed the streets that were missing in the other layers. So in this case,
while the “main attribute table” appeared to miss some minor details, it seems that the data
authors might have intentionally disaggregated the street information into four layers to
distinguish major roads
from sidewalks, alleys or.
8.) There were two ways to look up the timing of publication: via the metadata or via the file
title. While the file title may not be the most accurate way to determine the publication date, it
did allow me to assign a certain time bracket in which the layer was created. For the MBTA layer,
the metadata does explicitly state how current the material is and who created the data: Central
Transportation Planning Staff, in January 2006. The City boundary file for Cambridge and the
Parcel Data from Boston, however, did not have complete metadata, and neither included the
publication time or
author. The data file called “CENSUS2000TIGERHYDRO_POLY” also did not have any
information associated with it. The Cambridge Parcel metadata did not include this information
either, but the layer file was called FY2006, so I could assume it was made in 2006. The MassGIS
metadata was created in February 2003.
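The metadata check described above could be organized as a small audit table. The sketch below is only an illustration: the field names ("author", "date") and the layer labels are hypothetical, and the values simply restate what was reported above for four of the layers.

```python
REQUIRED_FIELDS = ("author", "date")

def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required metadata fields that are absent or empty."""
    return [field for field in required if not record.get(field)]

# Values restate the findings above; labels and keys are hypothetical.
layers = {
    "MBTA stations": {
        "author": "Central Transportation Planning Staff",
        "date": "January 2006",
    },
    "Cambridge city boundary": {"author": None, "date": None},
    "Boston parcels": {"author": None, "date": None},
    "CENSUS2000TIGERHYDRO_POLY": {},
}

report = {name: missing_fields(meta) for name, meta in layers.items()}
incomplete = sorted(name for name, missing in report.items() if missing)

for name, missing in report.items():
    status = "complete" if not missing else "missing " + ", ".join(missing)
    print(f"{name}: {status}")
```

Listing the gaps per layer makes it easier to see which layers need a fallback, such as inferring a date from the file name.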
The attribute items of my street and MBTA station layers were clear and quite comprehensive,
especially since the Census layer subdivided its information into major streets, road
centerlines and small neighborhood alleys. However, the water layer did not offer information
that I could clearly translate into relevant data. This is perhaps because my knowledge of how to
use this layer is quite poor. I didn’t really understand, for example, why each section of the
river had an object ID number associated with it, especially since it’s one river. But perhaps it
allows people to measure the area of one particular section they are interested in.
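As a small illustration of that last guess, the sketch below shows how separate object IDs would let someone compute the area of a single river section, or of the whole river, using the shoelace formula. The polygons are hypothetical rectangles in planar feet, not the real hydrography features, and the names are made up for the example.

```python
def polygon_area(ring):
    """Area of a simple polygon from its vertices (shoelace formula)."""
    total = 0.0
    # Pair each vertex with the next one, wrapping back to the start.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Hypothetical river sections keyed by object ID (coordinates in feet).
river_sections = {
    1: [(0.0, 0.0), (100.0, 0.0), (100.0, 40.0), (0.0, 40.0)],
    2: [(100.0, 0.0), (250.0, 0.0), (250.0, 40.0), (100.0, 40.0)],
}

areas = {oid: polygon_area(ring) for oid, ring in river_sections.items()}
total_area = sum(areas.values())

print("area of section 2 (sq ft):", areas[2])
print("area of whole river (sq ft):", total_area)
```

Keeping one record per section means a single section can be queried on its own, while the whole river is still available as the sum.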