Report - New Maps - University of Kentucky

Documenting and Digitalizing Health Inspection Scores of the Restaurants of
Lexington, KY: a collaborative project between University of Kentucky
Geography/GIS students and OpenLexington, a local organization advocating open
and transparent government data.
University of Kentucky students:
Dylan Powell (dy.pow311@gmail.com)
Preston Evans (evans.preston@gmail.com)
OpenLexington:
Chase Southard (chase.southard@gmail.com)
Chris Stieha (stieha@hotmail.com)
www.openlexington.org
Table of Contents
• Project Summary
• Needs Assessment Report
• Progress Report
• Data Dictionary
• Final Maps
• Conclusions
Project Summary
In collaborating with the local government data transparency group, OpenLexington, we
hoped to collect a fairly comprehensive catalogue of most of Lexington’s restaurants, their
corresponding health inspection scores, and additional data that would allow us to critically
analyze the collection and inspection methods of the Fayette County Health Department. Using
‘smartphones’ and free websites available to the public, we wanted to collect, organize, and
display this health inspection data in interesting and informative ways, and show that the current
methods the Fayette County Health Department uses to collect their health inspections could be
updated to more modern electronic methods with many more practical applications. Finally, we
wanted to release our collected and polished data to the public free-of-charge, in an effort to
show the many purposes ‘open data’ can serve.
Beginning with the initial data collection, we produced a template for a ‘smartphone’-based
health inspection form that focused on the name, address, latest health inspection score,
and key health violations for Lexington restaurants, along with the corresponding comments for
the individual violations. The data was gathered by the students of UKC101, an introductory
course in digital mapping at the University of Kentucky taught by Dr. Matthew Wilson. Once
the data was compiled, we worked to organize and standardize the data so it was usable by our
mapping software, and in the process, came to better understand the advantages and
disadvantages of technology-based primary data collection. Once the dataset was organized in a
uniform format, we used a combination of GIS software (ArcGIS by Esri) and free web-based
mapping applications (Geocommons by GeoIQ) to display the data in different ways, seeking to
find patterns between the scores, types of food served, average household income, and many
other factors.
Needs Assessment Report
1. Project Background Information
OpenLexington is a non-profit group, based in Lexington, which facilitates means of data
collection and organization outside of governmental control and taxation. Created in 2010 by
Chase Southard, Chris Stieha, and other like-minded individuals, their goal is to make such data
open to the public for use in developing applications and software that can then be freely
distributed to the average interested citizen. They have already worked together with local
government and citizen groups to broaden the amount of ‘open data’ in Lexington, and have
long-term goals of working with government departments and employees to both streamline data
collection and make the data open to general users. This data will eventually be presented to a
group of Lexington government officials, local GIS workers, and interested citizens at CityCamp
Lexington, an ‘unconference’ seeking to assess and find solutions for the large amounts of local
data that are still not readily available to the public.
2. Goals and Objectives
In the most general terms, our project will be looking at the topic of closed-data
collection practices in local government, and how best to work with the agencies to make the
data open to the public for use and easier to organize through digitalization of the data collection
process. In the case of our biggest project this semester, we will be helping OpenLexington
develop an application for smartphones that allows users to easily browse the health inspection
scores and violation numbers of local restaurants in the Lexington area. The application will be
open source and free to anyone interested.
This process starts with the collection of data via the application and website EpiCollect,
which we have used to create a data collection form that fits our needs and provides the
necessary fields for data entry. The form is designed by dragging and dropping entry fields that
can have either binary or multiple values, and labeling the entry fields as needed. We designed
the form to follow the format of an actual Fayette County Health Inspection form, with entry
fields for the names, addresses, scores, and violations of individual restaurants, and extra fields
provided for the comments associated with each violation. We realized that having an entry field
for every possible violation would be excessive and hard to maintain, so we chose the ‘key’
violations, which are those that require a follow-up inspection within a set amount of time to
make sure the violation has been fixed. After canvassing the restaurants of Lexington, the
collected data will be organized and polished into a user-friendly format, using Microsoft Excel
to produce a spreadsheet of the data. Finally, we will use GIS software to spatially represent this
data as clearly as possible, and the finished application and spatial data may be presented to the
local Health Department as an alternative to their current pen-and-paper methods of inspections.
3. Data Acquisition and Preparation Steps
All the data we will be collecting for the application will come from the primary source,
the posted inspections that are required to be visible to the public inside every restaurant. To
keep the data streamlined and avoid clutter, we will be gathering the names, addresses, type of
food served, most-recent health inspection scores, and violation types of the restaurants we visit.
The canvassing will be done in phases by small groups of students using the EpiCollect form we
have created, and the data will be immediately available to work with as soon as it is uploaded
from the EpiCollect Smartphone application. Once all the data is collected, we can import it into
Microsoft Excel, and then format it to be usable by ArcGIS.
4. List of Maps and Analyses
Our biggest task in the project will be the collection and organization of all the data we
acquire, as we hope to have many, if not most, of Lexington's restaurants catalogued with data
we can then use in ArcGIS. The categories of data that we will collect using the EpiCollect form
we created are the restaurant name, address, zip code and a list of the major violations. By the
end of the project, we will have used ArcGIS to create an interactive map that can display the
collected restaurant data with the ease of a click, and we hope to work with OpenLexington to
turn this kind of map into an application that could be used on-the-go with ‘smartphones’. This
is the eventual product that could be shown to the Health Department as a technological
alternative to their traditional paper health inspection forms.
5. Steps Required
A) Create the necessary EpiCollect form for review by OpenLexington (Preston)
B) Write the Needs Assessment Report (Dylan)
C) Organization of student groups for canvassing (Dylan)
D) Monitor data collection as it is happening to ensure quality data (Preston)
E) Create a Progress Report to show what has been done (Preston/Dylan)
F) Wrap up data collection
G) Format all collected data in Excel for use with ArcGIS (Preston/Dylan)
H) Use ArcGIS to produce a map that represents our data (Preston/Dylan)
PROGRESS REPORT
Completed Parts of Project:
• Needs and goals of community partners determined.
• EpiCollect form set up for data collection.
• Student volunteers divided into groups for data collection.
• Data collected and submitted through EpiCollect server.
• Data aggregated into an Excel spreadsheet.
Problems Preventing Completion
• Missing data fields on spreadsheet.
• Incorrect dates of last inspection.
• Records with missing latitude/longitude coordinates still need to be geocoded.
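A check along the following lines could flag the kinds of problems listed above before the spreadsheet goes into mapping software. This is only a minimal sketch: the column names and the sample rows are hypothetical stand-ins, not the actual headers or records of our spreadsheet.

```python
import csv
import io

# Hypothetical column names; the real spreadsheet headers may differ.
REQUIRED = ["Name", "Address", "Score", "Date"]
COORDS = ["latitude", "longitude"]


def find_problems(rows):
    """Return (row_index, problem) pairs for incomplete records."""
    problems = []
    for i, row in enumerate(rows):
        for col in REQUIRED:
            if not (row.get(col) or "").strip():
                problems.append((i, f"missing {col}"))
        # A record without both coordinates cannot be mapped until geocoded.
        if any(not (row.get(c) or "").strip() for c in COORDS):
            problems.append((i, "needs geocoding"))
    return problems


# Made-up sample rows, for illustration only.
SAMPLE = """Name,Address,Score,Date,latitude,longitude
Example Diner,123 Main St,95,3/1/2012,38.04,-84.50
Sample Cafe,,88,2/14/2012,,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
problems = find_problems(rows)
for index, problem in problems:
    print(index, problem)
```

Listing the problems by row, rather than dropping incomplete rows, keeps every record available for follow-up with the student groups who collected it.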
MID-PROJECT MEETING
During the mid-project meeting, we met with Chase Southard and Chris Stieha and discussed the
current state of the project. We talked about our original goals with the project, and made sure
everyone was still thinking along the same lines. We agreed that the primary goal was to get the
collected data completely organized so that we could start importing it into ArcGIS and creating
maps, and also so we could put it on the internet and truly make it ‘open data’. We also talked
about interesting visual and spatial representations we could use to display the data, and assessed
whether these goals were manageable.
CHANGES TO PROJECT GOALS
After the mid-project meeting, we didn’t make any large changes to our original goals of creating
a refined and polished dataset, and representing the data in interesting ways that could
possibly be used to convince others of the value of ‘open data’.
Data Dictionary
3/6/12---Cleaned up addresses, corrected capitalization, made street format uniform (Dylan)
3/6/12---Corrected false addresses using Google Maps, filled in info for missing addresses (Dylan)
3/8/12---Formatted dates as much as I could. There are multiple cases of clearly wrong dates (e.g. 4/7/4725) and
missing data (e.g. 3/ /2012); we’ll need to follow up on these. (Dylan) (Dr. Wilson thinks the students may
simply have entered the date in the wrong order, so Excel can’t recognize it, since the date was a simple text input
box on the EpiCollect form. We can email the TAs for specific student groups, since they should all have
hand-written notes.)
3/20/12---Corrected names of restaurants and general misspellings (Dylan)
3/22/12--- Cleaned up the violations. Added N/A to entries that were not selected. Changed Null values to
no plus the violation number- (Preston)
3/27/12--- Cleaned up the violations. Added N/A to entries that were not selected. Changed Null values to
no plus the violation number, Cleaned up type of restaurant column, Cleaned up the rest of the dates, Put
N/A on dates that I could not discern from the photo or where the photo did not exist.
4/2/12--- Simplified the types of restaurants into bigger categories to allow easier filtering and
symbolizing, changed violation assertions from “Yes01” format to “yes” format. This allows easier use
with Geocommons, and less confusion in the data. (Dylan)
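Two of the cleanup steps logged above, flagging unusable dates and simplifying the violation entries, were done by hand in Excel. The sketch below shows how they might be automated. It is an illustration rather than the procedure we actually followed: the function names are hypothetical, and the accepted year range (2000 to 2012) is an assumption chosen only to catch entries such as 4/7/4725.

```python
import re
from datetime import date


def clean_date(text, min_year=2000, max_year=2012):
    """Parse a month/day/year string; return a date, or None if unusable.

    The year bounds are assumptions, not values taken from the dataset.
    """
    match = re.fullmatch(
        r"\s*(\d{1,2})\s*/\s*(\d{1,2})\s*/\s*(\d{4})\s*", text or ""
    )
    if not match:
        return None  # e.g. "3/ /2012" has no day
    month, day, year = (int(group) for group in match.groups())
    if not (min_year <= year <= max_year):
        return None  # e.g. 4/7/4725
    try:
        return date(year, month, day)
    except ValueError:
        return None  # impossible calendar date, e.g. month 13


def clean_violation(value):
    """Turn 'Yes01'-style entries into 'yes' or 'no'; leave others as-is."""
    match = re.fullmatch(r"(yes|no)\d*", (value or "").strip(), re.IGNORECASE)
    return match.group(1).lower() if match else value


print(clean_date("3/6/2012"), clean_date("4/7/4725"), clean_date("3/ /2012"))
print(clean_violation("Yes01"), clean_violation("no12"), clean_violation("N/A"))
```

Dates that come back as None still need a manual check against the photo of the inspection form or the students' hand-written notes. The sketch does not try to guess whether the month and day were swapped.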
KEY TERMS AND VIOLATIONS FOR DATASET
-latitude/longitude/altitude= The respective spatial data for the restaurant.
-photo= Link to the photo of the health inspection form of the associated restaurant (some links are dead).
-Name/Address/Zip/City/State= Name of the restaurant and its address in Lexington, KY.
-Score/Date= The health inspection score the restaurant received and the date that it was last inspected.
-Type/Sub-Type= The general type of food served at the restaurant, and a more specific description of the
food if applicable/available.
-Vio##/Vio##Desc= The key violations we analyzed (‘yes’ if violated, ‘no’ if not), and the description of
the specific violation at the restaurant.
-Ageofformdays= The number of days that have passed since the restaurant received its last health
inspection.
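As an illustration of how a field like Ageofformdays can be derived, the sketch below subtracts the inspection date from a reference date. The function name and the reference date are assumptions made for this example; the reference date used for the values in the dataset is not recorded here.

```python
from datetime import date


def age_of_form_days(inspection_date, as_of):
    """Days elapsed between the last inspection and the reference date."""
    return (as_of - inspection_date).days


# Made-up inspection date and an assumed reference date, for illustration.
example_age = age_of_form_days(date(2012, 1, 15), as_of=date(2012, 4, 2))
print(example_age)
```

Because the result depends on the reference date, the field has to be recomputed whenever the dataset is refreshed.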
Sample Fayette County Health Inspection on next page.
-Critical Violations (the ones we collected data for) are listed in bold text. These include:
Violation 1: SOURCE, CONDITION, NO SPOILAGE
Violation 3: POTENTIALLY HAZARDOUS FOOD – SAFE TEMPERATURE
Violation 4: FACILITIES TO MAINTAIN PRODUCT TEMPERATURE
Violation 7: POTENTIALLY HAZARDOUS FOOD NOT RE-SERVED
Violation 11: PERSONNEL WITH INFECTIONS RESTRICTED
Violation 12: HANDS WASHED AND CLEAN, HYGIENIC PRACTICES
Violation 20: SANITIZATION RINSE CLEAN, EQUIPMENT AND UTENSILS SANITIZED
Violation 27: WATER SOURCE, SAFE, HOT AND COLD UNDER PRESSURE
Violation 28: SEWAGE AND WASTE DISPOSAL
Violation 30: CROSS CONNECTION, BACK SIPHONAGE, BACKFLOW OF PLUMBING
Violation 31: NUMBER AND ACCESSIBILITY OF TOILET AND HANDWASHING FACILITIES
Violation 35: INSECTS/RODENTS, OUTER OPENINGS PROTECTED, NO ANIMALS (BIRDS/TURTLES)
Violation 41: TOXIC ITEMS PROPERLY STORED, LABELED, USED
-The corresponding number of points the violation is worth can be found to the right of the violation
itself.
-The overall score is listed in red or green in large print on the form, green implying an acceptable score,
and red indicating multiple critical violations.
-To provide open access to the EpiCollect form we created, the direct link
http://epicollectserver.appspot.com/project.html?name=LexHealthProject will take you to the EpiCollect
page for our form. Log in using openlexingtonproject@gmail.com as the username, and ge0graphy
(with a zero instead of an “O”) as the password.
-Finally, the dataset has been uploaded to www.geocommons.com, where it can be found by searching for
“LexHealthScores” or using key search terms such as Lexington/health/inspection/scores.
-Direct link to Geocommons map of data: http://geocommons.com/maps/161675#
-The map is open for editing to the public, and can be filtered by any combination of desired filters (e.g.
Type, Score, Name, Address, Violation)
-A few of our maps use a geostatistical interpolation technique called ‘kriging’. This technique
estimates values of certain characteristics (in our maps, inspection score and date of last inspection) at
unsampled locations, based on the values at known locations.
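Kriging itself involves fitting a variogram model and solving a system of equations, which our GIS software handles and which is too long to show here. To illustrate only the underlying idea of estimating a value at an unknown location from nearby known values, the sketch below uses inverse distance weighting, a simpler method that is not kriging. The coordinates and scores are made up, and treating coordinates as flat x/y values is a simplifying assumption.

```python
import math


def idw_estimate(known, x, y, power=2.0):
    """Estimate a value at (x, y) from (x, y, value) samples.

    Uses inverse distance weighting: closer samples count for more.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for kx, ky, value in known:
        distance = math.hypot(x - kx, y - ky)
        if distance == 0.0:
            return float(value)  # exactly on a sample point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no known points supplied")
    return weighted_sum / weight_total


# Made-up scores at made-up planar coordinates, for illustration only.
samples = [(0.0, 0.0, 90.0), (2.0, 0.0, 100.0)]
midpoint_estimate = idw_estimate(samples, 1.0, 0.0)
print(midpoint_estimate)
```

Unlike kriging, this method does not model how similarity between scores falls off with distance and gives no estimate of uncertainty, so it should be read as a conceptual aid only.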
OUTPUTS
The following images are the maps we have produced using the dataset we collected.
The titles indicate the values that are being compared in the map, and legends and scales for each
map have been provided.
We have also created an online map using the dataset through Geocommons, a free web-based
mapping tool and database for spatial data. The map is open to editing, but can also be
embedded on a website by using the following embed code:
<style>#geocommons_map_161675 {width: 100%; height: 400px; position:relative;}</style>
<div class="geocommons_map" id="geocommons_map_161675"></div><br/>
<a class="geocommons_map_link" id="geocommons_map_161675_link"
href="http://geocommons.com/maps/161675">View map on GeoCommons</a>
<script type="text/javascript" charset="utf-8"
src="http://geocommons.com/javascripts/f1.api.js"></script>
<script type="text/javascript" charset="utf-8">
var geocommons_map_161675 = new F1.Maker.Map({map_id: "161675", dom_id:
"geocommons_map_161675"});
</script>
Conclusion
Through the process of completing a GIS-based project, from the primary data collection
to the final maps created using the data we acquired, we have learned many things about the
advantages and disadvantages of electronic primary data collection, and in specific regard to our
data, about the spatial and temporal patterns of local health inspections of restaurants in
Lexington, Kentucky. In terms of electronic primary data collection, we have learned that data
which has traditionally been collected and filed non-electronically can often be collected and
made readily available to the public with greater ease if it is collected using some type of electronic
form. Not only are these forms fairly simple to produce, but they also have the potential to be
altered for specific purposes and uses, unlike paper copies. When the data has been collected
and entered onto the form, it can be immediately uploaded to a server that can collect and
organize the data automatically, and then store the data on a public server that can be accessed by
anyone with a computer and internet connection. This makes the data more available for use to
not only the government employees in the department that the data concerns, but also to
interested local citizens who can use the data, either in computer or phone applications or various
visual representations for their own purposes. In its current non-electronic form, the same data
must be acquired through the department that collected the data using pen-and-paper methods,
and often is found in a very rudimentary and inflexible data format that cannot be readily used by
either employees or citizens. In addition, the data is only ‘easily’ available to those who can
physically go to the department, which is often in Frankfort, Kentucky, making it very difficult
for those who don’t live relatively close or lack sufficient transportation.
One of our biggest goals with this project is to show how 'open data' can empower
average citizens to be able to create useful and interesting things, simply by giving citizens easily
available access to datasets like ours. By making maps with our data, we were able to find many
interesting patterns in Lexington's restaurants and neighborhoods. We compared aspects such as
average household income of a census tract, types of restaurant most prevalent in an area,
densities of restaurants and their associated scores, and many other factors. Using the free
web-based tool Geocommons, we were able to upload our dataset to the internet and create a
user-friendly, interactive map that allows average people to filter the restaurants of Lexington by
many different factors, such as score, individual violations, and even by how much time has
elapsed since each restaurant's last inspection. Finally, we presented all of our methodology, data,
and outputs to interested citizens and government workers at CityCamp Lexington, a local
'unconference' to encourage making Lexington government data more readily available to people
who can use the data in software for the benefit of the public.