DATA SCIENCE
MIS0855 | Spring 2016
Identifying Sources of Data
SungYong Um
sungyong.um@temple.edu

Data can come from anywhere

Where data comes from
  - Companies and other organizations
  - Services that compile data (e.g., Amazon, Yelp, KBB)
  - Government agencies

How it is accessed
  - Direct access to databases and spreadsheets
  - Reporting tools and user interfaces
  - Application Programming Interfaces (APIs)

APIs: (More) securely sharing data

[Diagram: an API sitting in front of a "Database full of data!"]

  - The API is software that exposes internal data to the outside world.
  - People write programs that “talk” to the API so it will fetch data for them.
  - The API keeps people from accessing the data directly, which keeps it safe.

Four Steps of Turning Data Into Information

  1. Collection
  2. Organization
  3. Explanation
  4. Generalization

  - Nate Silver applies this to journalism; what are the wider applications?
  - Think back to our discussion about science…

Source: fivethirtyeight.com/features/what-the-fox-knows

The FiveThirtyEight.com Burrito Bracket

  - Why was the Yelp data a good place to start?
  - What is the VORB (Value Over Replacement Burrito) statistic?
      - It is a function of quality and quantity. Why is that important?
  - What adjustments had to be made to the data?
  - What lessons are there for using “public” data?

Image: chipotle.com

Open Data

[Images: Uber trips from Uber data; crime rates from SF Crimespotting]

  - What is open data?
  - How can open data transform:
      - Science?
      - Government?
      - Journalism?
      - Business development?
  - What are the arguments for and against open data?

Sources:
  http://blog.uber.com/2011/09/13/uberdata-how-prostitution-and-alcohol-make-uber-better/
  From Data Dredger: http://www.internewskenya.org/dataportal/
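
Example: what "talking to an API" can look like

To make the API slide concrete, here is a minimal sketch in Python. It is not any real service's API. The table, the column names, the sample rows, and the RestaurantAPI class are all made up for illustration. The idea it tries to show is the one from the slide: the outside program asks the API for data, and the API decides what to hand back, so the caller never touches the database itself.

```python
import sqlite3


def build_internal_db():
    """Create a hypothetical internal database (all names and rows are invented)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE restaurants (name TEXT, rating REAL, owner_phone TEXT)"
    )
    conn.executemany(
        "INSERT INTO restaurants VALUES (?, ?, ?)",
        [
            ("Taqueria A", 4.5, "555-0101"),
            ("Burrito B", 3.8, "555-0102"),
            ("Cantina C", 4.1, "555-0103"),
        ],
    )
    conn.commit()
    return conn


class RestaurantAPI:
    """A toy API layer: callers use its methods instead of the database."""

    def __init__(self, conn):
        self._conn = conn  # kept private; outside code should not use it

    def get_restaurants(self, min_rating=0.0):
        # Only public fields (name, rating) are selected; owner_phone stays internal.
        rows = self._conn.execute(
            "SELECT name, rating FROM restaurants "
            "WHERE rating >= ? ORDER BY rating DESC",
            (min_rating,),
        ).fetchall()
        return [{"name": name, "rating": rating} for name, rating in rows]


# The "outside" program only ever talks to the API object.
api = RestaurantAPI(build_internal_db())
results = api.get_restaurants(min_rating=4.0)
print(results)
```

In this sketch the caller can filter by rating but has no way to ask for owner_phone, which is one simple sense in which an API shares data "(more) securely" than direct database access. Real web APIs usually add a network layer, authentication, and rate limits on top of this idea.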