Web and Data Science Center
2012-10-03

Event-based analysis of people’s activities and behavior using Flickr and Panoramio geotagged photo collections
http://bib.dbvis.de/uploadedFiles/264.pdf

Slava Kisilevich, Milos Krstajic, Daniel Keim, Natalia Andrienko, Gennady Andrienko
University of Konstanz, slaks@dbvis.inf.unikonstanz.de
Fraunhofer Institute IAIS, gennady.andrienko@iais.fraunhofer.de

Abstract
Photo-sharing websites such as Flickr and Panoramio contain millions of geotagged images contributed by people from all over the world. Characteristics of these data pose new challenges in the domain of spatio-temporal analysis. In this paper, we define several different tasks related to analysis of attractive places, points of interest and comparison of behavioral patterns of different user communities on geotagged photo data. We perform analysis and comparison of temporal events, rankings of sightseeing places in a city, and study mobility of people using geotagged photos. We take a systematic approach to accomplish these tasks by applying scalable computational techniques, using statistical and data mining algorithms, combined with interactive geo-visualization. We provide an exploratory visual analysis environment, which allows the analyst to detect spatial and temporal patterns and extract additional knowledge from large geotagged photo collections. We demonstrate our approach by applying the methods to several regions in the world.

Keywords: Geo visual analytics, geotagged images, spatiotemporal analysis, movement data, clustering

Intro
• Photo-sharing sites Flickr and Panoramio have billions of photos, publicly available and annotated with metadata
  – Image size
  – Tags
  – Titles
  – Time Stamps
  – Geo Tags

Panoramio Example

What’s Interesting to Us?
• A user’s trajectory: a sequence of photos (one user, photos close in time and space)
• Which places are interesting to users (clusters of photos by geotags)
• Events that are interesting to users (specific time interval and location range)
• Result: We care a lot about timestamps as well as geotags.

Related Work
• 2008 – Flickr was used to identify regions of high tourist concentrations in Rome
Photo is Figure 1 from the paper

Attractiveness by Density Maps
• A common and fast way to analyze attractiveness or activity is to split a geographical region into cells, count the hits in each cell, and color-code each cell by its count
http://googlemapsmania.blogspot.com/2012/03/african-conflicts-on-google-maps.html

Related Work
• 2009 – Mean-shift, a non-parametric clustering algorithm, was used to find the most attractive places on Earth via Flickr
http://mobblog.cs.ucl.ac.uk/
30 most photographed places in Boston

Convex Hulls
• Density-based clustering algorithms are a good way to find attractive areas.
• Density connectivity between points, defined by distance and density thresholds, finds clusters of photos

Analytical Framework
• Formal Model: O × S × T × A1 × A2 × … × An, where O is the set of objects, S is the set of places, T is the set of moments, and A1, A2, …, An are additional attributes of the events
• A visual representation of the data is needed to get real value from this model
• Google Earth is a good tool.
• Use brushing, linking, and focusing to draw inferences

Data Collection
• Flickr API
  – REST (REpresentational State Transfer)
  – http://www.flickr.com/services/api/flickr.photos.search.html
  – Many parameters can be used to narrow the search
  – If you don’t narrow the search, Flickr will do it without telling you
  – You can only get 4000 results from any flickr.photos.search query

Flickr
• To get all Flickr photo metadata, ask how many photos fall in a region of the earth, then subdivide the region and/or split the time range until each query returns fewer than 4000 photos
• Alternate approach: Flickr has groups. Given a user id, you can find others with similar interests and see what they uploaded.

Flickr Search

Flickr Search by Bounding Box
• URL:
http://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=36aa30d4f02fc66722ab71cbfb40e2a8&bbox=-180%2C-90%2C180%2C90&format=rest
• Or: &format=json&nojsoncallback=1
JSON (JavaScript Object Notation)

Time-delimited Search of Flickr
URL:
http://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=36aa30d4f02fc66722ab71cbfb40e2a8&max_taken_date=2012-07-07+00%3A00%3A00&bbox=-180%2C-90%2C18%2C90&format=rest

Time-delimited Search of Flickr
URL:
http://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=f3794ef16c22e0e96f587fc2a190170e&min_taken_date=2012-07-07+00%3A00%3A00&bbox=-180%2C-90%2C18%2C90&format=rest
Only those photos from last 12 hours

Panoramio

Panoramio REST
User IDs run from 1 to …

Panoramio by User ID
• User ID: the set parameter in the URL below
• http://www.panoramio.com/map/get_panoramas.php?set=7000000&from=0&to=10&minx=-180&miny=-90&maxx=180&maxy=90&size=mini_square

{
  "count": 2,
  "has_more": false,
  "map_location": {"lat": 0.39113999999999999, "lon": 36.095782, "panoramio_zoom": 0},
  "photos": [
    {
      "height": 32,
      "latitude": 0.39113999999999999,
      "longitude": 36.095782,
      "owner_id": 7000000,
      "owner_name": "tincin",
      "owner_url": "http://www.panoramio.com/user/7000000",
      "photo_file_url": "http://mw2.google.com/mwpanoramio/photos/mini_square/74216564.jpg",
      "photo_id": 74216564,
      "photo_title": "LakeNakuru",
      "photo_url": "http://www.panoramio.com/photo/74216564",
      "place_id": "9d45290864bf443d6d6112685668899c315d4050",
      "upload_date": "22 June 2012",
      "width": 32
    },
    {
      "height": 32,
      "latitude": 0.39113999999999999,
      "longitude": 36.095782,
      "owner_id": 7000000,
      "owner_name": "tincin",
      "owner_url": "http://www.panoramio.com/user/7000000",
      "photo_file_url": "http://mw2.google.com/mwpanoramio/photos/mini_square/74216747.jpg",
      "photo_id": 74216747,
      "photo_title": "Flemingo in Lake Nakuru",
      "photo_url": "http://www.panoramio.com/photo/74216747",
      "place_id": "9d45290864bf443d6d6112685668899c315d4050",
      "upload_date": "22 June 2012",
      "width": 32
    }
  ]
}

Limits
• Panoramio allows 100,000 queries per day
• Any more, and your API ID gets revoked.
• If you want all the metadata from Panoramio, keep track of the number of queries in the last 24 hours and throttle your app.

Panoramio Results
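
The advice under Limits (track the number of queries in the last 24 hours and throttle the app) could be sketched roughly as follows. This is a minimal illustration, not part of any Panoramio client library: the names DailyQueryBudget and throttled_call are hypothetical, the default of 100,000 queries per 24 hours is taken from the slide, and the clock and sleep functions are injectable only so the logic can be exercised without waiting.

```python
import time
from collections import deque


class DailyQueryBudget:
    """Sliding-window counter (hypothetical helper): at most `limit`
    queries within any `window` seconds."""

    def __init__(self, limit=100_000, window=24 * 60 * 60, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._stamps = deque()  # times of queries still inside the window

    def _expire(self, now):
        # Forget queries that are at least one full window old.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()

    def wait_time(self):
        """Seconds until the next query is allowed (0.0 if allowed now)."""
        now = self.clock()
        self._expire(now)
        if len(self._stamps) < self.limit:
            return 0.0
        # Budget is full: wait until the oldest query leaves the window.
        return self._stamps[0] + self.window - now

    def record(self):
        """Register one query; refuse if the budget is exhausted."""
        if self.wait_time() > 0:
            raise RuntimeError("query budget exhausted; wait before querying")
        self._stamps.append(self.clock())


def throttled_call(budget, do_query, sleep=time.sleep):
    """Block until the budget allows a query, then run do_query()."""
    while True:
        delay = budget.wait_time()
        if delay <= 0:
            break
        sleep(delay)
    budget.record()
    return do_query()
```

In a harvesting loop, each request to get_panoramas.php would be wrapped as throttled_call(budget, lambda: fetch_page(...)), where fetch_page stands for whatever function issues the HTTP request (not shown here). Keeping the timestamps in a deque makes both the expiry check and the count cheap, at the cost of holding up to 100,000 timestamps in memory.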