2009 International Workshop on Location Based Social Networks (LBSN’09)
Nov. 3, 2009, Seattle, WA, USA

Conceptualization of Place via Spatial Clustering and Co-occurrence Analysis
Dong-Po Deng; Tyng-Ruey Chuang; Rob Lemmens
INTERNATIONAL INSTITUTE FOR GEO-INFORMATION SCIENCE AND EARTH OBSERVATION

GeoInformation is increasing on the Web
Searching for and sharing geo-referenced information and resources on the Web is a common activity.
[Figure from http://www.datenform.de/mapeng.html]
11/03/2009 2

Folksonomy
A tagging system allows users to classify objects of interest by keywords or terms.
Folksonomy = the practice of personally tagging information and objects in a social environment, while people consume the information and use the objects.
Social tools

Tags and Geo-tags
Tagging is a process established by keywords (k), users (u), and objects (o):
$$
\mathrm{Tag}(k) \equiv \mathrm{Keywords}(k) \land \exists o\,\bigl(\mathrm{Object}(o) \land \mathrm{associatedTo}(k,o)\bigr) \land \exists u\,\bigl(\mathrm{User}(u) \land \mathrm{createdBy}(k,u)\bigr)
$$
Geotag:
  geo:lat=latitude, e.g. geo:lat=51.758
  geo:lon=longitude, e.g. geo:lon=4.269

Questions are …
Is geospatial data created in a social network a valuable product for a geospatial society in general?
How can geospatial information be extracted from user-generated content in a social network?
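
As an illustration of the geo:lat / geo:lon convention on the Tags and Geo-tags slide, the following is a minimal sketch of how a coordinate pair might be pulled out of a photo's tag list. The function name `parse_geotag` and the sample tag list are hypothetical, and accepting `geo:long` as a spelling variant is an assumption, not something the slides specify.

```python
from typing import Iterable, Optional, Tuple


def parse_geotag(tags: Iterable[str]) -> Optional[Tuple[float, float]]:
    """Return (lat, lon) from geo:lat= / geo:lon= tags, or None if either is missing."""
    lat: Optional[float] = None
    lon: Optional[float] = None
    for tag in tags:
        key, sep, value = tag.partition("=")
        if not sep:
            continue  # ordinary keyword tag, not a key=value tag
        key = key.strip().lower()
        try:
            number = float(value)
        except ValueError:
            continue  # value is not numeric, so it cannot be a coordinate
        if key == "geo:lat":
            lat = number
        elif key in ("geo:lon", "geo:long"):  # "geo:long" variant is an assumption
            lon = number
    if lat is None or lon is None:
        return None
    return (lat, lon)


# Hypothetical tag list using the example coordinates from the slide.
example_tags = ["amsterdam", "canal", "geo:lat=51.758", "geo:lon=4.269"]
print(parse_geotag(example_tags))
```

Photos for which this returns `None` carry only keyword tags and would have to be skipped by any later spatial step.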

Places as artifacts
Place is a center of meaning constructed by experience.
Place may be significant to any individual or group, and may exist at any scale.
Locations become places only when activities occur that imbue them with meaning.
Place provides the conditions of possibility for creative social practice.

Photos with tags = locations with tags
[Figure: photos labelled "Tags"]

Collective intelligence
Tags should give rise to emergent semantics and shared conceptualization.
The accumulation of tags on shared objects often expresses a common consensus.
Patterns and trends that emerge from the collaboration and competition of many individuals can yield structured information from a tag-based system, despite the lack of an ontology and of a priori defined semantics.

Photos and Tags in Flickr
[Figure labels: Tags, Geo-Tag, Time-Tag]

Selected photos from Flickr

Where is the beef?
Tags: 2008, amsterdam, canal, europe, holland, netherlands, noordholland, north, travel
The most frequently occurring 20%

Steps for extracting conceptualization of place
[Figure: workflow with the stages tag crawling, database (geotagged and tagged photos), spatial clustering, co-occurrence analysis, place concepts]

DBSCAN is a density-based algorithm
Two global parameters:
  Eps: maximum radius of the neighbourhood
  MinPts: minimum number of points in the Eps-neighbourhood of a point
Core object: an object with at least MinPts objects within its Eps-neighbourhood
Border object: an object that lies on the border of a cluster
[Figure: points p and q, with MinPts = 5 and Eps = 1 cm]

Density-Based Clustering: Background
Density-reachable: a point p is density-reachable from a point q wrt. Eps and MinPts if there is a chain of points $p_1, \dots, p_n$ with $p_1 = q$ and $p_n = p$ such that $p_{i+1}$ is directly density-reachable from $p_i$.
Density-connected: a point p is density-connected to a point q wrt. Eps and MinPts if there is a point o such that both p and q are density-reachable from o wrt. Eps and MinPts.
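
The DBSCAN definitions above (Eps, MinPts, core and border objects, density-reachability) can be sketched in a few lines of plain Python. This is an illustrative sketch, not the implementation used in the talk: the names `region_query` and `dbscan` and the toy points are hypothetical, and it assumes planar coordinates with Euclidean distance, whereas geotagged photos would more likely need a geodesic distance on latitude/longitude.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]
NOISE = -1  # label for points that belong to no cluster


def region_query(points: Sequence[Point], i: int, eps: float) -> List[int]:
    """Indices of all points in the Eps-neighbourhood of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points: Sequence[Point], eps: float, min_pts: int) -> List[int]:
    """Return one cluster label per point; NOISE marks unclustered points."""
    labels: List[Optional[int]] = [None] * len(points)  # None = not yet visited
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # not a core point; may become a border point later
            continue
        labels[i] = cluster_id  # i is a core point: start a new cluster
        seeds = list(neighbours)
        k = 0
        while k < len(seeds):
            j = seeds[k]
            k += 1
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                seeds.extend(j_neighbours)  # j is also core: keep expanding
        cluster_id += 1
    return [NOISE if label is None else label for label in labels]


# Toy data: two dense groups and one isolated point.
toy_points = [(0, 0), (0, 1), (1, 0), (1, 1),
              (10, 10), (10, 11), (11, 10), (11, 11),
              (50, 50)]
print(dbscan(toy_points, eps=1.5, min_pts=3))
```

With these toy values the two groups of four points form separate clusters and the isolated point is labelled as noise. The linear scan in `region_query` keeps the sketch short; a spatial index would be the usual choice for large photo collections.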

DBSCAN: The Algorithm
Arbitrarily select a point p.
Retrieve all points density-reachable from p wrt. Eps and MinPts.
If p is a core point, a cluster is formed.
If p is a border point, no points are density-reachable from p, and DBSCAN visits the next point of the database.
Continue the process until all of the points have been processed.

Density-Based Clustering: Results

Co-occurrence analysis
Co-occurrence can be interpreted as an indicator of semantic similarity or of an idiomatic expression.
Co-occurrence assumes interdependency of the two terms.
Semantic similarity is a concept whereby a set of documents, or terms within term lists, are assigned a metric based on the likeness of their meaning or semantic content.

Co-occurrence matrix
The element at (i, j) is the count or frequency of the i-th tag in the j-th photo. Rows correspond to tags $\mathbf{t}_i^T$ and columns to photos $\mathbf{d}_j$:
$$
\begin{bmatrix}
x_{1,1} & \dots & x_{1,n} \\
\vdots & \ddots & \vdots \\
x_{m,1} & \dots & x_{m,n}
\end{bmatrix}
$$

Co-occurrence matrix
A row in the matrix is a vector of the tag's occurrences in all photos:
$$
\mathbf{t}_i^T = \begin{bmatrix} x_{i,1} & \dots & x_{i,n} \end{bmatrix}
$$
while a column is a vector of the occurrences of all tags in a photo:
$$
\mathbf{d}_j = \begin{bmatrix} x_{1,j} \\ \vdots \\ x_{m,j} \end{bmatrix}
$$

Co-occurrence correlations
[Figure: photo-tag matrix and tag-tag correlation matrix]

The correlation between the tag "amsterdam" and the tags of several landmarks associated with Amsterdam
[Figure axes: correlation coefficient, distance]

Conceptualizing places in 2500 meters

Conceptualizing places in 150 meters

Conceptualizing places in 75 meters

Schiphol airport

Anne Frank House

Rijksmuseum

Conclusions and future work
Without suitable spatial clustering, detailed information about a place is veiled by high-frequency tags.
A conceptualization of place is unveiled by tag co-occurrences at a suitable spatial scale.
Location-based applications can be developed to suggest tags to users as they take photos.
In the future we will ground the semantics between pairs of tags via the use of gazetteers or dictionaries.

Thank you for your attention!
Dongpo Deng
deng@itc.nl
INTERNATIONAL INSTITUTE FOR GEO-INFORMATION SCIENCE AND EARTH OBSERVATION
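
As a supplementary sketch of the co-occurrence slides, the tag-by-photo matrix and a tag-tag correlation matrix can be built as follows. The slides do not state which correlation measure was used, so the Pearson coefficient here is an assumption. The helper names and the toy photo list are hypothetical, and treating a constant row as uncorrelated (0.0) is a choice made only for this example.

```python
import math
from typing import List, Sequence, Tuple


def tag_photo_matrix(photos: Sequence[Sequence[str]]) -> Tuple[List[str], List[List[int]]]:
    """Build the matrix whose element (i, j) counts the i-th tag in the j-th photo."""
    tags = sorted({tag for photo in photos for tag in photo})
    matrix = [[list(photo).count(tag) for photo in photos] for tag in tags]
    return tags, matrix


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length rows; 0.0 if either row is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant row carries no co-occurrence signal
    return cov / math.sqrt(var_x * var_y)


def tag_correlations(photos: Sequence[Sequence[str]]) -> Tuple[List[str], List[List[float]]]:
    """Return the tag list and the tag-tag correlation matrix over tag rows."""
    tags, matrix = tag_photo_matrix(photos)
    corr = [[pearson(row_a, row_b) for row_b in matrix] for row_a in matrix]
    return tags, corr


# Toy data (hypothetical): the tag lists of four photos in one spatial cluster.
toy_photos = [
    ["amsterdam", "canal"],
    ["amsterdam", "rijksmuseum"],
    ["amsterdam", "canal"],
    ["schiphol"],
]
toy_tags, toy_corr = tag_correlations(toy_photos)
for tag, row in zip(toy_tags, toy_corr):
    print(tag, [round(value, 2) for value in row])
```

Running this per spatial cluster, rather than over all photos at once, mirrors the idea in the conclusions that place-specific tags only stand out at a suitable spatial scale.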