Conceptualization of Place via Spatial Clustering and Co-occurrence Analysis

2009 International Workshop on Location Based Social Networks
(LBSN’09)
Conceptualization of Place via
Spatial Clustering and Co-occurrence Analysis
Nov. 3, 2009, Seattle, WA, USA
Dong–Po Deng; Tyng–Ruey Chuang; Rob Lemmens
INTERNATIONAL INSTITUTE FOR
GEO-INFORMATION SCIENCE AND
EARTH OBSERVATION
Geo-information is increasing on the Web
• It is common for people to search for and share geo-referenced information and resources on the Web
From http://www.datenform.de/mapeng.html
11/03/2009
Folksonomy
• A tagging system allows users to classify objects of interest by keywords or terms
• Folksonomy = the practice of personally tagging information and objects in a social environment while people consume the information and use the objects
Social tools

Tags and Geo-tags
• Tagging is a process that is established by keywords (k), users (u), and objects (o)
  $\mathrm{Tag}(k) \Rightarrow \mathrm{Keywords}(k)$
  $\exists o\,(\mathrm{Object}(o) \land \mathrm{associatedTo}(k, o))$
  $\exists u\,(\mathrm{User}(u) \land \mathrm{createdBy}(k, u))$
• Geotag
  • geo:lat=latitude, e.g. geo:lat=51.758
  • geo:lon=longitude, e.g. geo:lon=4.269
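The geotag convention above can be read back out of a photo's tag list with a few lines of code. The following is a minimal sketch, assuming tags arrive as plain strings; the function name `parse_geotag` is hypothetical and is not part of any Flickr API.

```python
def parse_geotag(tags):
    """Return (lat, lon) if both geo:lat and geo:lon tags are present, else None."""
    coords = {}
    for tag in tags:
        # Tolerate spaces around "=", as in "geo:lat = 51.758".
        key, sep, value = tag.replace(" ", "").partition("=")
        if sep and key in ("geo:lat", "geo:lon"):
            try:
                coords[key] = float(value)
            except ValueError:
                continue  # malformed number: ignore this tag
    if "geo:lat" in coords and "geo:lon" in coords:
        return coords["geo:lat"], coords["geo:lon"]
    return None


print(parse_geotag(["amsterdam", "geo:lat=51.758", "geo:lon=4.269"]))
# (51.758, 4.269)
```

A photo without both coordinates yields `None`, so it can be filtered out before any spatial processing.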
Questions are …
• Is geospatial data created in a social network a valuable product for a geospatial society in general?
• How can geospatial information be extracted from user-generated content in a social network?
Places as artifacts
• Place is a center of meaning constructed by experiences
• Place may be significant to any individual or group, and may exist at any scale
• Locations become places only when activities occur that cause them to become imbued with meaning
• Place provides the conditions of possibility for creative social practice
Photos with tags = locations with tags
[Figure: photos, each labeled with its tags]
Collective intelligence
• Tags should give rise to emergent semantics and shared conceptualization
• The accumulation of tags on shared objects often expresses a common consensus
• Patterns and trends that emerge from the collaboration and competition of many individuals can yield structured information from a tag-based system, despite the lack of an ontology and a priori defined semantics
Photos and Tags in Flickr
[Figure: a Flickr photo annotated with its Tags, Geo-Tag, and Time-Tag]
Selected photos from Flickr
Where is the beef?
 2008
amsterdam
canal europe
holland netherlands
noordholland north travel
The most frequently
occurring 20%
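One way to obtain such a list is to count every tag across all photos and keep the top fraction. The sketch below assumes that "20%" refers to the share of distinct tags, which the slide does not state; the function name `most_frequent_tags` and the sample data are hypothetical.

```python
import math
from collections import Counter


def most_frequent_tags(photos, fraction=0.2):
    """Return the top `fraction` of distinct tags, ordered by how often they occur."""
    counts = Counter(tag for tags in photos for tag in tags)
    # Keep at least one tag, and round the cut-off up.
    k = max(1, math.ceil(len(counts) * fraction))
    return [tag for tag, _ in counts.most_common(k)]


# Illustrative data only: each inner list holds the tags of one photo.
photos = [
    ["amsterdam", "canal"],
    ["amsterdam", "2008"],
    ["amsterdam", "canal", "museum"],
    ["amsterdam", "tram"],
]
print(most_frequent_tags(photos))
```

Generic tags such as a city name tend to dominate this list, which is why frequency alone says little about individual places.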
Steps for extracting conceptualization of place
[Diagram: tagged photos → crawling → database of geotagged & tagged photos → spatial clustering → co-occurrence analysis → place concepts]
DBSCAN is a density-based algorithm
• Two global parameters:
  • Eps: maximum radius of the neighbourhood
  • MinPts: minimum number of points in the Eps-neighbourhood of a point
• Core object: an object with at least MinPts objects within its Eps-neighbourhood
• Border object: an object on the border of a cluster, i.e. it lies in the Eps-neighbourhood of a core object but is not itself a core object
[Figure: example points p and q, with MinPts = 5 and Eps = 1 cm]
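The two parameters are enough to label every point as core, border, or noise. The sketch below is illustrative only: it assumes planar points with Euclidean distance and counts a point as a member of its own neighbourhood, and the function names are hypothetical. For latitude/longitude data, a great-circle distance would take the place of `math.dist`.

```python
import math


def eps_neighbourhood(points, i, eps):
    """Indices of all points within distance eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def classify(points, eps, min_pts):
    """Label each point as 'core', 'border', or 'noise'."""
    neighbours = [eps_neighbourhood(points, i, eps) for i in range(len(points))]
    core = {i for i, n in enumerate(neighbours) if len(n) >= min_pts}
    labels = []
    for i, n in enumerate(neighbours):
        if i in core:
            labels.append("core")
        elif any(j in core for j in n):
            labels.append("border")  # not dense itself, but next to a core point
        else:
            labels.append("noise")
    return labels


points = [(0, 0), (0, 1), (1, 0), (0, -1), (-1, 0), (2, 0), (10, 10)]
print(classify(points, eps=1.0, min_pts=5))
```

Only the centre point has five points within Eps, so it is the single core object; its four direct neighbours are border objects and the remaining two points are noise.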
Density-Based Clustering: Background
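• Density-reachable
  • A point p is density-reachable from a point q with respect to Eps and MinPts if there is a chain of points $p_1, \ldots, p_n$ with $p_1 = q$ and $p_n = p$ such that $p_{i+1}$ is directly density-reachable from $p_i$ (that is, $p_i$ is a core object and $p_{i+1}$ lies in its Eps-neighbourhood)
• Density-connected
  • A point p is density-connected to a point q with respect to Eps and MinPts if there is a point o such that both p and q are density-reachable from o with respect to Eps and MinPts
[Figure: p density-reachable from q through p1; p and q density-connected through o]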
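Both definitions can be checked mechanically by following chains of core points. The following is a rough sketch under the same assumptions as before (planar points, Euclidean distance, a point belongs to its own neighbourhood); the function names are hypothetical.

```python
import math
from collections import deque


def neighbours(points, i, eps):
    """Indices of all points within distance eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def density_reachable_from(points, q, eps, min_pts):
    """Indices density-reachable from points[q]; empty if q is not a core point."""
    if len(neighbours(points, q, eps)) < min_pts:
        return set()
    reached, queue = {q}, deque([q])
    while queue:
        i = queue.popleft()
        nbrs = neighbours(points, i, eps)
        if len(nbrs) < min_pts:
            continue  # border point: reachable, but a chain cannot pass through it
        for j in nbrs:
            if j not in reached:
                reached.add(j)
                queue.append(j)
    return reached


def density_connected(points, p, q, eps, min_pts):
    """True if some point o has both p and q density-reachable from it."""
    for o in range(len(points)):
        reached = density_reachable_from(points, o, eps, min_pts)
        if p in reached and q in reached:
            return True
    return False


pts = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0)]
print(density_reachable_from(pts, 1, eps=1.0, min_pts=3))
print(density_connected(pts, 0, 3, eps=1.0, min_pts=3))
```

The example shows why density-connectivity is needed: the two end points of the chain are border points, so neither is density-reachable from the other, yet both are reachable from a core point in the middle.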
DBSCAN: The Algorithm
• Arbitrarily select a point p
• Retrieve all points density-reachable from p with respect to Eps and MinPts
• If p is a core point, a cluster is formed
• If p is a border point, no points are density-reachable from p, and DBSCAN visits the next point of the database
• Continue the process until all of the points have been processed
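The steps above can be sketched in plain Python. This is a compact illustration, not the implementation used for the results in this talk: it assumes planar points with Euclidean distance, uses a linear scan for neighbourhood queries, and marks noise with the label -1 (a convention chosen here).

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks points in no cluster."""

    def region(i):
        # Eps-neighbourhood of points[i], including i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already processed
        seeds = region(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # not core; may later turn out to be a border point
            continue
        labels[i] = cluster  # i is a core point: start a new cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = region(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is core too: keep expanding through it
        cluster += 1
    return labels


points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (20, 20)]
print(dbscan(points, eps=1.5, min_pts=3))
# [0, 0, 0, 1, 1, 1, -1]
```

The linear scan in `region` makes this quadratic in the number of points; a spatial index would be the usual choice for a large photo collection.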
Density-Based Clustering: Results
Co-occurrence analysis
• Co-occurrence can be interpreted as an indicator of semantic similarity or an idiomatic expression
• Co-occurrence assumes interdependency of the two terms
• Semantic similarity is a concept whereby a set of documents or terms within term lists are assigned a metric based on the likeness of their meaning / semantic content
Co-occurrence matrix
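• The element at (i, j) is the count or frequency of the i-th tag in the j-th photo, for m tags and n photos:

$$
\begin{matrix}
 & \mathbf{d}_j \\
 & \downarrow \\
\mathbf{t}_i^T \rightarrow &
\begin{pmatrix}
x_{1,1} & \dots & x_{1,n} \\
\vdots & \ddots & \vdots \\
x_{m,1} & \dots & x_{m,n}
\end{pmatrix}
\end{matrix}
$$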
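Such a matrix can be built directly from the tag lists of the photos in one spatial cluster. The following is a small sketch with made-up data; the function name `tag_photo_matrix` is hypothetical, and tags are ordered alphabetically only to make the row order predictable.

```python
def tag_photo_matrix(photos):
    """Return (tags, X) where X[i][j] counts how often tag i occurs on photo j."""
    tags = sorted({tag for photo in photos for tag in photo})
    X = [[photo.count(tag) for photo in photos] for tag in tags]
    return tags, X


# Illustrative data only: each inner list holds the tags of one photo.
photos = [
    ["amsterdam", "canal"],
    ["amsterdam", "rijksmuseum"],
    ["canal"],
]
tags, X = tag_photo_matrix(photos)
for tag, row in zip(tags, X):
    print(tag, row)
```

Each printed row is the occurrence vector of one tag over all photos, and each column position corresponds to one photo.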
Co-occurrence matrix
• A row in the matrix is a vector of the tag's occurrence in all photos:

$$
\mathbf{t}_i^T = \begin{pmatrix} x_{i,1} & \dots & x_{i,n} \end{pmatrix}
$$

• A column is a vector of the occurrence of all tags in a photo:

$$
\mathbf{d}_j = \begin{pmatrix} x_{1,j} \\ \vdots \\ x_{m,j} \end{pmatrix}
$$


Co-occurrence correlations
[Figure: the photo-tag matrix and the tag-tag correlation matrix]
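A tag-tag correlation matrix can be obtained by correlating every pair of tag rows. The slides do not say which coefficient is used, so the sketch below assumes the Pearson correlation; returning 0.0 for a tag with constant occurrence is also a convention chosen here, and the function names and data are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / math.sqrt(var_a * var_b)


def tag_correlations(X):
    """Square matrix C where C[i][k] correlates the rows of tag i and tag k."""
    return [[pearson(row_i, row_k) for row_k in X] for row_i in X]


# Illustrative data only: rows are tags, columns are photos.
X = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
]
C = tag_correlations(X)
print(C[0])
```

Tags that always appear on the same photos correlate positively, tags that never share a photo correlate negatively, and unrelated tags sit near zero.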
The correlation between the tag "amsterdam" and the tags of several landmarks associated with Amsterdam
[Plot axes: correlation coefficient, distance]
Conceptualizing places in 2500 meters
Conceptualizing places in 150 meters
Conceptualizing places in 75 meters
Schiphol airport
Anne Frank House
Rijksmuseum
Conclusions and future work
• Without suitable spatial clustering, detailed information about a place is veiled by high-frequency tags
• A conceptualization of place is unveiled by tag co-occurrences at a suitable spatial scale
• Location-based applications can be developed to suggest tags to users as they take photos
• In the future we will ground the semantics between pairs of tags via the use of gazetteers or dictionaries
Thank you for your attention!
Dongpo Deng
deng@itc.nl