On the Semantic Annotation of Places in
Location-Based Social Networks
Mao Ye¹, Dong Shou¹, Wang-Chien Lee¹, Peifeng Yin¹, Krzysztof Janowicz²
¹Department of Computer Science and Engineering
The Pennsylvania State University
{mxy177,dus212,wlee,pzy102}@cse.psu.edu
²Department of Geography
University of California, Santa Barbara
{jano}@geog.ucsb.edu
Introduction and Motivation
• Location-based Social Networks
  - E.g., Facebook Places and Foursquare
  - Users check in at places
• Tags are important
  - Business categorization
  - Location search
  - Place recommendation
  - Data cleaning
  - …
• Tags are missing
  - In our Foursquare and Whrrl datasets, many places are missing tags
[Figure: two pie charts, Foursquare and Whrrl, showing places with tags vs. places missing tags: 67% vs. 33% in one chart, 68% vs. 32% in the other]
Problem Description
• Place semantic annotation (SAP) problem
  - Multi-label classification problem
• Input
  - User check-in logs <who, where, when>
  - Some places are tagged
• Output
  - Infer tags for the remaining places
SAP Framework
• Feature extraction (FE)
  - Check-in logs → features
  - Features to describe a place
[Figure: check-in logs → Feature Extraction (FE) Component → place features → one binary classifier per tag t1, t2, …, tm]
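The framework above treats multi-label annotation as one binary decision per tag. A minimal sketch of that idea follows. The slides do not say which binary classifier is used, so a nearest-centroid rule stands in here as an assumption; the feature vectors, place ids, and the names `CentroidBinaryClassifier`, `train_per_tag`, and `infer_tags` are hypothetical and only serve the illustration.

```python
from math import dist


class CentroidBinaryClassifier:
    """Stand-in binary classifier (assumption, not the authors' choice):
    predicts the class whose mean feature vector is closer."""

    def fit(self, X, y):
        self.centroids = {}
        for label in (True, False):
            rows = [x for x, flag in zip(X, y) if flag == label]
            if rows:
                self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, x):
        # Return True (tag applies) or False (tag does not apply).
        return min(self.centroids, key=lambda label: dist(x, self.centroids[label]))


def train_per_tag(features, tags):
    """Train one binary classifier per tag, using only the tagged places."""
    places = sorted(tags)
    X = [features[p] for p in places]
    all_tags = sorted({t for ts in tags.values() for t in ts})
    return {
        t: CentroidBinaryClassifier().fit(X, [t in tags[p] for p in places])
        for t in all_tags
    }


def infer_tags(models, x):
    """A place receives every tag whose binary classifier says yes."""
    return {t for t, model in models.items() if model.predict(x)}


# Hypothetical 2-d features: [share of midday check-ins, share of late-night check-ins].
features = {"p1": [0.9, 0.1], "p2": [0.8, 0.0], "p3": [0.1, 0.9], "p4": [0.2, 0.8]}
tags = {
    "p1": {"Restaurant"},
    "p2": {"Restaurant"},
    "p3": {"Bars"},
    "p4": {"Bars", "Restaurant"},
}

models = train_per_tag(features, tags)
print(infer_tags(models, [0.85, 0.05]))  # an untagged, mostly midday place
print(infer_tags(models, [0.4, 0.6]))    # an untagged, mixed place
```

Because each tag has its own classifier, a place can receive zero, one, or several tags, which is what makes the setup multi-label. Any other binary learner could be dropped in behind the same `fit`/`predict` pair.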
FE- Explicit Pattern
EP Feature List
• Total number of check-ins
• Total number of unique visitors
• Maximum number of check-ins by a single user
• Daily probability of check-in
• Hourly probability of check-in
FE- Implicit Relatedness
• Places checked in by the same user at around the same time (not necessarily the same day) are probably in the same category
[Figure: one user's check-ins on a 00:00 to 23:59 time axis across Day 1 to Day 8, labeled Restaurant, Shopping, Bars, Spa, Beauty, Health and Gym, with one untagged place marked "?"]
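The EP feature list and the IR intuition can both be computed directly from <who, where, when> triples. The sketch below is a hedged illustration, not the authors' implementation. It assumes that "daily" means day of week and "hourly" means hour of day, that "around the same time" means within a one-hour window on the 24-hour clock, and that related places vote for tags by simple counting. The function names, the `window_hours` parameter, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Toy check-in log of <who, where, when> triples (all values are made up).
CHECKINS = [
    ("u1", "p1", datetime(2011, 3, 7, 12, 5)),   # Monday, around noon
    ("u1", "p1", datetime(2011, 3, 8, 12, 30)),  # Tuesday, around noon
    ("u2", "p1", datetime(2011, 3, 7, 19, 0)),   # Monday evening
    ("u1", "p2", datetime(2011, 3, 9, 12, 45)),  # Wednesday, around noon
    ("u2", "p3", datetime(2011, 3, 9, 23, 10)),  # Wednesday, late night
]


def explicit_pattern_features(checkins, place):
    """EP features of one place, one entry per item of the EP feature list."""
    visits = [(who, when) for who, where, when in checkins if where == place]
    total = len(visits)
    per_user = Counter(who for who, _ in visits)
    by_day = Counter(when.weekday() for _, when in visits)  # 0 = Monday
    by_hour = Counter(when.hour for _, when in visits)
    return {
        "total_checkins": total,
        "unique_visitors": len(per_user),
        "max_checkins_single_user": max(per_user.values(), default=0),
        # Assumption: probabilities are relative frequencies over this place's check-ins.
        "daily_prob": [by_day[d] / total if total else 0.0 for d in range(7)],
        "hourly_prob": [by_hour[h] / total if total else 0.0 for h in range(24)],
    }


def related_places(checkins, window_hours=1.0):
    """Link two places when the same user checked in at both at a similar
    time of day, regardless of the calendar day."""
    by_user = defaultdict(list)
    for who, where, when in checkins:
        by_user[who].append((where, when.hour + when.minute / 60))
    related = defaultdict(set)
    for visits in by_user.values():
        for place_a, t_a in visits:
            for place_b, t_b in visits:
                gap = abs(t_a - t_b)
                gap = min(gap, 24 - gap)  # time of day wraps around midnight
                if place_a != place_b and gap <= window_hours:
                    related[place_a].add(place_b)
    return dict(related)


def neighbor_tag_shares(place, related, known_tags):
    """Share of each tag among the tagged places related to `place`."""
    votes = Counter(
        tag for other in related.get(place, ()) for tag in known_tags.get(other, ())
    )
    n = sum(votes.values())
    return {tag: count / n for tag, count in votes.items()}


ep = explicit_pattern_features(CHECKINS, "p1")
related = related_places(CHECKINS)
shares = neighbor_tag_shares("p2", related, {"p1": ["Restaurant"]})
print(ep["total_checkins"], ep["unique_visitors"], ep["max_checkins_single_user"])
print(related)
print(shares)
```

In this toy log, user u1 visits p1 and p2 around noon on different days, so the two places become related and the untagged p2 inherits a vote for p1's tag. The tag shares could then be appended to the EP vector as additional features of a place.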
Evaluation
Dataset: Whrrl.com, with 5,892 users, 53,432 places and 199 types of tags
Comparison: EP, IR and SAP (EP+IR)
Category                 Percentage
Restaurant&Food (Res)    37%
Shopping (Sh)            18%
Nightlife (NL)           19%
• Following Yelp's categorization, we map the 199 tags into 21 categories.