slides - Nasirifard`s

Digital Enterprise Research Institute
Social People-Tagging vs.
Social Bookmark-Tagging
Peyman Nasirifard, Sheila Kinsella, Krystian Samp,
Stefan Decker
 Copyright 2009 Digital Enterprise Research Institute. All rights reserved.
www.deri.ie
Bookmark-tagging and People-tagging
[Figure: example tags: todo, nlp, friendly, music, research, technician]
Motivation
Understand better how people tag each other
A starting point for tag recommendation in frameworks based on people-tagging
    Access control mechanisms
    Information filtering mechanisms
We are especially interested in the subjectivity of tags
Main questions
How do tags differ for resources of different categories (person, event, country and city)?
How do tags for Wikipedia pages about persons differ from tags for friends?
How do tags differ with the age and gender of the taggee?

Data collection
1. Bookmark tags
    Wikipedia articles: Person, Event, Country, City
Data collection
2. People tags
    http://blog.* network of blog sites
        .ca, .co.uk, .de, .fr
    Google Translate to convert non-English to English
Dataset
Source       Category   # Items   # Tags    # Unique tags
Wikipedia    Person     4,031     75,548    14,346
Wikipedia    Event      1,427     8,924     2,582
Wikipedia    Country    638       13,002    3,200
Wikipedia    City       1,137     4,703     1,907
Blog sites   Friend     2,927     17,126    10,913
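The dataset figures are easier to compare as ratios. The following is a minimal sketch, not part of the original slides: it copies the numbers from the table and derives tags per item and the share of unique tags per category. The names DATASET and summarise are hypothetical.

```python
# Figures copied from the Dataset slide:
# (source, category, items, tags, unique tags).
DATASET = [
    ("Wikipedia", "Person", 4031, 75548, 14346),
    ("Wikipedia", "Event", 1427, 8924, 2582),
    ("Wikipedia", "Country", 638, 13002, 3200),
    ("Wikipedia", "City", 1137, 4703, 1907),
    ("Blog sites", "Friend", 2927, 17126, 10913),
]


def summarise(rows):
    """Return {category: (tags per item, unique tags / all tags)}."""
    return {
        category: (tags / items, unique / tags)
        for _source, category, items, tags, unique in rows
    }


if __name__ == "__main__":
    for category, (per_item, unique_share) in summarise(DATASET).items():
        print(f"{category:8s} {per_item:6.1f} tags/item  {unique_share:.0%} unique")
```

On these numbers, roughly 64% of the tags assigned to friends are unique, against roughly 19% for Wikipedia person articles.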
Top tags – Wikipedia articles
Person       Event       Country     City
wikipedia    history     wikipedia   travel
people       war         history     wikipedia
philosophy   wikipedia   travel      italy
history      ww2         geography   germany
wiki         politics    africa      history
music        wiki        culture     london
politics     military    wiki        uk
art          battle      reference   wiki
books        wwii        europe      places
literature   iraq        country     england
Top tags – blog sites
.de            .fr           .ca & .co.uk
music junkie   art           funny
nice           politics      music
live           music         life
funny          kind          kk friend
dear           adorable      funky
intelligent    love          friendly
pretty         nice          lovely
sexy           drawing       cool
love           friendship    sexy
honest         trustworthy   love
Distribution of tags
Subjectivity of tags
Top 100 tags for each category
25 annotators each categorised 100 tags
    Objective, e.g. “london”
    Subjective, e.g. “jealous”
    Uncategorised, e.g. “abcxyz”
Average inter-annotator agreement: 86%
[Figure: share of subjective, objective and uncategorized tags for Friend, Person, Country, City and Event]
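The slides report an average inter-annotator agreement but do not say how it was computed. One common reading is the mean pairwise percent agreement, sketched below under that assumption. The function names and the EXAMPLE labels are hypothetical and are not data from the study.

```python
from itertools import combinations


def pairwise_agreement(labels_a, labels_b):
    """Fraction of tags on which two annotators chose the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("annotators must label the same non-empty tag list")
    same = sum(a == b for a, b in zip(labels_a, labels_b))
    return same / len(labels_a)


def average_agreement(annotations):
    """Mean pairwise agreement over all pairs of annotators."""
    pairs = list(combinations(annotations, 2))
    if not pairs:
        raise ValueError("need at least two annotators")
    return sum(pairwise_agreement(a, b) for a, b in pairs) / len(pairs)


# Hypothetical labels from three annotators for the same four tags.
EXAMPLE = [
    ["objective", "subjective", "uncategorised", "subjective"],
    ["objective", "subjective", "uncategorised", "objective"],
    ["objective", "subjective", "subjective", "subjective"],
]

if __name__ == "__main__":
    print(f"{average_agreement(EXAMPLE):.0%}")
```

Percent agreement does not correct for chance agreement. A chance-corrected statistic such as Fleiss' kappa would be an alternative, but the slides do not indicate which measure was used.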
Randomly selected tags
Before we looked at top tags, but what about long-tail tags?
We also asked annotators to categorise 100 randomly chosen tags from each group
    Much higher rate of uncategorised tags (~3x)
    Lower inter-annotator agreement (76%)
    Meaning is less clear than for the top tags, so they are probably less useful for applications like information filtering
Linguistic categories
Automatic classification (WordNet)
 Noun/verb/adjective/adverb/uncategorised

[Figure: share of tags per linguistic category: adjective, adverb, verb, noun, uncategorised]
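The slides say only that the classification was automatic and based on WordNet. The sketch below shows the general shape of such a lookup. TOY_LEXICON is a hypothetical stand-in for a real WordNet query, and taking the first listed part of speech for ambiguous words is an assumption, not the authors' documented rule.

```python
from collections import Counter

# Hypothetical stand-in for a WordNet lookup:
# word -> parts of speech it can take, in an assumed priority order.
TOY_LEXICON = {
    "music": ["noun"],
    "friendly": ["adjective"],
    "love": ["noun", "verb"],
    "honest": ["adjective"],
    "live": ["verb", "adjective", "adverb"],
}


def classify(tag, lexicon=TOY_LEXICON):
    """Return the first listed part of speech for a tag, or 'uncategorised'."""
    senses = lexicon.get(tag.strip().lower())
    return senses[0] if senses else "uncategorised"


def category_counts(tags, lexicon=TOY_LEXICON):
    """Count how many tags fall into each linguistic category."""
    return Counter(classify(tag, lexicon) for tag in tags)


if __name__ == "__main__":
    print(category_counts(["music", "love", "friendly", "kk friend"]))
```

Multi-word or misspelled tags such as "kk friend" miss the lexicon and land in the uncategorised bucket, which is presumably why that category exists.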
Age and gender of taggees
Generated sets of tags corresponding to age brackets and genders
    Removed tags that refer to a specific gender
Asked 10 participants if they could predict age and gender
Results:
    Differences between genders were not perceptible
    Differences between younger and older were perceptible (and younger were more subjective)
Conclusions
Subjectivity: Articles of different categories are tagged similarly, but friends are assigned subjective tags more frequently
Consequence: frameworks built on person-tags will need to handle more potentially unreliable tags
    Controlled vocabularies?
Future work: Twitter Lists as person annotations for information filtering