Digital Enterprise Research Institute
www.deri.ie

Social People-Tagging vs. Social Bookmark-Tagging
Peyman Nasirifard, Sheila Kinsella, Krystian Samp, Stefan Decker

Copyright 2009 Digital Enterprise Research Institute. All rights reserved.

Bookmark-tagging and People-tagging
  [Figure: example tags: todo, nlp, friendly, music, research, technician]

Motivation
  Understand better how people tag each other
  A starting point for tag recommendation in frameworks based on people-tagging
    Access control mechanisms
    Information filtering mechanisms
  We are especially interested in subjectivity of tags

Main questions
  How do tags differ for resources of different categories? (person, event, country and city)
  How do tags for Wikipedia pages about persons differ from tags for friends?
  How do tags differ with age, gender of taggee?

Data collection
  1. Bookmark tags
     Wikipedia articles: Person, Event, Country, City
  2.
     People tags
     http://blog.* network of blog sites: .ca, .co.uk, .de, .fr
     Google Translate to convert non-English to English

Dataset
  Source      Category   # Items   # Tags   # Unique
  Wikipedia   Person       4,031   75,548     14,346
  Wikipedia   Event        1,427    8,924      2,582
  Wikipedia   Country        638   13,002      3,200
  Wikipedia   City         1,137    4,703      1,907
  Blog sites  Friend       2,927   17,126     10,913

Top tags – Wikipedia articles
  Person       Event       Country     City
  wikipedia    history     wikipedia   travel
  people       war         history     wikipedia
  philosophy   wikipedia   travel      italy
  history      ww2         geography   germany
  wiki         politics    africa      history
  music        wiki        culture     london
  politics     military    wiki        uk
  art          battle      reference   wiki
  books        wwii        europe      places
  literature   iraq        country     england

Top tags – blog sites
  .de   .fr   .ca & .co.uk
  music junkie art funny nice politics music live music life funny kind kk
  friend dear adorable funky intelligent love friendly pretty nice lovely sexy
  drawing cool love friendship sexy honest trustworthy love

Distribution of tags
  [Figure: distribution of tags]

Subjectivity of tags
  Top 100 tags for each category
  25 annotators each categorised 100 tags
    Objective, e.g. “london”
    Subjective, e.g. “jealous”
    Uncategorised, e.g. “abcxyz”
  Average inter-annotator agreement: 86%
  [Figure: subjective, objective and uncategorised tags for Friend, Person, Country, City, Event]

Randomly selected tags
  Before we looked at top tags, but what about long-tail tags?
  We also asked annotators to categorise 100 randomly chosen tags from each group
  Much higher rate of uncategorised tags (~3x)
  Lower inter-annotator agreement (76%)
  Less clear in meaning than the top tags, so probably less useful for applications like information filtering

Linguistic categories
  Automatic classification (WordNet)
  Noun/verb/adjective/adverb/uncategorised
  [Figure: adjective, adverb, verb, noun and uncategorised tags]

Age and gender of taggees
  Generated sets of tags corresponding to age brackets and genders
  Removed tags that refer to a specific gender
  Asked 10 participants if they could predict age and gender
  Results:
    Differences between genders were not perceptible
    Differences between younger and older taggees were perceptible (and tags for younger taggees were more subjective)

Conclusions
  Subjectivity: articles of different categories are tagged similarly, but friends are assigned subjective tags more frequently
  Consequence: frameworks built on person tags will need to handle more potentially unreliable tags
    Controlled vocabularies?
  Future work: Twitter Lists as person annotations for information filtering
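
Sketch: inter-annotator agreement
  The slides report average inter-annotator agreement (86% for top tags, 76% for randomly chosen tags) but do not say how it was computed. A minimal sketch, assuming simple pairwise percent agreement averaged over all annotator pairs. The function name, the annotator names and the toy labels are hypothetical; only the example tags "london", "jealous" and "abcxyz" come from the slides.

```python
from itertools import combinations


def pairwise_agreement(labels_by_annotator):
    """Average, over all annotator pairs, of the fraction of shared tags
    that both annotators put in the same category.

    Assumption: plain percent agreement (no chance correction).
    """
    scores = []
    for a, b in combinations(list(labels_by_annotator), 2):
        labels_a = labels_by_annotator[a]
        labels_b = labels_by_annotator[b]
        shared = labels_a.keys() & labels_b.keys()
        if not shared:
            continue  # this pair rated no tag in common
        same = sum(1 for tag in shared if labels_a[tag] == labels_b[tag])
        scores.append(same / len(shared))
    if not scores:
        raise ValueError("need at least two annotators with tags in common")
    return sum(scores) / len(scores)


# Hypothetical toy annotations (three annotators, four tags).
toy = {
    "annotator_1": {"london": "objective", "jealous": "subjective",
                    "abcxyz": "uncategorised", "funny": "subjective"},
    "annotator_2": {"london": "objective", "jealous": "subjective",
                    "abcxyz": "uncategorised", "funny": "objective"},
    "annotator_3": {"london": "objective", "jealous": "subjective",
                    "abcxyz": "objective", "funny": "subjective"},
}

agreement = pairwise_agreement(toy)
print(f"average pairwise agreement: {agreement:.2%}")
```

  Percent agreement is easy to read but does not correct for chance; a chance-corrected statistic would be a different choice and would give different numbers.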
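
Sketch: linguistic categories via WordNet
  The slides state that tags were classified automatically with WordNet into noun/verb/adjective/adverb/uncategorised, without giving details. A minimal sketch under assumptions: a tag gets the part of speech that is most common among its WordNet senses, and a tag with no senses is uncategorised. That tie-breaking policy, the helper names and the toy lexicon are hypothetical; `wordnet_lookup` uses NLTK's WordNet interface and needs NLTK and its "wordnet" corpus installed.

```python
from collections import Counter

# WordNet POS codes; "s" (satellite adjective) is folded into adjective.
POS_NAMES = {"n": "noun", "v": "verb", "a": "adjective",
             "s": "adjective", "r": "adverb"}


def classify_tag(tag, pos_lookup):
    """Return noun/verb/adjective/adverb, or 'uncategorised' if the tag
    has no known senses.

    Assumption: pick the POS shared by the most senses; ties go to the
    POS that appears first in the lookup result.
    """
    names = [POS_NAMES[code] for code in pos_lookup(tag.lower())
             if code in POS_NAMES]
    if not names:
        return "uncategorised"
    return Counter(names).most_common(1)[0][0]


def wordnet_lookup(tag):
    """POS codes of all WordNet senses of a tag, via NLTK."""
    from nltk.corpus import wordnet  # imported lazily; optional dependency
    return [synset.pos() for synset in wordnet.synsets(tag)]


# Hypothetical stand-in lexicon so the sketch runs without NLTK.
TOY_LEXICON = {
    "london": ["n"],
    "funny": ["a", "n"],
    "love": ["n", "v"],
    "honest": ["a", "s"],
}


def toy_lookup(tag):
    return TOY_LEXICON.get(tag, [])


tags = ["london", "funny", "love", "honest", "abcxyz"]
counts = Counter(classify_tag(tag, toy_lookup) for tag in tags)
print(dict(counts))
```

  Passing `wordnet_lookup` instead of `toy_lookup` would run the same classification against WordNet itself.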