Semiotic dynamics and collaborative tagging

By Ciro Cattuto, Vittorio Loreto, and Luciano
Pietronero
Presented by Diyue Bu
About the Author: Ciro Cattuto
• “My work focuses on measuring and understanding complex phenomena in systems that entangle human behaviors and digital platforms. I am interested in Computational Social Science, Data Science, Web Science, Infectious Disease Dynamics and Digital Epidemiology.”
• “I currently lead the Data Science Laboratory of the ISI Foundation, where I also serve as Research Director. I am a founder and a principal investigator of the SocioPatterns collaboration.”
http://www.cirocattuto.info/
Collaborative tagging
• “The basic unit of information in a collaborative tagging system is a (user, resource, {tags}) triple”
• “tag-cloud”
• https://delicious.com/
Cattuto, Ciro, Vittorio Loreto, and Luciano Pietronero. "Semiotic dynamics and
collaborative tagging." Proceedings of the National Academy of Sciences 104.5 (2007):
1461-1464.
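The (user, resource, {tags}) triple can be made concrete with a small Python sketch. The `Post` type, the example users, URLs, and tags are all hypothetical and only illustrate the data structure; counting how many posts carry each tag gives the kind of numbers a tag cloud would display.

```python
from collections import Counter
from typing import NamedTuple


class Post(NamedTuple):
    """One tagging event: a user annotates a resource with a set of tags."""
    user: str
    resource: str
    tags: frozenset


# Hypothetical example posts (made-up users, URLs, and tags).
posts = [
    Post("alice", "http://example.org/a", frozenset({"web", "design", "css"})),
    Post("bob", "http://example.org/a", frozenset({"web", "css"})),
    Post("carol", "http://example.org/b", frozenset({"python", "web"})),
]


def tag_counts(posts):
    """Count in how many posts each tag appears (the sizes in a tag cloud)."""
    counts = Counter()
    for post in posts:
        counts.update(post.tags)
    return counts


cloud = tag_counts(posts)
```

Using a frozenset for the tags reflects the assumption that a tag is either attached to a post or not, with no ordering and no repetition inside one post.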
Task and Result
• Task: extract the resources associated with a given tag X and study the statistical distribution of the tags co-occurring with X.
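The task can be sketched in Python as follows. This is a minimal illustration, assuming that co-occurrence is counted per post (a tag co-occurs with X when both appear in the same triple); the example posts and the function name `cooccurrence_ranking` are hypothetical.

```python
from collections import Counter

# Hypothetical posts as (user, resource, tags) triples.
posts = [
    ("u1", "r1", {"blog", "web", "news"}),
    ("u2", "r1", {"blog", "web"}),
    ("u3", "r2", {"blog", "design"}),
    ("u4", "r3", {"python", "web"}),
]


def cooccurrence_ranking(posts, x):
    """Return (tag, count) pairs for tags co-occurring with x, most frequent first."""
    counts = Counter()
    for _user, _resource, tags in posts:
        if x in tags:
            counts.update(tags - {x})
    # Sort by descending count, then alphabetically so ties are deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


ranking = cooccurrence_ranking(posts, "blog")
```

The position of each tag in `ranking` is its rank, so plotting count against position on log-log axes gives a frequency-rank curve for the chosen tag.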
Stochastic Model: Zipf’s law
• The result in this case is similar to Zipf’s law.
• Zipf’s law: “given some corpus of natural language utterances, the frequency of any word is inversely proportional to its rank in the frequency table. Thus the most frequent word will occur approximately twice as often as the second most frequent word, three times as often as the third most frequent word, etc.”
http://en.wikipedia.org/wiki/Zipf's_law
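The inverse proportionality in the definition can be checked with a few lines of Python. This is an idealized sketch: the function name and the top frequency of 600 are arbitrary choices for illustration, not data from the paper.

```python
def zipf_frequencies(n_ranks, top_frequency):
    """Ideal Zipf frequencies f(r) = top_frequency / r for ranks 1..n_ranks."""
    return [top_frequency / r for r in range(1, n_ranks + 1)]


freqs = zipf_frequencies(5, 600.0)
# Ratio of the top frequency to the frequency at each rank.
ratios = [freqs[0] / f for f in freqs]
```

Here `freqs` is 600, 300, 200, 150, 120, so the ratios are 1, 2, 3, 4, 5: the top word occurs r times as often as the word of rank r.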
Difference from Zipf’s Law
• Difference: “the low-rank part of the frequency-rank curves exhibits a flattening typically not observed in systems strictly obeying Zipf’s law”
• Reason: “more general tags (semantically speaking) will tend to cooccur with a larger number of other tags.”
“A Yule-Simon Model with Long-Term Memory”
• Modification: add a power-law memory kernel to the Yule-Simon model
• Supported by a test of the correlation between time and tag occurrence
• Transformation: “rich-get-richer”
• With probability p a new tag is added; otherwise a tag is copied from x steps in the past with probability Q_t(x), where “Q_t(x) = a(t)/(x + τ). a(t) is a normalization factor, and τ is a characteristic time scale over which recently added words have comparable probabilities.”
• τ: semantic breadth
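A minimal simulation sketch of such a process is given below. It assumes the usual Yule-Simon setup (a new tag is invented with probability p, otherwise an earlier tag is copied) and draws the look-back distance x with weight 1/(x + τ). The function name, the seed handling, and the parameter values are illustrative assumptions, not the values fitted in the paper.

```python
import random
from collections import Counter


def simulate_memory_yule_simon(steps, p, tau, n0=1, seed=0):
    """Generate a tag stream from a Yule-Simon process with a power-law memory kernel.

    At each step a brand-new tag is appended with probability p; otherwise the
    tag that sits x steps back is copied, with x drawn with weight 1 / (x + tau).
    """
    rng = random.Random(seed)
    stream = list(range(n0))  # start from n0 distinct tags (an assumption)
    next_tag = n0
    for _ in range(steps):
        if rng.random() < p:
            stream.append(next_tag)
            next_tag += 1
        else:
            t = len(stream)
            # Unnormalized kernel; random.choices normalizes the weights,
            # which plays the role of the factor a(t).
            weights = [1.0 / (x + tau) for x in range(1, t + 1)]
            x = rng.choices(range(1, t + 1), weights=weights, k=1)[0]
            stream.append(stream[t - x])  # x = 1 is the most recent tag
    return stream


# Illustrative parameter values only.
stream = simulate_memory_yule_simon(steps=2000, p=0.05, tau=40.0)
rank_frequency = sorted(Counter(stream).values(), reverse=True)
```

Plotting `rank_frequency` against rank on log-log axes, for several values of `tau`, is one way to explore how the kernel affects the low-rank part of the curve. The loop recomputes the weights at every step, so the cost grows quadratically with `steps`; that is acceptable for a sketch of this size.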
Comparison of theory & experimental result
Result Interpretation
• “users of collaborative tagging systems share universal behaviors that, despite the intricacies of personal categorization, tagging procedures, and user interactions, appear to follow simple activity patterns.”
• Users’ behavior reveals “two main aspects of collaborative tagging: (i) a frequency-bias mechanism related to the idea that users are exposed to each other’s tagging activity; (ii) a notion of memory, or aging of resources, in the form of a heavy-tailed access to the past state of the system”
Questions & Discussion
• What features of user behavior or of collaborative tagging motivate the memory kernel in the model?
• Which aspects of current online business practices could be improved based on these findings about user behavior in collaborative tagging?
• What positive and negative effects might this user behavior have? How could the negative effects be reduced by modifying the feedback users get from resources and tags?
• What would happen if the tag cloud did not show each tag’s frequency of appearance (that is, if every word had the same size)?