Semiotic dynamics and collaborative tagging
By Ciro Cattuto, Vittorio Loreto, and Luciano Pietronero
Presented by Diyue Bu

About the Author: Ciro Cattuto
“My work focuses on measuring and understanding complex phenomena in systems that entangle human behaviors and digital platforms. I am interested in Computational Social Science, Data Science, Web Science, Infectious Disease Dynamics and Digital Epidemiology.”
“I currently lead the Data Science Laboratory of the ISI Foundation, where I also serve as Research Director. I am a founder and a principal investigator of the SocioPatterns collaboration.”
http://www.cirocattuto.info/

Collaborative tagging
“The basic unit of information in a collaborative tagging system is a (user, resource, {tags}) triple”
“tag-cloud”
https://delicious.com/
Cattuto, Ciro, Vittorio Loreto, and Luciano Pietronero. "Semiotic dynamics and collaborative tagging." Proceedings of the National Academy of Sciences 104.5 (2007): 1461-1464.

Task and Result
Task: extract the resources associated with a given tag X and study the statistical distribution of the tags co-occurring with X.

Stochastic Model: Zipf’s law
The result in this case is similar to Zipf’s law.
Zipf’s law: “given some corpus of natural language utterances, the frequency of any word is inversely proportional to its rank in the frequency table.
Thus the most frequent word will occur approximately twice as often as the second most frequent word, three times as often as the third most frequent word, etc.”
http://en.wikipedia.org/wiki/Zipf's_law

Difference from Zipf’s Law
Difference: “the low-rank part of the frequency-rank curves exhibits a flattening typically not observed in systems strictly obeying Zipf’s law”
Reason: “more general tags (semantically speaking) will tend to cooccur with a larger number of other tags.”

“A Yule-Simon Model with Long-Term Memory”
Modification: add a power-law memory kernel to the Yule-Simon model
Supported by a test of the correlation between time and tag occurrence
Transformation: “rich-get-richer”
where “Q_t(x) = a(t)/(x + τ). a(t) is a normalization factor, and τ is a characteristic time scale over which recently added words have comparable probabilities.”
τ: semantic breadth

Comparison of theory & experimental result
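
One way to explore this comparison is to simulate the memory model directly. The sketch below is a minimal, unofficial Python illustration, not the authors' code. At each step it either invents a new tag with probability p_new or copies the tag that occurred x steps back, with probability proportional to 1/(x + τ); the normalization factor a(t) is handled implicitly by the weighted draw. The function names, the seed, the starting stream of n0 distinct tags, and the default values of p_new, tau, and n0 are assumptions chosen for illustration; the slides only specify the kernel Qt(x).

```python
import random
from collections import Counter
from itertools import accumulate


def simulate_tag_stream(steps, p_new=0.05, tau=40.0, n0=10, seed=0):
    """Generate a tag stream from a Yule-Simon process with a 1/(x + tau) memory kernel.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Start with n0 distinct tags so there is a past to copy from.
    stream = list(range(n0))
    next_tag = n0
    # Running sums of the unnormalized kernel 1 / (x + tau) for x = 1, 2, ...
    cum_kernel = list(accumulate(1.0 / (x + tau) for x in range(1, n0 + steps + 1)))
    for _ in range(steps):
        t = len(stream)
        if rng.random() < p_new:
            # Invent a brand-new tag.
            stream.append(next_tag)
            next_tag += 1
        else:
            # Pick how far back to look (x = 1 is the most recent tag);
            # random.choices normalizes the weights, which plays the role of a(t).
            x = rng.choices(range(1, t + 1), cum_weights=cum_kernel[:t], k=1)[0]
            stream.append(stream[t - x])
    return stream


def rank_frequency(stream):
    """Return tag counts sorted from most to least frequent."""
    return [count for _, count in Counter(stream).most_common()]


if __name__ == "__main__":
    counts = rank_frequency(simulate_tag_stream(steps=3000))
    print(counts[:10])  # counts of the ten most frequent tags
```

Plotting the returned counts against their rank on log-log axes gives a frequency-rank curve that can be set beside a pure Zipf line. Changing tau changes the time scale over which recently added tags have comparable probabilities of being copied.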

Result Interpretation
“users of collaborative tagging systems share universal behaviors that, despite the intricacies of personal categorization, tagging procedures, and user interactions, appear to follow simple activity patterns.”
Users’ behavior reveals “two main aspects of collaborative tagging: (i) a frequency-bias mechanism related to the idea that users are exposed to each other’s tagging activity; (ii) a notion of memory, or aging of resources, in the form of a heavy-tailed access to the past state of the system”

Questions & Discussion
What features of users’ behavior or of collaborative tagging justify including a memory kernel in the model?
Which aspects of current online business patterns could be improved based on these findings about users’ behavior in collaborative tagging?
What positive and negative influences will this user behavior have? How can the negative influences be reduced by modifying the feedback users get from resources and tags?
What would happen if the tag cloud did not show each tag’s frequency of appearance (that is, if every word had the same size)?