Project topics – Private data management
Nov. 2011

Topic 1: Survey on the Status of Privacy Specifications in Online Social Networks
• Study at least 8 online social networks (OSNs), including Facebook, LinkedIn, Google+ and Flickr, and report on how each one of them handles privacy specifications.
• The expected output of this study is:
– A characterization of what a user is allowed to specify in terms of "what piece of information" (e.g., photo, wall post, status update) is visible to "what type of users" (e.g., friends, friends-of-friends, lists), and what the default setting is. As an example of the expected output, consider Table 1 in [1], but more detailed. (*)
– A ranking of the 8 OSNs with regard to how private they are, using one or more appropriate metrics, for example using ideas from [2].
• Read [3] for some nice ideas on how to improve the current situation.

[1] Barbara Carminati, Elena Ferrari, Andrea Perego: Enforcing access control in Web-based social networks. ACM Trans. Inf. Syst. Secur. 13(1): (2009)
[2] Kun Liu, Evimaria Terzi: A Framework for Computing the Privacy Scores of Users in Online Social Networks. TKDD 5(1): 6 (2010)
[3] Krishna P. Gummadi, Alan Mislove, Balachander Krishnamurthy: Addressing the Privacy Management Crisis in Online Social Networks. The IAB Workshop on Internet Privacy, December 2010. (Position Paper)

Topic 1: Survey on the Status of Privacy Specifications in Online Social Networks (2)
Table 1 of [1]

Topic 2: Experimental Evaluation of the Privacy of a Real OSN
• Choose 2 real data sets from OSNs (or 2 different subsets of the same data set).
• Build the corresponding social network graphs. Check the web page for links to where datasets can be obtained.
• Evaluate the resulting graphs in terms of
– (1) k-degree anonymity [4], and
– (2) an additional k-anonymity-based criterion of your choice.

[4] Kun Liu, Evimaria Terzi: Towards identity anonymization on graphs.
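
As a starting point for the k-degree anonymity evaluation in Topic 2, the following minimal Python sketch may help. It assumes the usual definition: a graph is k-degree anonymous if every node shares its degree with at least k-1 other nodes. The edge-list input format and the function names are illustrative assumptions, not part of the assignment. Nodes that appear in no edge are not counted here.

```python
from collections import Counter, defaultdict


def degree_sequence(edges):
    """Return {node: degree} for an undirected simple graph given as an edge list.

    Assumption: nodes are hashable labels; isolated nodes (no edges) are not
    represented in an edge list and therefore do not appear in the result.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbors[u].add(v)  # sets collapse duplicate edges
        neighbors[v].add(u)
    return {node: len(adj) for node, adj in neighbors.items()}


def degree_anonymity_level(degrees):
    """Largest k for which the graph is k-degree anonymous.

    This is the size of the smallest group of nodes sharing the same degree.
    """
    counts = Counter(degrees.values())
    return min(counts.values()) if counts else 0


def is_k_degree_anonymous(degrees, k):
    """True if every degree value is shared by at least k nodes."""
    return degree_anonymity_level(degrees) >= k


# Toy graph (hypothetical data): a 4-cycle a-b-c-d plus the chord a-c.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("a", "c")]
degrees = degree_sequence(edges)
level = degree_anonymity_level(degrees)
```

In the toy graph, nodes a and c have degree 3 and nodes b and d have degree 2, so the graph is 2-degree anonymous but not 3-degree anonymous. For a real data set, the same functions can be applied to the edge list loaded from the chosen file.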
Local recoding with hierarchies
• How do we anonymize a table with categorical attributes in the quasi-identifier (QI) set,
– with local recoding, and
– with hierarchies playing a role in the process?
• Implement and test the KACA algorithm
• Jiuyong Li, Raymond Chi-Wing Wong, Ada Wai-Chee Fu, Jian Pei. Anonymization by Local Recoding in Data with Attribute Hierarchical Taxonomies. IEEE Trans. Knowl. Data Eng. 20(9): 1181-1194 (2008)

Local recoding with hierarchies (2)
• Another approach on the topic:
– "Cut off" a single ancestor value per detailed value
• Implement and test the proposed algorithm
• Junqiang Liu, Ke Wang. On Optimal Anonymization for L(+)-Diversity. Proceedings of the 26th IEEE International Conference on Data Engineering, March 1-6, 2010, Long Beach, California, USA

Toolkits
• Do something with existing toolkits (Cornell, Udallas)
– Port Cornell's toolkit to MySQL / a generic DB?
– Port Udallas to Java?
• Convert UoI code to a toolkit?
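
Returning to the local recoding topics above, the sketch below illustrates how a hierarchy can drive local recoding of one categorical attribute. Each group of rows is generalized to the lowest common ancestor of its values, so the same detailed value may be recoded differently in different groups. This is not KACA itself. The taxonomy, the records, and the grouping are hypothetical, and the grouping step, which a clustering-based algorithm would compute, is supplied by hand.

```python
from collections import Counter

# Hypothetical taxonomy for one categorical QI attribute: child -> parent.
PARENT = {
    "Engineer": "Technical", "Programmer": "Technical",
    "Nurse": "Medical", "Doctor": "Medical",
    "Technical": "Any", "Medical": "Any",
}


def ancestors(value):
    """Path from a value up to the root, starting with the value itself."""
    path = [value]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def lowest_common_ancestor(values):
    """Most specific taxonomy node that covers every value in the group."""
    values = list(values)
    common = set(ancestors(values[0]))
    for v in values[1:]:
        common &= set(ancestors(v))
    # Walk upward from the first value; the first shared node is the LCA.
    for node in ancestors(values[0]):
        if node in common:
            return node
    raise ValueError("values do not share a common ancestor")


def recode_groups(records, groups):
    """Local recoding: each group of row indices gets its own generalized value."""
    out = list(records)
    for group in groups:
        generalized = lowest_common_ancestor(records[i] for i in group)
        for i in group:
            out[i] = generalized
    return out


def is_k_anonymous(values, k):
    """True if every distinct value occurs at least k times."""
    return min(Counter(values).values()) >= k


# Hypothetical single-attribute table and a hand-picked grouping of row indices.
records = ["Engineer", "Programmer", "Nurse", "Doctor", "Engineer", "Engineer"]
groups = [[0, 1], [2, 3], [4, 5]]
recoded = recode_groups(records, groups)
```

Here row 0 ("Engineer") is generalized to "Technical" because it is grouped with a "Programmer", while rows 4 and 5 keep the detailed value "Engineer". A global recoding scheme would have to map every "Engineer" to the same value, which is what distinguishes the two approaches.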