Aegis: A Semantic Implementation of Privacy as Contextual Integrity in Social Ecosystems
Imrul Kayes, Adriana Iamnitchi

Social Privacy Risks

Why Does This Happen?
• Inappropriate sharing and transferring of information
• (Permissive) default privacy settings by OSN provider
• Because they can
• Lack of universal framework that establishes what is right and wrong
• Users do not change default settings
  – 99% of Twitter users
  – >80% of Facebook users
• When they do, they get it wrong

Evolution Towards Social Ecosystems
[Figure labels: Applications; Social Inference API; Social Data Management; Personal Aggregators; Social Sensors; Social Signals]
Iamnitchi et al. "The Social Hourglass: an Infrastructure for Socially-aware Applications and Services." IEEE Internet Computing (2012).

Privacy in Social Ecosystems
• Social Ecosystems amplify privacy concerns
  – Aggregated data from different contexts of activity
  – A more complete (uncomfortable?) digital recording of a person's life
  – Social applications from different contexts of activity
• Default privacy settings become critical

Privacy as Contextual Integrity
• The right to appropriate flow of personal information
• Based on two life facts:
  – transfer of personal information happens in a social context
  – people alter behavior to correspond with the norms of the context
• Two norms:
  – Norms of appropriateness
  – Norms of distribution
Nissenbaum, Helen. "Privacy as contextual integrity." Washington Law Review 79.1 (2004).
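
As a rough illustration of the two norms, an information flow can be modelled as a small record and checked against each norm in turn. This is only a sketch under stated assumptions: the `Flow` fields, the entries in `APPROPRIATE`, and the rule that onward transfers violate the distribution norm are hypothetical choices made for the example, not definitions taken from Nissenbaum or from Aegis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One transfer of personal information (all fields are illustrative)."""
    context: str           # social context the information belongs to
    info_type: str         # kind of information being revealed
    recipient_role: str    # role of the receiver within that context
    onward_transfer: bool  # True if the sender is passing on data shared with them


# Hypothetical norms of appropriateness: (context, info type, recipient role)
# combinations that are considered fitting to reveal.
APPROPRIATE = {
    ("professional", "professional groups", "colleague"),
    ("friendship", "photo", "friend"),
}


def is_appropriate(flow: Flow) -> bool:
    """Norm of appropriateness: is this information fitting for this context and role?"""
    return (flow.context, flow.info_type, flow.recipient_role) in APPROPRIATE


def respects_distribution(flow: Flow) -> bool:
    """Norm of distribution (assumed rule): data received from others is not passed on."""
    return not flow.onward_transfer


def preserves_contextual_integrity(flow: Flow) -> bool:
    """A flow is acceptable only when both norms hold."""
    return is_appropriate(flow) and respects_distribution(flow)
```

For instance, revealing professional groups to a colleague passes both checks, while the same information sent to a recipient in the friend role fails the appropriateness check.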

Our Solution
• Ontology-based social ecosystem data model to capture user online data semantics
  – Model social contexts
  – Model user roles
• Generate default privacy policies from social data based on Nissenbaum's contextual integrity framework
• Extensible, fine-grained default policy customizable by users
• Prototype implementation and experimental evaluation on three real-world large networks

Ontology-based Social Ecosystems Data Model
[Background text excerpt: "... new social signals (see next section): the developers of social sensors have to be aware of the ontology of the social contexts to which the sensors report, in order to maintain structural data representation. Another way to extend the social ecosystem is by extending a social context itself when new relevant social signals become available: for example, Facebook recently added a service called Gifts which allows users to buy presents for their friends. Consequently, the social context model needs to be adaptive to accommodate additions of new contexts. Ontologies help in designing a scalable context model."]
• Set of entities, instances, functions, relations and axioms
• A vocabulary for social ecosystems
• Provides formal and structured representation of user's data and social spheres
• Gives semantic interoperability
• High-level logic inference is possible

[Figure: Context Upper Ontology. Context is specialized (isa) into Friendship Context, Professional Context, Gaming Context, Blogging Context and 'Other' Context]

Context specific ontologies
[Figure: ontologies for the Gaming, Friendship and Professional contexts. Recoverable class labels: Person, Games, Gamers' Group, Trophies, Badges, Awards, Racing, Puzzle, Strategy, Comment, Note, Event, Photo, Video, Place, Group, Experience, Industry, Academia, Job, Professional Group, Recommendation. Recoverable property labels: joins, earns, attends, isFriendOf, isLocatedIn, takenIn, isColleagueOf, hasJob, hasExperience, hasRecommendation, professionalMember. Legend: Class, owl:Property, rdfs:subClassOf]

System Model
• Unrestricted set of disjoint social contexts
• A user belongs to only one social context at any time
• A user can have one or more roles in every social context s/he is part of
• Each piece of data (resource) is assigned (created) to only one context
• Shared data (resources) are replicated in each of the other users' current contexts
• A request for a resource is made on behalf of the requester's role in the particular context in which the requester is when the request is made
• A request specifies an action, which could be read, write, delete or replicate to another user's ownership

Architecture
[Background text excerpt: "... extracting and applying the default policies as well. This component communicates with the Social Data Management Layer which implements social contexts and roles."]
[Figure labels: Social Data Extractor; Policy Evaluator; Default Policy Repository; User Defined Policy Extractor; Aegis: Privacy Management Layer; Contextual Policy Definer; Ontology; SEKB; User; Social Data Management Layer; Social Data Acquisition & Aggregation Layer; Policy Editor]
[Figure labels: Socially-aware Applications A1, A2, A3; S11, S21, S22, S32, S33, S43]

Policy Specification
• A policy is defined as a set of RDF statements
• Policies obey the two information norms of CI

Norms of appropriateness: Bob's colleagues can read his professional groups in the Professional context

    ASK WHERE {
      ?req rdf:type p:requestor .
      ?req p:allowed p:read .
      p:read p:performedOn Bob .
      ?req se:isColleagueOf Bob .
      Bob se:professionalMember ?group .
    }

[Figure labels: Alice, Bob, Charlie; Colleagues; Professional Groups?; Yes]

Policy Specification
Norms of distribution: policy restricts the access to Bob's photos if they are shared

    ASK WHERE {
      ?req rdf:type p:requestor .
      ?req p:allowed p:read .
      p:read p:performedOn Bob .
      ?req se:isFriendOf Bob .
      Bob se:hasPhoto ?photo .
      ?photo se:status se:notShared
    }

[Figure labels: Shared contents (e.g., Photo); Alice, Bob, Charlie; friends]

Context Inference
• Ontology defines hierarchy among resources (user data)
• Context inference is possible for each resource

[Background text excerpt: "... personalized policy will be enforced (see policy evaluation flow chart from Figure 4). As in the data model we have hierarchy among classes (which eventually define resources) and a group of classes belong to a context, we can infer the context from requested resource. Note that a default auto generated policy could be personalized by the user and in this case, the personalized policy will be evaluated. For example, from a requested resource recommendation, a context inference is possible from the following SPARQL query to knowledge base:"]

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX se: <http://www.dsg.cse.usf.com/se>
    SELECT ?superClass
    WHERE {
      se:requestedResource rdfs:subClassOf ?superClass .
    }

The policy engine will extract the policy of the inferred context and execute it.
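
The inference-then-evaluate idea can be sketched without an RDF engine. The Python below is a minimal stand-in, not the Aegis implementation (which the slides describe as Java with Jena): the SEKB is a plain set of triples, `ask` is a naive matcher that mimics a SPARQL ASK over triple patterns, and the class hierarchy, the `se:group1` resource, and all function names are hypothetical. It also assumes that request-specific auxiliary triples are removed once the query has run.

```python
Triple = tuple[str, str, str]

# Toy knowledge base (hypothetical contents): class hierarchy plus social data.
SEKB: set[Triple] = {
    ("se:Recommendation", "rdfs:subClassOf", "se:ProfessionalContext"),
    ("se:ProfessionalGroup", "rdfs:subClassOf", "se:ProfessionalContext"),
    ("se:Photo", "rdfs:subClassOf", "se:FriendshipContext"),
    ("se:ProfessionalContext", "rdfs:subClassOf", "se:Context"),
    ("se:FriendshipContext", "rdfs:subClassOf", "se:Context"),
    ("Alice", "se:isColleagueOf", "Bob"),
    ("Bob", "se:professionalMember", "se:group1"),
}

# Default policy per context, written as ASK-style triple patterns
# (the Professional pattern mirrors the appropriateness policy for Bob).
DEFAULT_POLICIES: dict[str, list[Triple]] = {
    "se:ProfessionalContext": [
        ("?req", "rdf:type", "p:requestor"),
        ("?req", "p:allowed", "p:read"),
        ("p:read", "p:performedOn", "Bob"),
        ("?req", "se:isColleagueOf", "Bob"),
        ("Bob", "se:professionalMember", "?group"),
    ],
}


def superclasses(kb: set[Triple], cls: str) -> set[str]:
    """Collect every transitive rdfs:subClassOf ancestor of cls."""
    seen: set[str] = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in kb:
            if s == current and p == "rdfs:subClassOf" and o not in seen:
                seen.add(o)
                frontier.append(o)
    return seen


def infer_context(kb: set[Triple], resource_class: str):
    """Return the ancestor that is a direct subclass of se:Context, or None."""
    for ancestor in superclasses(kb, resource_class):
        if (ancestor, "rdfs:subClassOf", "se:Context") in kb:
            return ancestor
    return None


def ask(kb: set[Triple], patterns: list[Triple], bindings=None) -> bool:
    """True if all patterns match kb under one consistent variable binding."""
    bindings = bindings or {}
    if not patterns:
        return True
    first, rest = patterns[0], patterns[1:]
    for triple in kb:
        candidate = dict(bindings)
        matched = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against its earlier binding.
                if candidate.setdefault(term, value) != value:
                    matched = False
                    break
            elif term != value:
                matched = False
                break
        if matched and ask(kb, rest, candidate):
            return True
    return False


def handle_request(kb: set[Triple], requester: str, resource_class: str,
                   action: str, owner: str) -> bool:
    """Add auxiliary triples, infer the context, run its default policy, clean up."""
    aux = {
        (requester, "rdf:type", "p:requestor"),
        (requester, "p:allowed", f"p:{action}"),
        (f"p:{action}", "p:performedOn", owner),
    }
    added = aux - kb          # remember only what this request introduced
    kb |= added
    try:
        policy = DEFAULT_POLICIES.get(infer_context(kb, resource_class))
        return policy is not None and ask(kb, policy)
    finally:
        kb -= added           # assumed cleanup of request-specific information
```

With this toy data, `handle_request(SEKB, "Alice", "se:ProfessionalGroup", "read", "Bob")` grants access, while the same request from Charlie is denied because no `se:isColleagueOf` triple links Charlie to Bob. A context without a default policy is denied by default in this sketch, which is a design choice of the example rather than something the slides state.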

Request Handling Flow Chart
[Flow chart: Accept request (requestor, resource, action) → Add request-specific auxiliary information to SEKB → Default policy? (Yes: infer context from the requested resource, then extract policy; No: extract policy) → Execute query → Triple returned? (Yes: grant access; No: deny access). A further box reads: Remove auxiliary information from SEKB.]
Fig. 4: Request handling process.

Prototype Implementation
• Implemented the prototype in Java Platform Standard Edition 6 (Java SE 6)
• Jena's APIs for RDF data management
• Ontology: Jena's API for handling OWL ontologies
• Leveraged TDB for persistent storage of the knowledge base
• SPARQL: Jena's query engine

Experimental Evaluation
• Objective:
  – Performance of the policy engine in executing default policies for realistic workloads
  – Scalability of the policy engine in executing default policies
  – Overhead induced by default policies

Experimental Evaluation
• Three real networks
• Thirteen test cases (100 to 70,000 users): snowball sampling from the networks
• Social ecosystems knowledge base including Person, Relationships and Groups
• Two types of responses
  – positive authorization access control response
  – negative authorization access control response

Access time increases linearly with the size of the SEKB
[Figure panels: Positive authorization; Negative authorization]

Number of requests answered per second
Positive and negative authorization take about the same time
• TDB data structures are threaded B+Trees
• Long scans (negative authorizations) proceed without needing to traverse the branches of the
tree

Performance decreases with increasing users
• Increased system memory to realistic capacity for an in-production server
• Distributed solutions for data management

Overhead induced by default policies is statistically insignificant
[Figure: time taken (ms, log scale) versus number of users in SEKB (100 to 70,000), for Slashdot, BlogCatalog and Facebook, each with and without default policy]

Future Work
• Test the effects of default policies
  – on applications that are too restrictive
  – user satisfaction with user-based surveys
• Formalize and analyze potential privacy attacks
• Understand the system in different platform settings

Summary
• Propose an ontology-based social ecosystem data model to capture user social data
• Employ semantic web technologies to generate default privacy policies based on Nissenbaum's contextual integrity theory
• Provide an architecture and prototype implementation of the privacy model
• Experimental evaluation on three real-world large networks to demonstrate the applicability in practice

Thank You!
http://www.cse.usf.edu/dsg/
imrul@mail.usf.edu

Back Up Slides

Social Sensors
Consume social signals:
• Location/collocation
• Schedule (Google calendar)
• Mobile phone activity (calls, etc.)
• Online social network interactions
• Email
• Shared content (Netflix, CiteULike)
• Personal relations (family)
…

Social Sensors
• Report on behalf of ego:
  – Alter, the person ego is interacting with
  – An activity tag: e.g., "outdoors", "dining"
    • Based on content, location, predefined labels, semantic web (ontologies), etc.
  – A weight: e.g., 0.15
• Run on ego's mobile devices, desktop, or on the web
• Process user interactions
  – To reduce noise
  – To distinguish between routine and meaningful interactions

Aggregators
• Act as the user's personal assistant
• Run on a trusted device (cell phone)
• Responsible for
  o Managing access to social signal apps
  o Personalization
  o Identity management

Related Work
• Squicciarini et al. "PriMa"
  – Auto-generates access control policies for users
  – Based on factors such as average privacy preference of similar and related users, accessibility of similar items in similar and related users, closeness of owner and accessor, popularity of the owner
  – A large number of factors and their parametrized tuning is required
  – No performance evaluation

Related Work
• Shehab et al. "PolicyMgr"
  – Leverages user-provided example policy settings as training sets and builds classifiers that are the basis for auto-generated policies
  – Practicality in terms of response time has not yet been shown

Related Work
• Our privacy model differs from other solutions
  – We focused on generating default policies for a social ecosystem that deals with users' aggregated social data from different domains
  – We considered a privacy framework proposed by social theorists and translated it into an architecture and proof-of-concept implementation