ACM/IFIP/USENIX 11th International Middleware Conference, 2010

Prometheus: User-Controlled P2P Social Data Management for Socially-aware Applications

Nicolas Kourtellis, Joshua Finnis, Paul Anderson, Jeremy Blackburn, Cristian Borcea*, Adriana Iamnitchi
Department of Computer Science and Engineering, USF
*Department of Computer Science, NJIT

Social and Socially-aware Applications
- Applications may contain user profiles, social networks, history of social interactions, location, collocation

Problems with Current Social Information Management
- Application specific:
  - Need to input data for each new application
  - Cannot benefit from information aggregation across applications
- Typically, data are owned by applications: users don't have control over their data
- Hidden incentives to have many "friends": social information not accurate

Our Solution: Prometheus
- P2P social data management service:
  - Receives data from social sensors that collect application-specific social information
  - Represents social data as decentralized social graph
  - Exposes API to share social information with applications according to user access control policies
[Figure: social sensors feed Prometheus, which serves socially-aware apps; labels shown include Loopt, CallCensor, and Foursquare]

Outline
- Motivation
- Social Graph Management
- API and Access Control
- Prototype Implementation
- Evaluation over PlanetLab
- Summary
- Future Work

How is the Social Graph Populated?
- Social sensors report edge information to Prometheus: <ego, alter, activity, weight>
  - Applications installed by user on personal devices
  - Aggregate & analyze history of user's interactions with other users
- Two types of social ties:
  - Object-centric: use of similar resources
    - Examples: tagging communities on Delicious, repeatedly being part of the same BitTorrent swarms
  - People-centric: pair-wise or group relationships
    - Examples: friends on Facebook, same company name on LinkedIn, collocation from mobile phones

Social Graph Representation
- Multi-edged, directed, weighted, labeled graph
  - Each edge → a reported social activity
  - Weight → interaction intensity
- Directionality reflects reality
  - Allows for fine-grain privacy
  - Prevents social data manipulation

Decentralized Graph Storage
- Each user has a set of trusted peers in the P2P network
  - Peers it owns & peers owned by trusted users
- Each user's sub-graph stored on all its trusted peers
  - Improved availability in face of P2P churn
- P2P multicast used to synchronize information among trusted peers
[Figure: example social graph over users A-F with labeled, weighted edges such as <music,0.1>, <football,0.2>, and <hiking,0.25>; one panel shows the sub-graph held on Peer 1, another the graph across all peers. An accompanying table lists User ID, Owns Peer, Trust Peer; recoverable rows: B owns 1, trusts 1,2; C owns 2, trusts 1,2,3; D owns 3, trusts 2,3,4,5; E owns 4, trusts 3,4; F owns 5, trusts 3,5]

Encrypted P2P Storage
- Sensor data stored encrypted in P2P network
  - Improves availability and protects privacy
- Sensors encrypt data with trusted group public key & sign with user private key
- Trusted peers retrieve user data, decrypt it, & create social graph
[Figure: the user and the trusted group each hold a public key and a private key]

Prometheus Application Interface
- Five social inference functions:
  - Boolean relation_test (ego, alter, ɑ, w)
  - User-List top_relations (ego, ɑ, k)
  - User-List neighborhood (ego, ɑ, w, radius)
  - User-List proximity (ego, ɑ, w, radius, distance)
  - Double social_strength (ego, alter)
    - Ego & alter don't have to be directly connected
    - Normalized result: consider ego's overall activity
    - Search all 2-hop paths

Application Example: CallCensor
- Socially-aware incoming call filtering
- Ring/vibrate/silence phone based on current social context and relationship with caller
- Invokes:
  - proximity() to determine current social context
  - social_strength() to determine relationship with caller

Request Execution: social_strength()
[Figure: request execution diagram with 1st-hop and 2nd-hop labels]
1. Application sends request to a peer
2. Peer forwards request to trusted peer
3. Trusted peer enforces ACPs
4. Trusted peer sends secondary requests
5. Trusted peers enforce ACPs & reply
6. Primary peer combines results
7. Primary peer replies to application through contacted peer with final result

Access Control Policies
- User specifies ACPs upon registration
  - ACPs stored on user's trusted peer group
- Update them at any time
  - Changes propagated through multicast mechanism
- Applied for each inference request
- Control relations, labels, weights & locations
- Example: Alice's ACPs
    relations: hops-2
    hiking-label: lbl-hiking
    work-label: lbl-work
    general-label: --
    weights: --
    location: hops-1
    blacklist: user-Eve

Prototype Implementation
- FreePastry Java implementation with support for:
  - DHT (Pastry)
  - P2P storage (Past)
  - Multicast (Scribe)
- Social graph management implemented in Python

Evaluation over PlanetLab
- Goals:
  1. Assess performance under realistic network conditions (peers distributed around the world)
  2. Assess performance at large scale using realistic workloads with large number of users
  3. Assess the effect of socially-aware mapping of users onto trusted peers on system's performance
  4. Validate Prometheus with socially-aware application under real-time constraints (CallCensor)
- Metric: end-to-end response time

Large-Scale Evaluation Setup
- 100 PCs around the globe
  - RTT ~200-300 ms
- 1000 users: synthetic social graph
- Random vs. socially-aware trusted peer assignment
  - 10 & 30 users assigned per peer
- Workloads for:
  - Social sensor inputs based on Facebook study
  - Neighborhood requests based on Twitter study
  - Social strength requests based on BitTorrent study
- Applied a timeout of 15 seconds to fulfill a 1-hop request in PlanetLab

Neighborhood Request Results
[Figure: CDFs of end-to-end response time (msecs) for neighborhood requests, one panel for 10 users/peer and one for 30 users/peer; curves for random and social assignment at 1, 2, and 3 hops]
- Socially-aware assignment of users onto peers results in faster response time
  - Message overhead reduced by an order of magnitude
- Replication for improved availability does not induce high overhead

Social Strength Request Results
[Figure: CDF of average end-to-end response time (msecs) for social strength requests; curves for random and social assignment with 10 and 30 users per peer]
- Performance similar to that of 2-hop neighborhood requests
  - Search all 2-hop paths from source to destination

CallCensor Evaluation Setup
- CallCensor implemented and tested on Nexus Android phone
- 100 users: real social graph
  - Volunteer students from NJIT
- Two social sensors
  - Collocation from Bluetooth: 45 & 90 minutes threshold
  - Friendship from Facebook
- 3 USA PlanetLab peers
- Socially-aware trusted peer assignment

CallCensor Results
- Met real-time performance constraint: response arrives before call is forwarded automatically to voicemail

Summary
- Users of Prometheus:
  - Decide what personal social data are collected by installing/configuring social sensors
  - Cooperate to store and manage their social data in a decentralized fashion
  - Own and control access to their data
- Prometheus enables:
  - Socially-aware applications that utilize social data collected from multiple sources
  - Accurate social world representation through multi-edged, labeled, directed and weighted graph
  - Improved performance through socially-aware P2P system design

Future Work
- Improve Prometheus performance
  - Network optimizations
  - Caching of inference request results
- Develop new social sensors
- Develop new socially-aware applications & services
- Study tolerance to malicious attacks
  - Exposure of social information to intermediate peers during request execution
  - Manipulation of social connections to alter the structure of the social graph

Thank you!
This work was supported by NSF Grants: CNS 0952420, CNS 0831785, CNS 0831753
http://www.cse.usf.edu/dsg/mobius
nkourtel@mail.usf.edu

Why P2P?
- 1st alternative: Free centralized service
  - No incentives or business model for free storage and service of encrypted data
- 2nd alternative: Cloud
  - Cost for transferring and storing data
  - Tradeoff between privacy & inference functionality
- 3rd alternative: Mobile phones
  - Limited energy and computation power
  - Not always online (service unavailability)
  - Not always synchronized, for fast and efficient inference support

Prometheus vs. Facebook?
- Both collect social information of users from multiple sources, but:
  - Facebook is limited to input from Facebook-controlled sources
  - Prometheus accepts input from any user-defined social source (sensor)
- User control of social information
  - Prometheus allows full user control:
    - Storage of data
    - Exposure of data to users, applications & services
  - Facebook allows very limited user control:
    - Exposure of data to users, applications & services*
*Always at odds with its business model

Updating the Social Graph
- Social data for each user stored as append-only file in P2P network
  - Atomic appends using lock file for synchronization
- Trusted peers periodically check for new inputs for a user
  - May have inconsistent data for short time periods
  - Not a major problem: social graphs do not change frequently
- After authentication, new input is merged with the social graph of the relevant user

Social Sensors: Challenges
- Identifying activity tags:
  - Mine text for keywords (emails, SMS, blogs, ...)
  - Reverse geo-coding to find where users are (co)located
  - Predefined labels or dictionaries and ontologies
- Quantifying interactions (assigning weights):
  - Frequency, duration, time in-between interactions
  - Familiar strangers versus active social interactions

Related Work
- SONAR: aggregation of social information only within an enterprise context (emails, IM, etc.) to improve information flow
- RE: 2-hop relationships to automatically populate email white-lists
- Prometheus: can extract social knowledge from larger portions of the graph than the direct or 2-hop neighborhood; social information and requests can cross application boundaries
- Persona: attribute-based encryption of data for sharing between apps while applying fine-grained access policies from users
- PeerSoN: direct data exchange between users' devices
- Prometheus: trusted peers reliably store & exchange social data
- Prometheus: social incentives in trusted peer selection to reduce churn
- Prometheus: fully decentralized on P2P network
- Vis-à-Vis: store data on Virtual Individual Servers in the cloud to deal with churn
- MobiSoc: logically centralized -> "big brother concerns"
- MobiClique: delay-tolerant networking middleware for disseminating social information
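
Illustrative sketch (not from the original slides): the graph model on the "Social Graph Representation" slide and three of the inference functions on the "Prometheus Application Interface" slide can be pictured with a small in-memory Python class. Everything below is an assumption made for illustration: the class name SocialGraph, the report() helper, and in particular the scoring rule in social_strength(), which normalizes each hop by ego's total outgoing weight and halves the score of 2-hop paths. The slides state only that the result is normalized by ego's overall activity and that all 2-hop paths are searched; they do not give the formula. Decentralized storage, encryption, and access control policies are left out.

```python
class SocialGraph:
    """Toy multi-edged, directed, weighted, labeled graph (illustration only)."""

    def __init__(self):
        # ego -> alter -> activity label -> weight
        self.edges = {}

    def report(self, ego, alter, activity, weight):
        """Record one sensor report <ego, alter, activity, weight>."""
        self.edges.setdefault(ego, {}).setdefault(alter, {})[activity] = weight

    def _labels(self, ego, alter):
        """Return the {activity: weight} edges from ego to alter (may be empty)."""
        return self.edges.get(ego, {}).get(alter, {})

    def relation_test(self, ego, alter, activity, w):
        """True if ego has a direct edge to alter with this label and weight >= w."""
        return self._labels(ego, alter).get(activity, 0.0) >= w

    def top_relations(self, ego, activity, k):
        """Up to k alters with the heaviest direct edges for this label."""
        scored = [(labels[activity], alter)
                  for alter, labels in self.edges.get(ego, {}).items()
                  if activity in labels]
        scored.sort(key=lambda pair: -pair[0])
        return [alter for _, alter in scored[:k]]

    def _direct(self, ego, alter):
        """Share of ego's total outgoing weight that goes to alter (assumed rule)."""
        total = sum(w for labels in self.edges.get(ego, {}).values()
                    for w in labels.values())
        if total == 0:
            return 0.0
        return sum(self._labels(ego, alter).values()) / total

    def social_strength(self, ego, alter):
        """Best score over the direct edge and all 2-hop paths (assumed rule)."""
        best = self._direct(ego, alter)
        for mid in self.edges.get(ego, {}):
            if mid in (ego, alter):
                continue
            # A 2-hop path is as strong as its weakest hop, halved for length.
            path = min(self._direct(ego, mid), self._direct(mid, alter)) / 2
            best = max(best, path)
        return best


g = SocialGraph()
g.report("A", "B", "football", 0.25)
g.report("A", "B", "music", 0.25)
g.report("A", "D", "hiking", 0.5)
g.report("B", "C", "music", 0.5)
print(g.relation_test("A", "B", "music", 0.2))
print(g.top_relations("A", "music", 1))
print(g.social_strength("A", "C"))  # A and C are connected only through B
```

Because edges are directed and kept per label, a report from A about B says nothing about B's view of A, which matches the slide's point that directionality limits manipulation of social data.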