The Social Hourglass: Enabling Socially-aware Applications and Services
Adriana Iamnitchi
University of South Florida
anda@cse.usf.edu

Much Social Information Available
• Connects people through relationships
  – Object centric: use of same objects
  – Person centric: declared relationships or co-participation in events, groups, etc.

Mining Social Data
• Spam filtering
• Sybil identification
• Personalized search
• Targeted marketing
• Medical emergency notifications
• …

Current Approach: Vertically Integrated Socially-aware Applications

Challenges with Current Approach
• Application-limited collection and use of social information
  – High bootstrap cost
  – Limited (potentially inaccurate) information, e.g., information from online social networks:
    • Hidden incentives to have many “friends”
    • All relationships equal
    • Symmetric relationships
• Newer proposals to merge different sources of social (and sensor) information for one app
  – Specifically targeting context awareness

Motivating Application: CallCensor

Motivating Application: Sofa Surfer

Motivating Application: Data Placement

Proposal: An Infrastructure for Social Computing
[Figure labels: Sofa Surfer, Roommate Finder, CallCensor, …]

Objective
An infrastructure that:
• Can fuse information from various sources
• Allows users to control their own information
  – What is collected
  – Where it is stored
  – Who can access it
• Provides social knowledge to a variety of applications
  – Social inferences (may be non-trivial)

Outline
• Motivation
• The Social Hourglass architecture
• Social Sensors (work in progress)
• Personal Aggregator (some ideas)
• Social Knowledge Service: Prometheus (Kourtellis et al., Middleware 2010)
  – Data Management
  – API for social inferences
  – Experimental evaluation (on PlanetLab)
• Summary

The Social Hourglass Architecture
[Figure: layered architecture, top to bottom. Applications (Sofa Surfer, Roommate Finder, CallCensor); Social Inference API; social knowledge service.]
[Figure, continued: Social Data Management; Personal Aggregators (A1, A2, A3); Social Sensors (S11, S21, S22, S32, S33, S43); Social Signals.]

Social Sensors
Consume existing social signals:
• Location
• Collocation
• Schedule (e.g., Google Calendar)
• Mobile phone activity (calls, SMS)
• Online social network interactions
• Email
• Personal relations (family)
• Shared content
• Shared interest (e.g., CiteULike)
• …

Social Sensors
• Report on behalf of ego:
  – Alter, the person ego is interacting with
  – An activity tag, e.g., “outdoors”, “dining”
    • Based on content, location, predefined labels, etc.
  – A weight, e.g., 0.15
• Run on ego’s mobile devices, desktop, or on the web
• Process user interactions
  – To reduce noise
  – To distinguish between routine and meaningful interactions

Social Sensors: Challenges
• Identifying activity tags:
  – Mine text for keywords (emails, SMS, blogs, etc.)
  – Reverse geo-coding to find where users are (co)located
  – Predefined labels, or dictionaries and ontologies
• Quantifying interactions (assigning weights):
  – Frequency, duration, time in between interactions
  – Familiar strangers versus active social interaction

Work in Progress: Social Sensor for Gaming Interactions
• Variability in playing habits
• Variability in playing skills
• Time patterns

Aggregators
• Act as the user’s personal assistant
• Run on a trusted device (cell phone)
• Responsible for
  – Managing passwords for various applications
  – Personalization
  – Identity management
[Figure: identity management example. Labels: Carol; Bob's Identity Manager; Alice's Identity Manager; User1; User2; carol@work.com; carol@home.com; @carol_hates_alice.]
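The sensor reports (alter, activity tag, weight) and the aggregator's identity management described earlier can be illustrated with a small sketch. This is only an illustration under stated assumptions: the names `SensorReport`, `Aggregator`, `resolve`, and `submit` are hypothetical, and summing weights per (ego, alter, tag) is one possible aggregation rule, not necessarily the one used in the talk.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorReport:
    """One report sent by a social sensor on behalf of ego (hypothetical type)."""
    ego: str
    alter: str      # identifier of the other person, as seen by this sensor
    tag: str        # activity tag, e.g. "dining"
    weight: float   # interaction weight, e.g. 0.15


class Aggregator:
    """Hypothetical personal aggregator: resolves identities, merges reports."""

    def __init__(self, aliases):
        # aliases maps sensor-specific identifiers to one canonical user name.
        self.aliases = dict(aliases)
        # (ego, canonical alter, tag) -> accumulated weight (assumed rule: sum).
        self.edges = defaultdict(float)

    def resolve(self, identifier):
        # Unknown identifiers are kept as-is.
        return self.aliases.get(identifier, identifier)

    def submit(self, report):
        key = (report.ego, self.resolve(report.alter), report.tag)
        self.edges[key] += report.weight


# Bob's aggregator knows that two addresses belong to the same person.
bob = Aggregator({"carol@work.com": "carol", "carol@home.com": "carol"})
bob.submit(SensorReport("bob", "carol@work.com", "dining", 0.15))
bob.submit(SensorReport("bob", "carol@home.com", "dining", 0.25))
bob.submit(SensorReport("bob", "dave", "outdoors", 0.10))
print(dict(bob.edges))
```

Keeping identity resolution inside the aggregator means each sensor can report whatever identifier it observes, and only the user's trusted device needs to know which identifiers refer to the same person.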
Social Graph

Prometheus
• Peer-to-peer architecture
  – Users contribute resources (peers)
  – Fundamental change from typical peer-to-peer networks: not every user has their own peer
• Input: social information collected from different social sensors (reported via aggregators)
• Output: social information made available to applications and services
  – Information made available subject to user policies

Distributed Social Graph

Prometheus Architecture

Architecture Details
• Each user has a unique user ID
• Each user selects a trusted peer group based on offline social trust with peer owners
• A user’s trusted peers communicate via Scribe
• Only the user’s trusted peers can decrypt the user’s social data and thus perform social inference functions

Social Data Protection
• Two sets of public/private keys
  – User’s
  – User’s trusted peer group
• Social sensors submit data encrypted with the group’s public key and signed with the user’s private key
  – Access to the user’s private key only on the user’s devices
  – Data stored in the Pastry overlay
• Only trusted peers can decrypt and authenticate data

Social Inference Functions
The social graph management service exports an API that implements social inferences

API for Applications: Social Inference Functions
• Five basic social inference functions:
  – relation_test (ego, alter, ɑ, w)
  – top_relations (ego, ɑ, n)
  – neighborhood (ego, ɑ, w, radius)
  – proximity (ego, ɑ, w, radius, distance)
  – social_strength (ego, alter)
• More complex functions can be built

Social Strength
• Quantifies strength between ego and alter
• Result normalized to consider overall activity
• Searches all paths of at most 2 social hops
• One approach to quantify social strength. Others are certainly possible.
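To make the inference functions above more concrete, here is a minimal single-machine sketch over a directed, labeled, weighted graph. Several things are assumptions rather than the Prometheus implementation: ɑ is taken to be an activity label, w a minimum edge weight, and the graph is a plain nested dictionary passed as an extra first argument. The `social_strength` formula (normalize by ego's heaviest tie, halve the weakest link of a 2-hop path, take the best path) is a hypothetical stand-in chosen only to respect the three properties listed on the slide. `proximity` is omitted because it needs location data.

```python
from collections import deque

# graph[ego][alter] = {activity_label: weight}; edges are directed.
graph = {
    "alice": {"bob": {"work": 0.8, "music": 0.2}, "carol": {"work": 0.5}},
    "bob": {"dave": {"work": 0.6}},
    "carol": {"dave": {"music": 0.9}},
}


def relation_test(graph, ego, alter, label, w):
    """True if ego has an edge to alter with this label and weight >= w."""
    return graph.get(ego, {}).get(alter, {}).get(label, 0.0) >= w


def top_relations(graph, ego, label, n):
    """The n alters most strongly connected to ego under this label."""
    ranked = [(labels[label], alter)
              for alter, labels in graph.get(ego, {}).items() if label in labels]
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return [alter for _, alter in ranked[:n]]


def neighborhood(graph, ego, label, w, radius):
    """Users reachable from ego within `radius` hops over qualifying edges."""
    seen = {ego}
    frontier = deque([(ego, 0)])
    while frontier:
        user, hops = frontier.popleft()
        if hops == radius:
            continue
        for alter in graph.get(user, {}):
            if alter not in seen and relation_test(graph, user, alter, label, w):
                seen.add(alter)
                frontier.append((alter, hops + 1))
    seen.discard(ego)
    return seen


def _direct(graph, a, b):
    """Total weight a->b over all labels, normalized by a's heaviest tie."""
    totals = {alter: sum(labels.values()) for alter, labels in graph.get(a, {}).items()}
    top = max(totals.values(), default=0.0)
    return totals.get(b, 0.0) / top if top > 0 else 0.0


def social_strength(graph, ego, alter):
    """Assumed formula: best path of at most 2 hops, 2-hop paths halved."""
    best = _direct(graph, ego, alter)
    for mid in graph.get(ego, {}):
        if mid in (ego, alter):
            continue
        weakest = min(_direct(graph, ego, mid), _direct(graph, mid, alter))
        best = max(best, weakest / 2)
    return best


print(relation_test(graph, "alice", "bob", "work", 0.5))
print(top_relations(graph, "alice", "work", 2))
print(sorted(neighborhood(graph, "alice", "work", 0.5, 2)))
print(social_strength(graph, "alice", "dave"))
```

In this toy graph alice has no direct edge to dave, so any non-zero strength between them comes entirely from the 2-hop paths through bob and carol.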
Lessons from Experiments on PlanetLab
• Social-based mapping of users onto peers leads to significant performance gains:
  – More than 15% of requests finish faster
  – An order of magnitude fewer messages
• Reasonable latency
  – Code significantly improved since publication in Middleware 2010

Experimental Results: Neighborhood Requests
[Figure: two panels, 10 users per peer and 50 users per peer.]
Prometheus: User-Controlled P2P Social Data Management for Socially-Aware Applications, Nicolas Kourtellis, Joshua Finnis, Paul Anderson, Jeremy Blackburn, Cristian Borcea, Adriana Iamnitchi. 11th International Middleware Conference, Bangalore, India, November 2010.

Real Social Traces: NJIT Social Graph
100 randomly selected students from NJIT given Bluetooth-enabled phones that report their collocation
• Data recorded
  – Collocation with two thresholds (45 and 90 minutes)
  – Facebook friendships
• Sparse graph (commuters)

CallCensor
• CallCensor implemented on Android
  – Cell phone is silenced, rings, or vibrates depending on the social context and relationship with caller
  – Relationship with caller:
    • Social strength > threshold: allow call
    • Caller directly connected by work
    • Caller connected by work and ≤ 2 hops away
• Real social data from 100 users stored on 3 nodes from PlanetLab
• Real-time performance constraints

Lessons from CallCensor Experiments

Resilience to (Social) Attacks
• Vulnerability to malicious users mitigated by directed, multi-edged, weighted social graph
• Vulnerability to malicious peers related to social graph distribution
• Peers gain the properties of the social graph they represent

Summary
• The social hourglass architecture
• Prometheus: a decentralized service that enables socially-aware applications and services by collecting, managing and exposing social knowledge, subject to user-specified privacy policies.
• Unique contributions:
  – Social graph representation
  – Aggregated social data
  – Social inference functions
  – Socially-aware design

Much Work to Be Done
• Developing social sensors
• Aggregator:
  – Proof of concept implementation
  – Performance
• Evaluating benefits of social knowledge in system design
• Socially-aware applications
• Query language for social inferences
• Privacy protection

More Information
• The Social Hourglass: an Infrastructure for Socially-aware Applications and Services, Iamnitchi et al., IEEE Internet Computing, May/June 2012
• Prometheus: User-Controlled P2P Social Data Management for Socially-Aware Applications, Kourtellis et al., Middleware 2010
• Vulnerability in Socially-Informed Peer-to-Peer Systems, Jeremy Blackburn, Nicolas Kourtellis, and Adriana Iamnitchi. Fourth Workshop on Social Network Systems (SNS 2011)
http://www.cse.usf.edu/~anda
anda@cse.usf.edu

Acknowledgements
• My team of talented graduate students and alumni:
• US National Science Foundation grants CNS-0831785 and CNS-0952420

Thank you!

Neighborhood Inference
[Figure: number of users returned (0 to 100) versus social hops from source (1 to 6). Series: CL.45, CL.90, FB, CL.45 or FB, CL.90 or FB, CL.45 and FB, CL.90 and FB.]

Social Strength Inference
[Figure: SOCS value (0.0 to 1.0) versus social hops from source (1 to 2). Series: CL.45, CL.90, FB, CL.45 or FB, CL.90 or FB, CL.45 and FB, CL.90 and FB.]

A Distributed System

Or a Distributed System

An Example: Interest Sharing
[Figure: users requesting files such as “Yellow Submarine”, “Les Bonbons”, “No 24 in B minor, BWV 869”, and “Wood Is a Pleasant Thing to Think About”.]
The interest-sharing graph G_m^T(V, E):
• V is the set of users active during interval T
• An edge in E connects users who share at least m file requests within T

Small Worlds
[Figure: average path length ratio (log scale) versus clustering coefficient ratio (log scale). Points: Food web, Power grid, LANL coauthors, Film actors, Web, Internet, Word co-occurrences.]
D. J. Watts and S. H. Strogatz, Collective dynamics of small-world networks. Nature, 393:440-442, 1998
R. Albert and A.-L. Barabási, Statistical mechanics of complex networks, Reviews of Modern Physics 74, 47 (2002).

Web Interest-Sharing Graphs
[Figure: average path length ratio (log scale) versus clustering coefficient ratio (log scale). Series: Web data-sharing graph, Other small-world graphs. Labeled points: 300s, 1 file; 1800s, 10 files; 1800s, 100 files; 3600s, 50 files; 7200s, 50 files.]

DØ Interest-Sharing Graphs
[Figure: same axes. Series: Web data-sharing graph, D0 data-sharing graph, Other small-world graphs. Labeled points: 7 days, 1 file; 28 days, 1 file.]

KaZaA Interest-Sharing Graphs
[Figure: same axes. Series: Web data-sharing graph, D0 data-sharing graph, Kazaa data-sharing graph, Other small-world graphs. Labeled points: 2 hours, 1 file; 4 hours, 2 files; 12 hours, 4 files; 1 day, 2 files; 7 days, 1 file; 28 days, 1 file.]

Proactive Information Dissemination
[Figure: three panels (D0, Web, Kazaa), each plotting “Total hit rate” and “Except largest cluster” on a 0 to 100 scale. Intervals shown: D0: 3, 7, 10, 14, 21, and 28 days; Web: 2, 5, 15, and 30 min; Kazaa: 1, 4, and 8 hours.]
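The interest-sharing graph G_m^T(V, E) used throughout these backup slides can be built from a request trace in a few lines. The sketch below is illustrative only: the function name, the `(timestamp, user, file)` trace format, and the reading of "share at least m file requests" as "at least m distinct files in common" are all assumptions, and the example trace is made up.

```python
from collections import defaultdict
from itertools import combinations


def interest_sharing_graph(requests, t_start, t_end, m):
    """Build (V, E) for the interval [t_start, t_end).

    requests: iterable of (timestamp, user, file) tuples (assumed format).
    V: users with at least one request in the interval.
    E: unordered pairs of users with at least m requested files in common.
    """
    files_by_user = defaultdict(set)
    for timestamp, user, item in requests:
        if t_start <= timestamp < t_end:
            files_by_user[user].add(item)

    vertices = set(files_by_user)
    edges = set()
    for u, v in combinations(sorted(vertices), 2):
        if len(files_by_user[u] & files_by_user[v]) >= m:
            edges.add((u, v))
    return vertices, edges


# Made-up trace; timestamps are in seconds.
trace = [
    (0, "u1", "Yellow Submarine"),
    (5, "u2", "Yellow Submarine"),
    (7, "u2", "Les Bonbons"),
    (9, "u3", "Les Bonbons"),
    (12, "u1", "Les Bonbons"),
    (400, "u4", "Yellow Submarine"),  # falls outside a 300 s interval
]

for m in (1, 2):
    vertices, edges = interest_sharing_graph(trace, 0, 300, m)
    print(m, sorted(vertices), sorted(edges))
```

Raising m or shortening T can only remove edges, which is why the plots label each graph with both an interval length and a file count.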