ACM/IFIP/USENIX 11th International Middleware Conference, 2010
Prometheus: User-Controlled P2P
Social Data Management
for Socially-aware Applications
Nicolas Kourtellis, Joshua Finnis,
Paul Anderson, Jeremy Blackburn,
Cristian Borcea*, Adriana Iamnitchi
Department of Computer Science and Engineering, USF
*Department of Computer Science, NJIT
Social and Socially-aware Applications
Applications may contain user profiles, social networks,
history of social interactions, location, collocation
Problems with Current Social
Information Management
 Application specific:
 Need to input data for each new application
 Cannot benefit from information
aggregation across applications
 Typically, data are owned by applications:
users don't have control over their data
 Hidden incentives to have many "friends":
social information not accurate
Our Solution: Prometheus
 P2P social data management service:
 Receives data from social sensors that collect
application-specific social information
 Represents social data as decentralized social graph
 Exposes API to share social information with
applications according to user access control policies
[Figure: social sensors feed Prometheus, which serves socially-aware apps; services shown: Loopt, CallCensor, Foursquare]
Outline
 Motivation
 Social Graph Management
 API and Access Control
 Prototype Implementation
 Evaluation over PlanetLab
 Summary
 Future Work
How is the Social Graph Populated?
 Social sensors report edge information to
Prometheus:
<ego, alter, activity, weight>
 Applications installed by user on personal devices
 Aggregate & analyze history of user's interactions with
other users
 Two types of social ties:
 Object-centric: use of similar resources
 Examples: tagging communities on Delicious,
repeatedly being part of the same BitTorrent swarms
 People-centric: pair-wise or group relationships
 Examples: friends on Facebook, same company name
on LinkedIn, collocation from mobile phones
Social Graph Representation
 Multi-edged, directed, weighted, labeled graph
 Each edge → a reported social activity
 Weight → interaction intensity
 Directionality reflects reality
 Allows for fine-grained privacy
 Prevents social data manipulation
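A minimal Python sketch of the representation above, assuming an in-memory adjacency map. The names `Edge`, `SocialGraph`, `report` and `edges`, and the example users and weights, are hypothetical and not taken from the Prometheus prototype.

```python
from collections import defaultdict
from typing import NamedTuple


class Edge(NamedTuple):
    """One sensor report: <ego, alter, activity, weight>."""
    ego: str
    alter: str
    activity: str   # edge label, e.g. "music"
    weight: float   # interaction intensity


class SocialGraph:
    """Multi-edged, directed, weighted, labeled graph (illustrative only)."""

    def __init__(self):
        # ego -> alter -> {activity: weight}; one pair may carry several labels
        self._out = defaultdict(lambda: defaultdict(dict))

    def report(self, edge: Edge) -> None:
        # Assumption: a newer report for the same (ego, alter, activity)
        # overwrites the stored weight.
        self._out[edge.ego][edge.alter][edge.activity] = edge.weight

    def edges(self, ego: str, alter: str) -> dict:
        # Directed lookup: edges(A, B) is independent of edges(B, A).
        return dict(self._out.get(ego, {}).get(alter, {}))


graph = SocialGraph()
graph.report(Edge("A", "B", "music", 0.15))
graph.report(Edge("A", "B", "football", 0.3))
graph.report(Edge("B", "A", "music", 0.2))
print(graph.edges("A", "B"))
print(graph.edges("B", "A"))
```

Because each user's outgoing edges are kept separately, A's reports about B never change what B has reported about A.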
Decentralized Graph Storage
 Each user has a set of trusted peers in the P2P network
 Peers it owns & peers owned by trusted users
 Each user’s sub-graph stored on all its trusted peers
 Improved availability in the face of P2P churn
 P2P multicast used to synchronize information among
trusted peers
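The trusted-peer rule above can be sketched in a few lines of Python. This is only an illustration: the `owns` and `trusts` maps, the function names, and the in-memory `peer_store` are assumptions, and the multicast synchronization is reduced to a plain loop over the trusted peers.

```python
def trusted_peers(user, owns, trusts):
    """Peers the user owns plus peers owned by the users it trusts."""
    peers = set(owns.get(user, ()))
    for friend in trusts.get(user, ()):
        peers |= set(owns.get(friend, ()))
    return peers


def replicate(user, subgraph, owns, trusts, peer_store):
    """Store the user's sub-graph on every trusted peer (stand-in for multicast)."""
    for peer in trusted_peers(user, owns, trusts):
        peer_store.setdefault(peer, {})[user] = subgraph


# Hypothetical ownership and trust relations.
owns = {"A": {1}, "B": {2}, "C": {3}}
trusts = {"A": {"B"}, "B": {"A", "C"}, "C": set()}

peer_store = {}
replicate("A", {"B": {"music": 0.2}}, owns, trusts, peer_store)
print(sorted(peer_store))  # peers that now hold A's sub-graph
```

If peer 1 leaves the network, A's sub-graph is still available on peer 2, which is the availability argument made above.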
[Figure: example social graph for users A-F with labeled, weighted edges (labels: music, football, hiking; e.g. <music,0.15>, <football,0.3>), shown as stored on Peer 1 and across all peers, plus a table listing for each User ID the peer it owns and the peers it trusts]
Encrypted P2P Storage
 Sensor data stored encrypted in P2P network
 Improves availability and protects privacy
 Sensors encrypt data with trusted group public key &
sign with user private key
 Trusted peers retrieve user data, decrypt it, & create
social graph
[Figure: key pairs used: user public/private key and group public/private key]
Prometheus Application Interface
 Five social inference functions:
 Boolean relation_test (ego, alter, ɑ, w)
 User-List top_relations (ego, ɑ, k)
 User-List neighborhood (ego, ɑ, w, radius)
 User-List proximity (ego, ɑ, w, radius, distance)
 Double social_strength (ego, alter)
 Ego & alter don’t have to be directly connected
 Normalized result: consider ego’s overall activity
 Search all 2-hop paths
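A hedged Python sketch of four of the inference functions over a local copy of the graph (`graph[ego][alter][label] = weight`). The slides do not give the formulas, so several choices here are assumptions: ɑ is treated as the activity label, the direct strength is the summed weight toward `alter` divided by ego's largest summed weight toward any neighbor, a 2-hop path scores the minimum of its two hops, and the best path wins. `proximity` is omitted because it needs location data, and the example graph is hypothetical.

```python
def relation_test(graph, ego, alter, label, w):
    """True if ego has a direct edge to alter with this label and weight >= w."""
    weight = graph.get(ego, {}).get(alter, {}).get(label)
    return weight is not None and weight >= w


def top_relations(graph, ego, label, k):
    """The k alters most strongly tied to ego under this label."""
    scored = [(labels[label], alter)
              for alter, labels in graph.get(ego, {}).items() if label in labels]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [alter for _, alter in scored[:k]]


def neighborhood(graph, ego, label, w, radius):
    """Users reachable within `radius` hops over edges passing relation_test."""
    seen, frontier = {ego}, [ego]
    for _ in range(radius):
        nxt = []
        for user in frontier:
            for alter in graph.get(user, {}):
                if alter not in seen and relation_test(graph, user, alter, label, w):
                    seen.add(alter)
                    nxt.append(alter)
        frontier = nxt
    return sorted(seen - {ego})


def _direct(graph, ego, alter):
    # Assumed normalization: relative to ego's strongest overall tie.
    totals = {a: sum(lbls.values()) for a, lbls in graph.get(ego, {}).items()}
    top = max(totals.values(), default=0.0)
    return totals.get(alter, 0.0) / top if top > 0 else 0.0


def social_strength(graph, ego, alter):
    """Best score over the direct edge and all 2-hop paths (assumed scoring)."""
    best = _direct(graph, ego, alter)
    for mid in graph.get(ego, {}):
        if mid in (ego, alter):
            continue
        best = max(best, min(_direct(graph, ego, mid), _direct(graph, mid, alter)))
    return best


graph = {
    "A": {"B": {"music": 0.2, "football": 0.2}, "C": {"music": 0.1}},
    "B": {"D": {"music": 0.3}},
    "C": {"D": {"music": 0.3}},
}
print(neighborhood(graph, "A", "music", 0.1, 2))
print(social_strength(graph, "A", "D"))
```

In this example A and D are not directly connected, yet `social_strength` still returns a non-zero value through the 2-hop path via B.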
Application Example: CallCensor
 Socially-aware incoming call filtering
 Ring/vibrate/silence phone based on current social
context and relationship with caller
 Invokes
 proximity() to determine current social context
 social_strength() to determine relationship with caller
Request Execution: social_strength()
[Figure: request forwarded across 1st-hop and 2nd-hop peers]
1. Application sends request to a peer
2. Peer forwards request to trusted peer
3. Trusted peer enforces ACPs
4. Trusted peer sends secondary requests
5. Trusted peers enforce ACPs & reply
6. Primary peer combines results
7. Primary peer replies to application through contacted peer with final result
Access Control Policies
 User specifies ACPs upon registration
 ACPs stored on user’s trusted peer group
 Update them at any time
 Changes propagated through multicast mechanism
 Applied for each inference request
 Control relations, labels, weights & locations
Example: Alice's ACPs
relations: hops-2
hiking-label: lbl-hiking
work-label: lbl-work
general-label: --
weights: --
location: hops-1
blacklist: user-Eve
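One possible reading of the example policy, sketched in Python. The slides do not define the ACP semantics, so everything here is an assumption: `hops-2` is taken as a maximum social distance for the requester, a label rule is taken to mean the requester must share that label with the owner, labels without a rule are treated as unrestricted, and the names `Policy` and `allowed` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    max_hops: int                                     # e.g. "relations: hops-2"
    label_rules: dict = field(default_factory=dict)   # edge label -> required shared label
    blacklist: set = field(default_factory=set)       # users always denied


def allowed(policy, requester, hops, shared_labels, requested_label):
    """Decide whether one inference request may read edges with `requested_label`."""
    if requester in policy.blacklist:
        return False
    if hops > policy.max_hops:
        return False
    required = policy.label_rules.get(requested_label)
    # Assumption: labels without an explicit rule are not restricted further.
    return required is None or required in shared_labels


alice = Policy(
    max_hops=2,
    label_rules={"hiking": "lbl-hiking", "work": "lbl-work"},
    blacklist={"Eve"},
)
print(allowed(alice, "Bob", 1, {"lbl-hiking"}, "hiking"))
print(allowed(alice, "Eve", 1, {"lbl-hiking"}, "hiking"))
```

A trusted peer would run such a check for each inference request before touching Alice's sub-graph.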
Prototype Implementation
 FreePastry Java implementation with support for
 DHT (Pastry)
 P2P storage (Past)
 Multicast (Scribe)
 Social graph management implemented in Python
Evaluation over PlanetLab
 Goals:
1. Assess performance under realistic network
conditions (peers distributed around the world)
2. Assess performance at large scale using realistic
workloads with large number of users
3. Assess the effect of socially-aware mapping of
users onto trusted peers on system’s performance
4. Validate Prometheus with a socially-aware
application under real-time constraints (CallCensor)
 Metric: end-to-end response time
Large-Scale Evaluation Setup
 100 PCs around the globe
 RTT ≈ 200-300 ms
 1000 users: synthetic social graph
 Random vs. socially-aware trusted peer assignment
 10 & 30 users assigned per peer
 Workloads for:
 Social sensor inputs based on Facebook study
 Neighborhood requests based on Twitter study
 Social strength requests based on BitTorrent study
 Applied a timeout of 15 seconds to fulfill a 1-hop
request in PlanetLab
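To illustrate why the user-to-peer mapping matters, the Python sketch below counts social edges that cross peers under both mappings. It is not the mapping algorithm used in the evaluation, which the slides do not describe: the graph is a toy one with four known communities of five users each, and all names are hypothetical.

```python
import random


def cross_peer_edges(edges, peer_of):
    """Number of social edges whose endpoints live on different peers."""
    return sum(1 for u, v in edges if peer_of[u] != peer_of[v])


# Toy graph: 4 communities of 5 users, fully connected inside,
# plus one edge between each pair of consecutive communities.
groups = [[f"u{g}_{i}" for i in range(5)] for g in range(4)]
edges = [(a, b) for grp in groups for i, a in enumerate(grp) for b in grp[i + 1:]]
edges += [(groups[g][0], groups[g + 1][0]) for g in range(3)]

# Socially-aware mapping: every community shares one peer.
social = {u: g for g, grp in enumerate(groups) for u in grp}

# Random mapping with the same load (5 users per peer).
users = [u for grp in groups for u in grp]
random.Random(0).shuffle(users)
rand = {u: i // 5 for i, u in enumerate(users)}

print("socially-aware:", cross_peer_edges(edges, social))
print("random:        ", cross_peer_edges(edges, rand))
```

Every cross-peer edge is a potential extra message during a multi-hop request, so fewer of them suggests fewer secondary requests between peers.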
Neighborhood Request Results
[Figure: CDF of end-to-end response time (msecs, 0 to 30,000) for neighborhood requests, in two panels (10 users/peer and 30 users/peer); curves: random and social assignment at 1, 2 and 3 hops]
 Socially-aware assignment of users onto peers results in faster
response time
 Message overhead reduced by an order of magnitude
 Replication for improved availability does not induce high overhead
Social Strength Request Results
[Figure: CDF of average end-to-end response time (msecs, 0 to 20,000) for social strength requests; curves: random and social assignment with 10 and 30 users per peer]
 Performance similar to that of 2-hop neighborhood requests
 Search all 2-hop paths from source to destination
CallCensor Evaluation Setup
 CallCensor implemented and
tested on Nexus Android phone
 100 users: real social graph
 Volunteer students from NJIT
 Two social sensors
 Collocation from Bluetooth
 45 & 90 minutes threshold
 Friendship from Facebook
 3 USA PlanetLab peers
 Socially-aware trusted peer
assignment
CallCensor Results
 Met real-time performance constraint: response arrives before
the call is automatically forwarded to voicemail
Summary
 Users of Prometheus:
 Decide what personal social data are collected by
installing/configuring social sensors
 Cooperate to store and manage their social data in
a decentralized fashion
 Own and control access to their data
 Prometheus enables:
 Socially-aware applications that utilize social data
collected from multiple sources
 Accurate social world representation through a multi-edged, labeled, directed and weighted graph
 Improved performance through socially-aware P2P
system design
Future Work
 Improve Prometheus performance
 Network optimizations
 Caching of inference request results
 Develop new social sensors
 Develop new socially-aware applications &
services
 Study tolerance to malicious attacks
 Exposure of social information to
intermediate peers during request execution
 Manipulation of social connections to alter
the structure of the social graph
Thank you!
This work was supported by NSF Grants:
CNS 0952420, CNS 0831785, CNS 0831753
http://www.cse.usf.edu/dsg/mobius
nkourtel@mail.usf.edu
Why P2P?
 1st alternative: Free Centralized Service
 No incentives or business model for free storage
and service of encrypted data
 2nd alternative: Cloud
 Cost for transferring and storing data
 Tradeoff between privacy & inference functionality
 3rd alternative: Mobile phones
 Limited energy and computation power
 Not always online (service unavailability)
 Not always synchronized, which hinders fast and
efficient inference support
Prometheus vs. Facebook?
 Both collect social information of users from multiple
sources but:
 Facebook is limited to input from Facebook-controlled
sources
 Prometheus accepts input from any user-defined social
source (sensor)
 User-control of social information
 Prometheus allows full user-control:
 Storage of data
 Exposure of data to users, applications & services
 Facebook allows very limited user-control:
 Exposure of data to users, applications & services*
 Always at odds with its business model
Updating the Social Graph
 Social data for each user stored as append-only
file in P2P network
 Atomic appends using lock file for
synchronization
 Trusted peers periodically check for new inputs
for a user
 May have inconsistent data for short time periods
 Not a major problem: social graphs do not change
frequently
 After authentication, new input is merged with
the social graph of the relevant user
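The append-and-poll cycle above can be sketched in Python, with a local file standing in for the P2P store. This is an assumption-laden illustration: the function names and record format are hypothetical, the lock is an exclusively created lock file on the local filesystem, and the authentication step is left out.

```python
import os
import tempfile
import time


def atomic_append(path, record, retries=50, delay=0.01):
    """Append one record while holding an exclusive lock file."""
    lock = path + ".lock"
    for _ in range(retries):
        try:
            # O_CREAT | O_EXCL fails if another writer already holds the lock.
            fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            time.sleep(delay)
            continue
        try:
            with open(path, "a", encoding="utf-8") as log:
                log.write(record + "\n")
            return True
        finally:
            os.close(fd)
            os.remove(lock)   # release the lock for the next writer
    return False


def read_new(path, seen):
    """Return records appended after the first `seen` lines, and the new count."""
    with open(path, encoding="utf-8") as log:
        records = log.read().splitlines()
    return records[seen:], len(records)


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "alice.log")
    atomic_append(log_path, "alice,bob,music,0.2")
    first, seen = read_new(log_path, 0)        # a trusted peer's first poll
    atomic_append(log_path, "alice,carol,hiking,0.3")
    second, seen = read_new(log_path, seen)    # a later poll sees only the new input

print(first, second)
```

Between two polls a trusted peer simply works with a slightly stale graph, which matches the short-lived inconsistency noted above.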
Social Sensors: Challenges
 Identifying activity tags:
 Mine text for keywords (emails, SMS, blogs, ...)
 Reverse geo-coding to find where (co)located
 Predefined labels or dictionaries and ontologies
 Quantifying interactions (assigning weights):
 Frequency, duration, time in-between interactions
 Familiar strangers versus active social interactions
Related Work
 SONAR: aggregation of social information only within an enterprise
context (emails, IM, etc.) to improve information flow
 RE: 2-hop relationships to automatically populate email white-lists
 Prometheus: can extract social knowledge from larger portions of the
graph than the direct or 2-hop neighborhood
 Social information and requests can cross application boundaries
 Persona: attribute-based encryption of data for sharing between apps
while applying fine-grained access policies from users
 PeerSoN: direct data exchange between users' devices
 Prometheus: trusted peers reliably store & exchange social data
 Prometheus: social incentives in trusted peer selection to reduce churn
 Prometheus: fully decentralized on P2P network
 Vis-à-Vis: store data on Virtual Independent Servers on the cloud to
deal with churn
 MobiSoc: logically centralized -> "big brother concerns"
 MobiClique: delay-tolerant networking middleware for disseminating
social information