Defending Against Sybil Attacks via Social Networks

Haifeng Yu
School of Computing
National University of Singapore

Acknowledgments
- Talk based on three papers:
  - [SIGCOMM’06, ToN’08] (SybilGuard)
  - [IEEE S&P’08] (SybilLimit)
  - Available on my homepage (google my name)
- Co-authors: Phillip B. Gibbons, Michael Kaminsky, Feng Xiao, Abie Flaxman

Background: Sybil Attack
- Sybil attack: a single user pretends to be many fake (sybil) identities
  - I.e., creates multiple accounts
- Already observed in real-world p2p systems
- Sybil identities can become a large fraction of all identities
[Figure: honest and malicious users; a malicious user launches a sybil attack]

Background: Sybil Attack
- Enables malicious users to easily “out-vote” honest users
  - Byzantine consensus: exceed the 1/3 threshold
  - Majority voting: cast more than one vote
  - DHT: control a large portion of the ring
  - Recommendation systems: manipulate the recommendations

Background: Defending Against Sybil Attack
- Using a trusted central authority to tie identities to human beings is not always desirable
- Much harder without a trusted central authority [Douceur’02]
  - Resource challenges are not sufficient
  - An IP address-based approach is not sufficient
- Widely considered real and challenging:
  - Over 40 papers acknowledge the sybil attack problem without offering a distributed solution

SybilGuard / SybilLimit Basic Insight: Leveraging Social Networks
- SybilGuard / SybilLimit are the first to use social networks for thwarting sybil attacks with provable guarantees
- Nodes = identities
- Undirected edges = strong mutual trust
  - E.g., colleagues, relatives in the real world
  - Not online friends!
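The out-voting thresholds from the background slides can be made concrete with a little arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes the usual conditions that Byzantine consensus needs strictly fewer than one third of all identities to be faulty, and that majority voting needs honest identities to strictly outnumber sybil ones.

```python
def byzantine_safe(n_honest: int, n_sybil: int) -> bool:
    """True while sybil identities are strictly fewer than 1/3 of all identities."""
    return 3 * n_sybil < n_honest + n_sybil


def min_sybils_to_break_byzantine(n_honest: int) -> int:
    """Smallest s with 3*s >= n_honest + s, i.e. s >= n_honest / 2 (rounded up)."""
    return (n_honest + 1) // 2


def min_sybils_to_break_majority(n_honest: int) -> int:
    """Smallest s at which honest identities no longer hold a strict majority."""
    return n_honest


# With 1000 honest users, the thresholds are n/2 and n sybil identities.
byz = min_sybils_to_break_byzantine(1000)
maj = min_sybils_to_break_majority(1000)
```

Under these assumptions, an adversary who can create identities for free needs only about half as many sybil identities as there are honest users to defeat Byzantine consensus.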

SybilGuard / SybilLimit Basic Insight
- n honest users: one identity/node each
- Malicious users: multiple identities each (sybil nodes)
  - Sybil nodes may collude: together they form the adversary
- Observation: the adversary cannot create extra edges between honest nodes and sybil nodes
[Figure: honest nodes and sybil nodes (controlled by malicious users), connected by attack edges]

SybilGuard / SybilLimit Basic Insight
- A disproportionately small cut disconnects a large number of identities
- But we cannot search for it by brute force…
[Figure: honest nodes and sybil nodes separated by the attack edges]

SybilGuard / SybilLimit End Guarantees
- Completely decentralized
- Enables any given verifier node to decide whether to accept any given suspect node
  - Accept: provide service to / receive service from
  - Ideally: accept all and only honest nodes (unfortunately not possible)
- SybilGuard / SybilLimit provably
  - Bound the # of accepted sybil nodes (w.h.p.)
  - Accept all honest nodes except a small fraction (w.h.p.)

Example Application Scenarios

  If # of sybil nodes accepted    Then applications can do
  < n                             majority voting
  < n/2                           byzantine consensus
  < n/c for some constant c       secure DHT [Awerbuch’06, Castro’02, Fiat’05]
  …                               …

SybilGuard vs. SybilLimit

# sybil nodes accepted per attack edge (smaller is better), by total number of attack edges g:

                             g = o(√n / log n)      g between Ω(√n / log n) and o(n / log n)
  SybilGuard [SIGCOMM’06]    O(√n log n), ~2000     unbounded
  SybilLimit [Oakland’08]    O(log n), ~10          O(log n), ~10

- We also prove that SybilLimit is at most an O(log n) factor away from optimal

Outline
- Motivation, basic insight, and end guarantees
- SybilLimit design
  - Will focus on intuition
- Evaluation results on real-world social networks

Cryptographic Keys
- Each edge in the social network corresponds to a symmetric edge key
  - Established out of band
- Each node (honest or sybil) has a locally generated public/private key pair
  - “Identity”: “V accepts S” means V accepts S’s public key K_S
- When running SybilLimit, every suspect S is allowed to “register” K_S on some other nodes

SybilLimit: Strawman Design, Step 1
- Ensure that sybil nodes (collectively) register on only a limited number of honest nodes
- Still provide enough “registration opportunities” for honest nodes
[Figure: honest region and sybil region; nodes are marked with registered keys K of sybil nodes and registered keys K of honest nodes]

SybilLimit: Strawman Design, Step 2
[Figure: honest region and sybil region; nodes are marked with registered keys K of sybil nodes and registered keys K of honest nodes]
- Accept S iff K_S is registered on sufficiently many honest nodes
- Without knowing where the honest region is!
- Circular design?
- We can break this circle…

Three Interrelated Key Techniques
- Technique 1: Use the tails of random routes for registration
  - Will achieve Step 1
  - SybilGuard novelty: random routes
  - SybilLimit novelty: the use of tails
  - SybilLimit novelty: the use of multiple independent instances of shorter random routes

Three Interrelated Key Techniques
- Technique 2: Use the intersection condition and the balance condition to verify suspects
  - Will break the circular design and achieve Step 2
  - SybilGuard novelty: intersection on nodes
  - SybilLimit novelty: intersection on edges
  - SybilLimit novelty: balance condition
- Technique 3: Use a benchmarking technique to estimate unknown parameters
  - Breaks another seemingly circular design…
  - SybilLimit novelty: benchmarking technique

Random Route: Convergence
[Figure: a node with incident edges a to f and its randomized routing table, which maps each incoming edge to an outgoing edge]
- Random one-to-one mapping between incoming edge and outgoing edge
- Using the routing table gives the Convergence Property: routes merge if they cross the same edge

Securely Registering Public Keys
[Figure: A’s random route A → B → C → D carries K_A hop by hop (i=1, i=2, i=3); edge “CD” is the tail of A’s random route, and K_A is recorded under the name “CD”]
- To register K_A, A initiates a random route (assuming w = 3)
- All random routes in SybilLimit are of length w
  - All nodes know w
- Nodes communicate via authenticated channels

Tails of Sybil Suspects
- Imagine that every sybil suspect initiates a random route from itself
[Figure: sybil nodes and honest nodes; in total 1 tainted tail]

Counting the Number of Tainted Tails
[Figure: honest nodes and sybil nodes joined by an attack edge]
- Claim: There are at most w tainted tails per attack edge
- Proof: By the Convergence property
- Holds regardless of whether sybil nodes follow the protocol

Back to the Strawman Design: Step 1
[Figure: honest region; nodes are marked with registered keys K of sybil nodes and registered keys K of honest nodes]
- # of registered keys K of sybil nodes: ≤ gw
  - Independent of the # of sybil nodes
- # of registered keys K of honest nodes: ≥ n − gw
  - From the “backtrace-ability” property of random routes
  - See paper…
- Step 1 achieved!

Independent Instances
- SybilLimit uses √m independent instances of the registration protocol
  - m: # of edges in the honest region
- Number of registered keys K of honest nodes: ≥ (n − gw)·√m
- Number of registered keys K of sybil nodes: ≤ gw·√m
- Goal: Accept S iff K_S is registered on √m tails in the honest region
  - Sybil suspects accepted: ≤ gw
  - Honest suspects accepted: ≥ n − gw

Three Techniques
- Technique 1: Use novel random routes to register public keys
  - Will achieve Step 1
- Technique 2: Use the intersection condition and the balance condition to verify suspects
  - Challenge: SybilLimit does not know which region is the honest region
- Technique 3: Use a benchmarking technique to estimate unknown parameters

The Intersection Condition
- Verifier V obtains √m tails by doing √m random routes of length w
  - Using different instances (see paper…)
  - Some tails are in the sybil region (ignore for now…)
- S satisfies the intersection condition if:
  - S’s and V’s tails intersect
  - S’s public key is registered with the intersecting tail

Intersection Condition: Verification Procedure
[Figure: message exchange among verifier V, suspect S, and node F; edges are labeled AB, CD, EF]
1. Request S’s set of tails
2. I have three tails: AB, CD, EF
3. Common tail: EF
4. Is K_S registered?
5. Yes.
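
The registration and verification steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not SybilLimit itself: all function names are hypothetical, a single global `registry` dictionary stands in for the per-node tables that tail end nodes would keep, there is only one protocol instance, and edge keys, signatures, and authenticated channels are left out.

```python
import random


def make_routing_tables(graph, rng):
    """Give every node a random one-to-one map: incoming neighbor -> outgoing neighbor."""
    tables = {}
    for node, neighbors in graph.items():
        outgoing = list(neighbors)
        rng.shuffle(outgoing)
        tables[node] = dict(zip(neighbors, outgoing))
    return tables


def random_route_tail(tables, start, first_hop, w):
    """Follow a random route of w edges and return its last directed edge (the tail)."""
    prev, cur = start, first_hop            # edge 1: start -> first_hop
    for _ in range(w - 1):                  # edges 2..w follow the routing tables
        prev, cur = cur, tables[cur][prev]
    return (prev, cur)


def register(registry, tail, public_key):
    """Record public_key under the tail's name (simplification: one shared registry)."""
    registry.setdefault(tail, set()).add(public_key)


def satisfies_intersection(verifier_tails, claimed_tails, suspect_key, registry):
    """Steps 3 to 5: find a common tail, then check the suspect's key is registered there."""
    common = set(verifier_tails) & set(claimed_tails)
    return any(suspect_key in registry.get(tail, set()) for tail in common)


# Hand-written routing tables that reproduce the slide's route A -> B -> C -> D (w = 3).
tables = {
    "A": {"B": "B"},
    "B": {"A": "C", "C": "A"},
    "C": {"B": "D", "D": "B"},
    "D": {"C": "C"},
}
registry = {}
tail = random_route_tail(tables, "A", "B", w=3)    # ("C", "D"), i.e. edge "CD"
register(registry, tail, "K_A")

verifier_tails = [("C", "D"), ("E", "F")]           # hypothetical tails held by V
honest = satisfies_intersection(verifier_tails, [tail], "K_A", registry)
forged = satisfies_intersection(verifier_tails, [("E", "F")], "K_X", registry)
```

A suspect can claim any tails it likes, which is why the check consults the registry for the common tail instead of trusting the claimed list: here `honest` is `True`, while the unregistered key `K_X` fails even though the claimed tail matches one of V’s.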
Result: S satisfies the intersection condition (4 messages involved)

Leveraging Known Random Walk Theory
- (Approximate) Theorem: If w is roughly the mixing time of the social network, then all tails (V’s and S’s) are roughly uniformly random edges
- If social networks have O(log n) mixing time, then w = O(log n)

Leveraging a Sharp Distribution
- Assuming V has √m tails in the honest region
- Helps to bound the # of sybil nodes accepted
- This is why SybilLimit does edge intersection…
[Figure: intersection probability p versus the # of S’s tails in the honest region (x-axis in multiples of √m); by the birthday paradox, p switches sharply between near 0 and near 1]

Back to the Strawman Design: Step 2
[Figure: honest region and sybil region; nodes are marked with registered keys K of sybil nodes and registered keys K of honest nodes]
- Accept S iff K_S is registered on sufficiently many honest nodes
- “Sufficiently many” = √m
- Intersection occurs iff S has √m tails in the honest region

Omitted Challenges…
- Some of V’s tails are in the sybil region
  - We do not know which tails are in the sybil region
  - Balance condition: the hardest part to prove in SybilLimit…
- The adversary has many strategies to allocate the tainted tails…
- Tainted tails are not uniformly random…
- See paper for details…

Three Interrelated Key Techniques
- Technique 1: Random routes
- Technique 2: Intersection condition and balance condition
- Technique 3: Novel and counter-intuitive benchmarking technique
  - Avoids another seemingly circular design…
  - See paper…
- Claims on near-optimality: see paper…

Performance Aspects
- Random routes are performed only once
  - Re-do only when the social network changes (infrequently)
  - Can be done incrementally
- Doing random routes is not time-critical
  - Only delays a new suspect being accepted
  - Churn is a non-problem…
- Verification involves O(1) messages
  - See paper…

Outline
- Motivation, basic insight, and end guarantees
- SybilLimit design
- Evaluation results on real-world social networks

Validation on Real-World Social Networks
- SybilGuard / SybilLimit assumption: honest nodes are not behind disproportionately small cuts
  - Rigorously: social networks (without sybil nodes) have small mixing time
  - Mixing time affects the # of sybil nodes accepted
- Synthetic social networks: proof in [SIGCOMM’06]
- Real-world social networks?
  - Social communities, social groups, …

Simulation Setup
- Crawled online social networks used in experiments:

                 # nodes    # edges
  Friendster     0.9M       7.8M
  LiveJournal    0.9M       8.7M
  DBLP           0.1M       0.6M

- We experiment with:
  - Different numbers and placements of attack edges
  - Different graph sizes, from full size down to 100-node sub-graphs
- Sybil attackers use the optimal strategy

Brief Summary of Simulation Results
- In all cases we experimented with:
  - The average honest verifier accepts ~95% of all honest suspects
  - The average honest suspect is accepted by ~95% of all honest verifiers
  - # sybil nodes accepted:
    - ~10 per attack edge for Friendster and LiveJournal
    - ~15 per attack edge for DBLP

Other Social Networks?
- Other social networks are likely to have small mixing time too (DBLP as a worst case)
- What if the mixing time is large?
  - Graceful degradation of SybilLimit’s guarantees
  - Accept more sybil nodes

Conclusions
- Sybil attack: widely considered a real and challenging problem
- SybilLimit: fully decentralized defense protocol based on social networks
  - Provable near-optimal guarantees
  - Experimental validation on real-world social networks
- Future work: implement SybilLimit with real apps

Post Doc Opening
- NUS: ranked 31st globally by Newsweek
  - E.g., we have 11 SIGMOD papers in 2008
- I have a post doc opening in distributed systems and distributed algorithms
  - Minimum 1 year, renewable up to multiple years
  - 2 years of funding already committed
- Main job duty: publish in top venues
  - Helps you build up a track record for your career after the post doc
- Salary: comparable to (if not better than) US post docs
  - Singapore living cost and tax are lower than in the US
- Contact me to inquire or apply (google my name)