A Blueprint for Constructing Peer-to-Peer Systems Robust to Dynamic Worst-Case Joins and Leaves

Fabian Kuhn, Microsoft Research, Silicon Valley
Stefan Schmid, ETH Zurich
Joest Smit, ETH Zurich
Roger Wattenhofer, ETH Zurich

14th IEEE Int. Workshop on Quality of Service (IWQoS)
Yale University, New Haven, CT, USA, June 2006

Brief Intro to Peer-to-Peer Computing (1)
P2P computing = power by accumulating distributed resources (CPU cycles, disk space, …)
• Peer-to-Peer: decentralized (“all machines”), scalable, efficient, …
vs.
• Client / Server: centralized (“one machine”), bottleneck, single point of failure, …

Stefan Schmid, ETH Zurich @ IWQoS 2006

Brief Intro to Peer-to-Peer Computing (2)
• Examples:
  - computing power (Folding@Home, …)
  - file sharing (eMule, Kangoo, …)
  - internet telephony (Skype, …)
  - media streaming (Swistry, …)
(figures: distributed computations, file sharing, live media streaming with Swistry)

Churn (1)
• But: unlike servers, peers are transient!
  – Machines are under the control of individual users
  – E.g., just connecting to download one file
  – Membership changes are called churn
• Successful P2P systems have to cope with churn (i.e., guarantee correctness, efficiency, etc.)!

Churn (2)
• Dynamic resources: A challenge in P2P computing!
• Churn characteristics:
  – Depends on the application (Skype vs. eMule vs. …)
  – But: There may be dozens of membership changes per second!
  – Peers may crash without notice!
• How can peers collaborate in spite of churn?

Churn (3)
• Churn is important, as it threatens the “advantages of P2P computing”!
(figure: a lot of churn)
• We have to actively maintain P2P systems!

Our Paper…
• Unfortunately, only a few P2P systems have been analyzed under churn!
• Our paper presents techniques to:
  - build and provably maintain P2P systems with desirable properties (peer degree, network diameter, …)
  - in spite of ongoing worst-case membership changes (an “adversary” non-stop attacks the weakest part; the system is “never fully repaired, but always fully functional”).

Talk Outline
• A model for dynamics
• Overview of techniques
• Example: A robust system with degree and diameter O(log n / loglog n)
• Conclusion

Model for Dynamics
• Churn = possibly concurrent membership changes, at any time!
• We assume a worst-case perspective: Adversary ADV(J,L)
  – i.e., joins and leaves may take place at the weakest spot of the network
• Synchronous model: time is divided into rounds (e.g., max round trip time)
• ADV(J,L): In each round, at most J peers may join and at most L peers may leave (crash).

Topology Maintenance
• An efficient P2P topology under churn:
(figure: peers π1 and π2)
• Almost impossible to maintain the hypercube!
  – How does peer 1 know that it should replace peer 2?
  – How does it get there when there are concurrent joins and leaves?
  – …
• Is there a more robust topology, but with the same small degree and diameter?

Our Approach
Simple idea: Simulate the topology!

General Recipe for Robust Topologies
1. Take a graph with desirable properties
   - Low diameter, low peer degree, etc.
2. Replace vertices by a set of peers
3. Maintain it:
   a. Permanently run a peer distribution algorithm which ensures that all vertices have roughly the same number of peers (“token distribution algorithm”).
   b. Estimate the total number of peers in the system and change the “dimension of the topology” accordingly (“information aggregation algorithm” and “scaling algorithm”).
Resulting structure has similar properties to the original graph (e.g., connectivity, degree, …), but is also maintainable under churn!
There is always at least one peer per node (but not too many either).

The Pancake Graph (1)
• A robust system with degree and diameter O(log n / loglog n): the pancake graph
  – E.g., Papadimitriou & Gates!
• Pancake of dimension d:
  – d! nodes, each represented by a unique permutation {l1, …, ld} of the set {1, …, d}
  – Two nodes u and v are adjacent iff u is a prefix-inversion of v
• Example, 4-dimensional pancake: node 1234 has the prefix-inversions 2134, 3214 and 4321 as neighbors

The Pancake Graph (2)
• Properties
  – Node degree Θ(log n / loglog n)
  – Diameter Θ(log n / loglog n)
  – … where n is the total number of nodes
  – A factor loglog n better than the hypercube!
  – But: difficult graph (exact diameter unknown!)
• Asymptotically, no other graph can have both a smaller degree and a smaller diameter!
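The prefix-inversion adjacency of the pancake graph can be made concrete with a small sketch. This is only an illustration, not the authors' implementation: the function names `neighbors` and `diameter` are hypothetical, nodes are modelled as Python tuples, and the brute-force search is feasible for small d only. Since n = d! gives log n = Θ(d log d), the dimension is d = Θ(log n / loglog n); every node has d − 1 neighbors, which is where the degree bound on the slide comes from.

```python
from collections import deque
from math import factorial


def neighbors(node):
    """Return all prefix inversions of a permutation given as a tuple."""
    d = len(node)
    # Reversing the first k entries (2 <= k <= d) yields one neighbor each,
    # so every node has degree d - 1.
    return [node[:k][::-1] + node[k:] for k in range(2, d + 1)]


def diameter(d):
    """Brute-force diameter of the d-dimensional pancake graph (small d only)."""
    start = tuple(range(1, d + 1))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in neighbors(u):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    # Sanity check: the breadth-first search reached all d! permutations.
    assert len(dist) == factorial(d)
    # The graph is vertex-transitive, so the eccentricity of one node
    # equals the diameter.
    return max(dist.values())


print(neighbors((1, 2, 3, 4)))  # [(2, 1, 3, 4), (3, 2, 1, 4), (4, 3, 2, 1)]
print(diameter(4))              # 4
```

The first line of output reproduces the 4-dimensional example from the slide. The exhaustive search visits all d! nodes, so it only serves to check small cases; it says nothing about the exact diameter for general d.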
Contribution
• Using peer distribution and information aggregation algorithms on the simulated pancake topology, we can construct a peer-to-peer system (“distributed hash table”) with
  – Peer degree and lookup / network diameter in Θ(log n / loglog n)
  – Robustness to ADV(Θ(log n / loglog n), Θ(log n / loglog n))
  – No data is ever lost!
• Asymptotically optimal!

The Pancake System
(figure)

Basic Components
• Peer Distribution Algorithm
  – Balance peers between neighboring nodes
  – One (pancake) dimension after the other!
• Information Aggregation Algorithm
  – Exploit the recursive structure of the pancake
  – Aggregate “sub-pancakes” of increasing order
• Both run concurrently with ongoing churn! If they are fast enough, the pancake is maintained: always at least one peer per node!

Internals (1)
• How are peers connected in the simulated topology?
• Idea: Clique
(figure labels: Matching, Clique)
• Problem:
  - There are up to Θ((log n / loglog n)²) many peers in each node
  - A clique would render the peer degree too large!
• Inside a node, peers have to form a grid!

Internals (2)
• Solution: Grid
(figure labels: Matching, Grid)
• Each peer is connected to all peers which are either in the same row or in the same column
• Degree is OK now, and still robust enough to churn!

Internals (3)
• “Distributed Hash Table”:
  - Stores data at nodes
  - But on which peers of the node with a given ID?
  - On just one is bad in a dynamic environment!
• All?
  - Possible!
  - But much data movement during peer distribution.
• Better idea:
  - Peers of a node fall into two categories: Protons and Electrons
  - Protons = “core peers”, store data, are “seldom” used during token distribution
  - Electrons = “peripheral peers”, do not store data, are used for balancing
  - Make sure that there are always enough protons (no data loss)!

Conclusion
• Contribution: A scheme to maintain the quality of a peer-to-peer system in spite of worst-case membership changes.
  – Ingredients: “base graph”, token distribution & information aggregation algorithm
  – Proofs possible!
• Simulated graph can have similar properties to the base graph.
  – Degree, diameter, etc.
  – May require some additional thinking, though! (e.g., grid)
• A peer-to-peer system with degree and diameter in O(log n / loglog n) which tolerates O(log n / loglog n) joins and leaves per round.
  – Better than the often-used hypercube graph!
  – But: difficult graph! (e.g., dimension change)
• Open questions
  – How to coordinate dynamic peers or resources: An exciting field of research!
  – E.g.: Self-stabilization, dirty leaves, etc.

Questions and Feedback?
Thank you for your attention!
Stefan Schmid
Distributed Computing Group
schmiste@ethz.ch
http://dcg.ethz.ch/members/stefan.html