Dealing with Churn

A Self-Repairing Peer-to-Peer System
Resilient to
Dynamic Adversarial Churn
Fabian Kuhn, Microsoft Research, Silicon Valley
Stefan Schmid, ETH Zurich
Roger Wattenhofer, ETH Zurich
Some slides taken from Stefan Schmid’s presentation of his
Master’s thesis
Churn
Unlike servers, peers are transient!
– Machines are under the control of individual users
– e.g., just connecting to download one file
– Membership changes are called churn
join
leave
Successful P2P systems have to cope with churn
(i.e., guarantee correctness, efficiency, etc.)!
Churn characteristics
– Depends on application (Skype vs. eMule vs. …)
– But: there may be dozens of membership changes per second!
– Peers may crash without notice!
• How can peers collaborate in spite of churn?
Churn threatens the advantages of P2P
a lot of churn
What can we guarantee in presence of churn?
We have to actively maintain P2P systems!
Goal of the paper
Only a small number of P2P systems have been analyzed under churn!
This paper presents techniques to:
- Provably maintain P2P systems with desirable properties (peer degree, network diameter, …)
- in spite of ongoing worst-case membership changes (the adversary continuously attacks the weakest part).
The system is never fully repaired, but always fully functional.
How does Churn affect P2P systems?
1. Objects may be lost when the host crashes
2. Queries may not make it to the destination
Think about this
What is the big deal about churn? Does not every P2P system
define Join and Leave protocols?
Well, the system eventually recovers, but during recovery, services
may be affected. And objects not replicated are lost.
Observe the difference between non-masking and masking
fault tolerance. What we need is some form of masking tolerance.
Model for Dynamics
We assume a worst-case perspective: an adversary A(J,L)
induces J joins and L leaves every round, anywhere in
the system. We assume a synchronous model: time is divided
into rounds.
Further refinement: an adversary A(J, L, r) induces
J joins and L leaves every r rounds.
The topology is assumed to be a hypercube that has O(log n)
degree and O(log n) diameter.
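The degree and diameter claims can be checked on a small example. The following is a minimal sketch, assuming nodes are labeled with d-bit integers and two nodes are adjacent when their labels differ in exactly one bit; the function names are illustrative and do not come from the paper.

```python
def hypercube_neighbors(node: int, d: int) -> list[int]:
    """Return the d neighbors of `node`: flip each of the d label bits once."""
    return [node ^ (1 << i) for i in range(d)]


def hop_distance(u: int, v: int) -> int:
    """Shortest-path length between u and v: the number of differing bits."""
    return bin(u ^ v).count("1")


d = 5                      # dimension, so n = 2**d = 32 nodes
u = 0b11010
print([format(v, "05b") for v in hypercube_neighbors(u, d)])
print(hop_distance(0b00000, 0b11111))   # the farthest pair is d hops apart
```

With n = 2^d nodes, every node has exactly d = log2 n neighbors and no pair is more than d hops apart, which is where the O(log n) degree and diameter come from.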
Topology Maintenance
[Figure: peers π1 and π2 on the hypercube]
Challenges in maintaining the hypercube!
– How does peer 1 know that it should replace peer 2?
– How does it get there when there are concurrent joins and leaves?
– …
The Proposed Approach
Simple idea: Simulate the topology!
Several peers per node
General Recipe for Robust Topologies
1. Take a graph with desirable properties
- Low diameter, low peer degree, etc.
2. Replace vertices by a set of peers
3. Maintain it:
a. Permanently run a peer distribution algorithm
which ensures that all vertices have roughly the same number
of peers (“token distribution algorithm”).
b. Estimate the total number of peers in the system and change
“dimension of topology” accordingly
(“information aggregation algorithm” and “scaling algorithm”).
The resulting structure has properties similar to those of the original graph
(e.g., connectivity, degree, …), but is also maintainable under churn!
There is always at least one peer per node (but not too many either).
Dynamic Token Distribution
Example: neighboring nodes U = 11010 (a peers) and V = 11011 (b peers); W = 10010 is another neighbor of U.
After one step of recovery, both U and V will contain (a+b)/2 peers.
Repeat this once for each dimension of the hypercube
(dimension exchange method).
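The dimension exchange method can be sketched in a few lines. This is an illustrative simulation only, assuming fractional tokens, integer node labels, and a fully synchronous sweep over the dimensions; `dimension_exchange` is a hypothetical name, not the paper's code.

```python
def dimension_exchange(tokens: list[float], d: int) -> list[float]:
    """Balance tokens on a d-dimensional hypercube.

    tokens[u] is the (fractional) token count of node u, 0 <= u < 2**d.
    In step i, every node averages its load with its neighbor across
    dimension i, as in the U/V example above.
    """
    tokens = list(tokens)
    for i in range(d):
        for u in range(len(tokens)):
            v = u ^ (1 << i)          # neighbor of u in dimension i
            if u < v:                 # handle each pair once
                avg = (tokens[u] + tokens[v]) / 2
                tokens[u] = tokens[v] = avg
    return tokens


# All 8 tokens start on node 0 of a 3-dimensional hypercube.
print(dimension_exchange([8, 0, 0, 0, 0, 0, 0, 0], d=3))
```

After one sweep over all d dimensions every node holds the global average, so with fractional tokens the load is perfectly balanced.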
Theorem
Discrepancy is the maximum difference between the token counts
of any pair of nodes. The goal is to reduce the discrepancy to 0. The
previous step reduces it to 0 for fractional tokens, but for a
d-dimensional hypercube using integer tokens, the
discrepancy is d in the worst case.
In the presence of an A(J,K,1) adversary, the proposed algorithm
maintains the invariant
discrepancy ≤ 2J + 2K + d
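The effect of integer tokens can be illustrated with a small simulation. This is a sketch under stated assumptions: there is no churn during the sweep, and when a pair holds an odd total, the node with the smaller label keeps the extra token. That rounding rule and both function names are chosen here for illustration and are not taken from the paper.

```python
def integer_dimension_exchange(tokens: list[int], d: int) -> list[int]:
    """One sweep of dimension exchange with indivisible (integer) tokens."""
    tokens = list(tokens)
    for i in range(d):
        for u in range(len(tokens)):
            v = u ^ (1 << i)                  # neighbor across dimension i
            if u < v:
                total = tokens[u] + tokens[v]
                tokens[u] = (total + 1) // 2  # assumed rule: lower label rounds up
                tokens[v] = total // 2
    return tokens


def discrepancy(tokens: list[int]) -> int:
    """Maximum difference between the token counts of any two nodes."""
    return max(tokens) - min(tokens)


balanced = integer_dimension_exchange([7, 0, 0, 0, 0, 0, 0, 0], d=3)
print(balanced, discrepancy(balanced))
```

In this run the 7 tokens cannot be split evenly over 8 nodes, so some discrepancy remains, but it stays within the bound d = 3.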
Information aggregation
When the total number of peers N exceeds an upper bound, each node
splits into two, and the dimension of the hypercube has to increase by 1.
Similarly, when the total number of peers N falls below a lower bound,
pairs of nodes in dimension (d-1) merge into one, and the dimension of
the hypercube has to decrease by 1.
Thus, the system needs a mechanism to keep track of N.
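One way to keep track of N on a hypercube is to add up the per-node peer counts dimension by dimension. The sketch below assumes a static snapshot (no churn during the d exchange rounds) and is not necessarily the aggregation algorithm used in the paper; `aggregate_peer_count` is a hypothetical name.

```python
def aggregate_peer_count(local_counts: list[int], d: int) -> list[int]:
    """Let every node of a d-dimensional hypercube learn the total peer count.

    local_counts[u] is the number of peers simulating node u.
    In round i, each node adds the partial sum held by its neighbor across
    dimension i. After d rounds every node holds the global total N.
    """
    sums = list(local_counts)
    for i in range(d):
        # The right-hand side is built from the previous round's values.
        sums = [sums[u] + sums[u ^ (1 << i)] for u in range(len(sums))]
    return sums


counts = [3, 1, 4, 1, 5, 9, 2, 6]        # peers per node, d = 3
print(aggregate_peer_count(counts, d=3))  # every entry equals sum(counts)
```

Each node could then compare its estimate of N against the upper and lower bounds to decide whether the dimension should grow or shrink.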
Simulated hypercube
Given an adversary A(d+1, d+1, 6)*,
(1) the outdegree of every peer is bounded by O(log² N), and
(2) the diameter is bounded by O(log N).
* The adversary inserts and deletes at most (d+1) peers during
any time interval of 6 rounds.
Topology
Example topology for d=2 (figure labels: core, periphery). Peers in each core
are connected to one another and to the peers
of the core of the neighboring nodes.
Only the core peers store data items.
Despite churn, at least one peer in each core has to survive.
Q. What does the periphery node do?
6-round maintenance algorithm
Each phase of the algorithm takes six rounds and handles one dimension.
Round 1. Each node takes a snapshot of the active peers within itself.
Round 2. Exchange snapshots.
Round 3. Preparation for peer migration.
Round 4. Core sends IDs of new peers to periphery.
Reduce dimension if necessary.
Round 5. Dimension growth & building new core (2d+3)
Round 6. Exchange information about the new core.
Further improvement: Pancake Graph (1)
• A robust system with degree and diameter O(log n / loglog n): the pancake graph
(most papers refer to Papadimitriou & Gates’ contribution here)!
• Pancake of dimension d:
– d! nodes, each represented by a unique permutation {l1, …, ld} where li ∈ {1,…,d}
– Two nodes u and v are adjacent iff u is a prefix-inversion of v
Example (4-dimensional pancake): node 1234 is adjacent to 3214, 4321 and 2134.
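The adjacency rule can be made concrete with a short sketch. It assumes each node is written as a string of digits and that a prefix-inversion reverses the first k symbols for some k between 2 and d; `pancake_neighbors` is an illustrative name.

```python
from math import factorial


def pancake_neighbors(perm: str) -> list[str]:
    """Return all prefix-inversions of `perm` (reverse the first k symbols)."""
    return [perm[:k][::-1] + perm[k:] for k in range(2, len(perm) + 1)]


print(pancake_neighbors("1234"))   # ['2134', '3214', '4321']
print(factorial(4))                # number of nodes in the 4-dimensional pancake
```

Each node therefore has d - 1 neighbors while the graph has d! nodes, which is why the degree grows more slowly than log n.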
The Pancake Graph (2)
• Properties
– Node degree O(log n / log log n)
– Diameter O(log n / log log n)
– … where n is the total number of nodes
– A factor log log n better than hypercube!
– But: difficult graph (diameter unknown!)
No other graph can have both an asymptotically smaller
degree and an asymptotically smaller diameter!
Contributions
Using peer distribution and information aggregation algorithms
on the simulated pancake topology, the authors propose:
• a DHT-based peer-to-peer system with
– Peer degree and lookup / network diameter in O (log n / loglog n)
– Robustness to ADV(O (log n / log log n), O (log n / log log n))
– No data is ever lost!
Asymptotically optimal!
The Pancake System
Conclusion
A nice model for understanding the effect of churn and
dealing with it. But it is too simplistic.