Structure Management for Scalable Overlay Service Construction
Kai Shen
Department of Computer Science, University of Rochester
6/20/2016 NSDI'04

Motivations
- Structure: the set of overlay links that data flow through
  - link selection is important for performance: low latency, high bandwidth, …
  - link selection can be costly: large selection base, high cost of link property probing
- Existing link selection schemes are mostly service-specific
  - unicast overlay path selection (e.g., RON)
  - end-system multicast (e.g., Narada, Overcast, and NICE)
  - substrate-aware DHT (e.g., CAN, Chord, and Pastry)

A Common Structure Layer
- A service-independent structure layer: Saxons (Substrate-Aware Connectivity Support for Overlay Network Services)
- Potential benefits:
  - simplify service design & implementation
  - modularity
  - allow runtime overhead sharing across multiple services (not yet addressed in this paper)
- Questions:
  - Performance?
  - How can services utilize a common structure layer?

Design Objectives
- A common structure layer must meet the quality requirements of a wide range of services
  - overlay latency
  - hop-count distance
  - overlay bandwidth: on the shortest path, or on the widest path
- Best effort: no guarantee on structure quality
- Other design objectives:
  - scalability
  - extremely simple API
  - stability

Saxons Design Overview
- Structure quality maintenance
- Node bootstrap
- Link property probing
- Membership management
- Partition detection & repair
- Scalability
  - functional-symmetric architecture
  - per-node management overhead depends only on the number of attached links, not on the overlay size
  - no single node maintains a complete system view

Structure Quality Maintenance
- Configurable node degree range <d_active, d_total>
- High-level description
  - periodically check random links; replace existing ones if the new links are better
  - employ an adjustment threshold to avoid oscillation
- Three quality maintenance approaches
  - AllShort: maintain all short links
    - tends to create a grid-like structure ⇒ high hop-count distance (√n vs. O(log n) produced by a random structure)
  - ShortLong [Ratnasamy et al 2002]: half short, half random links
  - ShortWide: half short, half wide links (high adjustment threshold)

Random Membership Subsets
- Membership subset service: dynamically changing subsets with uniform randomness
  - for tree-like overlay structures: [Kostić et al 2003]
- Each node maintains a member-subset
- Periodically, each node sends its neighbors a randomly selected update-set
- To ensure equal representation, the node itself is selected into each update-set with probability (update-set size) / (overlay size)

Implementation
- Saxons runtime prototype:
  - a stand-alone daemon communicating with local overlay application instances through IPC; or
  - linked and run inside the application process space
- Basic API for overlay applications:
  - directly query the Saxons runtime for directly attached links
  - provide a callback function to the Saxons runtime, invoked by Saxons whenever neighbor links change
- Advanced API: control protocol parameters

Link Bandwidth Measurement
- Requirements: robustness, overhead, accuracy
- Many techniques were proposed in the past
- Our goal: a simple scheme that works, based on the packet bunch [Carter&Crovella 1996]
- All-to-all measurement results on 61 PlanetLab sites:
  [Figure: measured bandwidth (in Mbps) at 4.8MB/measurement and at 480KB/measurement]

Evaluation
- Simulation evaluation on large-scale overlays (up to 12,800 nodes), using 3 kinds of Internet backbones
  - BGP routing dumps from NLANR and RouteViews
  - synthetic backbones generated using Inet and GT-ITM
  - based on all-to-all measurement results from NLANR AMP
- PlanetLab experimentation
  - performance assessment on a particular real-world environment
  - most nodes are on Internet2
  - most nodes have a 10Mbps bandwidth limit

Overall Structure Quality (55 PlanetLab sites)
  [Figure, two panels: CDF of overlay path latency (x-axis: latency in milliseconds) and CDF of widest path bandwidth (x-axis: bandwidth in Mbps); curves: Random, AllShort, ShortLong, ShortWide (Saxons)]
- All three schemes outperform Random by over 18% on latency
- ShortWide provides >10Mbps bandwidth for over 3 times more site pairs

Structure Stability During Node Churn (55 PlanetLab sites)
  [Figure: overlay link adjustment during node join/departure; y-axis: adjustments per hour per node; x-axis: time after all sites have joined (in minutes); marked events: all sites complete bootstrap, 5 sites fail, sites #1 to #5 rejoin]
- Five nodes fail at the 60th minute and rejoin one by one at three-minute intervals
- Node joins and departures cause only small disturbances

Saxons-based Overlay Multicast (52 PlanetLab sites)
  [Figure, two panels: bandwidth for a 1.2Mbps stream and bandwidth for a 2.4Mbps stream; y-axis: bandwidth (in Mbps); x-axis: rank; curves: Multicast over Random, Multicast over Saxons, Independent direct unicast]
- Compared with Random, Saxons-based multicast provides small-loss (<5%) data delivery to over 4 times more receivers
- Performance is close to that of independent direct unicast

Related Work
- Structure-first overlay multicast: Narada [Chu et al 2000]
- Utilities/infrastructures for overlay service construction: Topology probing [Nakao et al 2003], MACEDON [Rodriguez et al 2004]
- Service-specific link selection:
  - Unicast routing: RON [Andersen et al 2001], [Savage et al 1999]
  - Multicast routing: Narada [Chu et al 2001], Overcast [Jannotti et al 2000], NICE [Banerjee et al 2002]
  - Substrate-aware DHT: Binning [Ratnasamy et al 2002], Brocade [Zhao et al 2002], Pastry [Castro et al 2002]
- Related work for various Saxons components
  - Membership management: [Kostić et al 2003], lpbcast [Eugster et al 2003]
  - Bandwidth measurement: [Carter&Crovella 1996], [Paxson 1997], [Lai&Baker 2000]
  - Scalable latency estimation: [Hotz 1994], IDMaps [Francis et al 1999], GNP [Ng&Zhang 2002]

Conclusion and Future Work
- Saxons: a common structure management layer supporting scalable overlay service construction
  - simplifies the construction of many overlay services
  - still allows many services (e.g., overlay multicast, Gnutella-style query flooding, DHT) to achieve high performance
- Future work:
  - support runtime overhead sharing when overlay nodes host multiple services
  - best effort structure quality → soft structure quality guarantee
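
Sketch: link replacement with an adjustment threshold

The Structure Quality Maintenance slide says nodes periodically check random links, replace existing ones if the new link is better, and use an adjustment threshold to avoid oscillation. The following is a minimal sketch of that idea, not the Saxons implementation. The function name, the use of latency as the only link metric, and the interpretation of the threshold as a relative fraction are all assumptions made for illustration.

```python
def maybe_replace_link(links, candidate, candidate_latency, threshold=0.1):
    """Replace the worst current link if the candidate is clearly better.

    links: dict mapping neighbor id -> measured latency in ms (hypothetical layout).
    threshold: assumed relative margin; the candidate must beat the worst
               link by more than this fraction, which damps oscillation.
    Returns the id of the dropped neighbor, or None if nothing changed.
    """
    if not links or candidate in links:
        return None
    # Pick the current link with the highest latency as the replacement victim.
    worst = max(links, key=links.get)
    if candidate_latency < links[worst] * (1.0 - threshold):
        del links[worst]
        links[candidate] = candidate_latency
        return worst
    return None


links = {"a": 50.0, "b": 100.0}
# 95 ms is better than 100 ms, but not by more than 10%, so the link is kept.
first = maybe_replace_link(links, "c", 95.0)
# 80 ms beats 100 ms by more than 10%, so "b" is replaced by "c".
second = maybe_replace_link(links, "c", 80.0)
```

A larger threshold trades structure quality for stability: fewer marginal swaps happen, so neighbor sets change less often.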
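
Sketch: building an update-set with equal representation

The Random Membership Subsets slide states that a node includes itself in each update-set with probability (update-set size) / (overlay size). The sketch below illustrates only that rule. The function name and arguments are hypothetical, and it assumes the node has some estimate of the overlay size, which the slides do not explain how to obtain.

```python
import random


def build_update_set(self_id, member_subset, update_size, overlay_size, rng=random):
    """Pick a random update-set to send to neighbors.

    The node adds itself with probability update_size / overlay_size
    (the rule from the slide); the remaining slots are filled uniformly
    at random from its current member-subset.
    """
    include_self = rng.random() < update_size / overlay_size
    pool = [m for m in member_subset if m != self_id]
    slots = update_size - (1 if include_self else 0)
    chosen = rng.sample(pool, min(slots, len(pool)))
    if include_self:
        chosen.append(self_id)
    return chosen


# Empirical check with a fixed seed: self-inclusion rate should be near 5/50.
rng = random.Random(0)
members = list(range(1, 21))
trials = 20000
hits = 0
for _ in range(trials):
    update = build_update_set(0, members, 5, 50, rng)
    hits += 0 in update
self_rate = hits / trials
```

With an update-set of 5 in an overlay of 50 nodes, the node should appear in roughly one out of ten of its own update-sets.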