Structure Management for Scalable Overlay Service Construction Kai Shen

Department of Computer Science
University of Rochester
6/20/2016
NSDI'04
1
Motivations

Structure: the set of overlay links that data flows through
  - link selection is important for performance
      - low latency, high bandwidth, …
  - link selection can be costly
      - large selection base
      - high cost of link property probing
Existing link selection schemes are mostly service-specific
  - unicast overlay path selection (e.g., RON)
  - end-system multicast (e.g., Narada, Overcast, and NICE)
  - substrate-aware DHT (e.g., CAN, Chord, and Pastry)
A Common Structure Layer

A service-independent structure layer: Saxons
  - Substrate-Aware Connectivity Support for Overlay Network Services
Potential benefits:
  - simplify service design & implementation
  - modularity
  - allow runtime overhead sharing across multiple services (not yet addressed in this paper)
Questions:
  - Performance?
  - How can services utilize a common structure layer?
Design Objectives

A common structure layer must meet the quality requirements of a wide range of services
  - overlay latency
  - hop-count distance
  - overlay bandwidth: on the shortest path, or on the widest path
Best effort: no guarantee on structure quality
Other design objectives:
  - scalability
  - extremely simple API
  - stability
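
The two bandwidth metrics differ only in which overlay path is evaluated. The following is a minimal sketch, assuming a small hand-made overlay (`EXAMPLE_LINKS`, hypothetical data) in which each link carries a latency and a bandwidth. It reports the bottleneck bandwidth along the lowest-latency path and along the widest path, using two Dijkstra-style searches. The function names are illustrative and are not part of Saxons.

```python
import heapq

# Hypothetical overlay: {node: {neighbor: (latency_ms, bandwidth_mbps)}}
EXAMPLE_LINKS = {
    "A": {"B": (10, 2.0), "C": (30, 10.0)},
    "B": {"A": (10, 2.0), "C": (30, 8.0)},
    "C": {"A": (30, 10.0), "B": (30, 8.0)},
}

def shortest_path_bandwidth(links, src, dst):
    """Bottleneck bandwidth along the lowest-latency path from src to dst."""
    best = {src: 0.0}
    heap = [(0.0, src, float("inf"))]  # (latency so far, node, bottleneck so far)
    while heap:
        lat, node, bw = heapq.heappop(heap)
        if node == dst:
            return bw
        if lat > best.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, (link_lat, link_bw) in links.get(node, {}).items():
            new_lat = lat + link_lat
            if new_lat < best.get(nbr, float("inf")):
                best[nbr] = new_lat
                heapq.heappush(heap, (new_lat, nbr, min(bw, link_bw)))
    return 0.0  # dst unreachable

def widest_path_bandwidth(links, src, dst):
    """Largest bottleneck bandwidth over all paths from src to dst."""
    best = {src: float("inf")}
    heap = [(-float("inf"), src)]  # max-heap on bottleneck via negation
    while heap:
        neg_bw, node = heapq.heappop(heap)
        bw = -neg_bw
        if node == dst:
            return bw
        if bw < best.get(node, 0.0):
            continue  # stale heap entry
        for nbr, (_, link_bw) in links.get(node, {}).items():
            new_bw = min(bw, link_bw)
            if new_bw > best.get(nbr, 0.0):
                best[nbr] = new_bw
                heapq.heappush(heap, (-new_bw, nbr))
    return 0.0  # dst unreachable
```

In the example overlay, the direct A-B link is the lowest-latency path but carries only 2 Mbps, while the detour through C has a bottleneck of 8 Mbps.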
Saxons Design Overview

Components:
  - structure quality maintenance
  - node bootstrap
  - link property probing
  - membership management
  - partition detection & repair
Scalability
  - functional-symmetric architecture
  - per-node management overhead depends only on the number of attached links, not on the overlay size
  - no single node maintains a complete system view
Structure Quality Maintenance


Configurable node degree range <d_active, d_total>
High-level description
  - periodically check random links; replace existing ones if better
  - employ adjustment threshold to avoid oscillation
Three quality maintenance approaches
  - AllShort: maintain all short links
      - tends to create a grid-like structure ⇒ high hop-count distance (√n vs. O(log n) produced by random structure)
  - ShortLong [Ratnasamy et al 2002]: half short, half random links
  - ShortWide: half short, half wide links (high adjustment threshold)
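
The threshold rule in the high-level description can be sketched as follows. This is an illustration only, assuming latency is the link quality metric (as for short links) and that measured latencies are available in a dictionary. `maybe_replace_link` and its parameters are hypothetical names, not taken from the Saxons implementation. A wide-link variant would compare bandwidth in the same way.

```python
def maybe_replace_link(current_links, candidate, latency_ms, threshold_ms):
    """Swap the worst current link for the candidate only if the candidate
    is better by more than the adjustment threshold.

    current_links: set of neighbor ids (modified in place)
    latency_ms:    dict mapping neighbor id -> measured latency in ms
    Returns the replaced neighbor, or None if nothing changed.
    """
    if not current_links or candidate in current_links:
        return None
    # The worst short link is the one with the highest latency.
    worst = max(current_links, key=lambda n: latency_ms[n])
    # Require a clear improvement; small gains are ignored to avoid oscillation.
    if latency_ms[worst] - latency_ms[candidate] > threshold_ms:
        current_links.remove(worst)
        current_links.add(candidate)
        return worst
    return None
```

With current links `{"b", "c"}` at 20 ms and 80 ms and a 10 ms threshold, a 75 ms candidate is rejected (a gain of only 5 ms), while a 30 ms candidate replaces `"c"`.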
Random Membership Subsets

Membership subset service
  - dynamically changing subsets with uniform randomness
  - for tree-like overlay structures [Kostić et al 2003]
  - each node maintains a member-subset
  - periodically, each node sends its neighbors a randomly selected update-set
  - to ensure equal representation
      - the node itself is selected into each update-set with probability: (update-set size) / (overlay size)
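
The self-inclusion rule can be illustrated with a short sketch. It assumes the node already knows (or estimates) the overlay size and fills the rest of the update-set uniformly at random from its member-subset. `build_update_set` and its parameters are hypothetical names chosen for this example.

```python
import random

def build_update_set(self_id, member_subset, update_size, overlay_size, rng=random):
    """Pick an update-set to send to neighbors.

    The node includes itself with probability update_size / overlay_size;
    the remaining slots are filled uniformly at random from its member-subset.
    """
    p_self = min(1.0, update_size / overlay_size)
    pool = [m for m in member_subset if m != self_id]
    update = []
    if rng.random() < p_self:
        update.append(self_id)
    # Fill the remaining slots without exceeding the available members.
    k = min(update_size - len(update), len(pool))
    update.extend(rng.sample(pool, k))
    return update
```

Passing a seeded `random.Random` instance as `rng` makes runs reproducible, which is convenient when checking that the node appears in roughly the expected fraction of update-sets.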
Implementation

Saxons runtime prototype
  - stand-alone daemon communicating with local overlay application instances through IPC; or
  - linked and run inside the application process space
Basic API for overlay applications:
  - directly query the Saxons runtime for directly attached links
  - provide a callback function to the Saxons runtime, invoked by Saxons whenever neighbor links change
Advanced API:
  - control protocol parameters
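
The two basic API styles (polling and callback) might look roughly like the sketch below. This is a purely hypothetical in-process stand-in: the class `SaxonsRuntimeStub` and the methods `get_links`, `register_link_callback`, and `_set_links` are invented for illustration and do not reflect the actual Saxons interface.

```python
class SaxonsRuntimeStub:
    """Hypothetical stand-in for a structure-layer runtime (not the real API)."""

    def __init__(self):
        self._links = set()
        self._callbacks = []

    def get_links(self):
        """Polling style: return the currently attached neighbor links."""
        return sorted(self._links)

    def register_link_callback(self, fn):
        """Callback style: fn(links) is invoked whenever neighbor links change."""
        self._callbacks.append(fn)

    def _set_links(self, links):
        """Internal hook the structure layer would call after an adjustment."""
        new_links = set(links)
        if new_links != self._links:
            self._links = new_links
            for fn in self._callbacks:
                fn(self.get_links())


# Usage: an overlay service records every link change it is told about.
events = []
runtime = SaxonsRuntimeStub()
runtime.register_link_callback(events.append)
runtime._set_links({"n2", "n1"})
```

A service that only needs the current neighbors at send time can use the polling call; a service that maintains its own routing state would more likely rely on the callback.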
Link Bandwidth Measurement



Requirements: robustness, overhead, accuracy
Many techniques were proposed in the past
Our goal: a simple scheme that works
  - based on the packet bunch technique [Carter&Crovella 1996]
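
The slides do not spell out the measurement scheme, so the following is only a generic sketch of the dispersion idea behind packet-bunch probing: a burst of equal-sized packets is sent back to back, and the receiver estimates bandwidth from how far apart the first and last packets arrive. The function name and its inputs are assumptions made for this example, and it is not the Saxons measurement code.

```python
def bunch_bandwidth_mbps(arrival_times_s, packet_size_bytes):
    """Estimate bandwidth from the arrival times of one back-to-back packet bunch.

    arrival_times_s:   receiver timestamps in seconds, in arrival order
    packet_size_bytes: size of each packet in the bunch
    """
    if len(arrival_times_s) < 2:
        raise ValueError("need at least two packet arrivals")
    dispersion = arrival_times_s[-1] - arrival_times_s[0]
    if dispersion <= 0:
        raise ValueError("arrival times must be increasing")
    # The first packet only starts the clock, so it is not counted in the volume.
    bits = (len(arrival_times_s) - 1) * packet_size_bytes * 8
    return bits / dispersion / 1e6
```

For instance, four 1250-byte packets arriving 1 ms apart give an estimate of about 10 Mbps. A practical scheme would repeat the measurement and filter outliers to meet the robustness requirement.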
All-to-all measurement results on 61 PlanetLab sites:
[Figure: measured bandwidth (in Mbps, log scale from 10^0 to 10^2), comparing 4.8MB/measurement with 480KB/measurement]
Evaluation

Simulation
  - evaluation on large-scale overlays (up to 12,800 nodes)
  - use 3 kinds of Internet backbones
      - BGP routing dumps from NLANR and RouteViews
      - synthetic backbones generated using Inet and GT-ITM
      - based on all-to-all measurement results from NLANR AMP
PlanetLab experimentation
  - performance assessment on a particular real-world environment
  - most nodes are on Internet2
  - most nodes have a 10Mbps bandwidth limit
Overall Structure Quality
(55 PlanetLab sites)
[Figure, left panel: CDF of overlay path latency; x-axis: Latency (in milliseconds)]
[Figure, right panel: CDF of widest path bandwidth; x-axis: Bandwidth (in Mbps)]
Curves: Random, AllShort, ShortLong, ShortWide (Saxons)
All three schemes (AllShort, ShortLong, ShortWide) outperform Random by over 18% on latency
ShortWide provides >10Mbps bandwidth for over 3 times more site pairs
Structure Stability During Node Churn
(55 PlanetLab sites)
[Figure: Overlay link adjustment during node join/departure; y-axis: Adjustments per hour per node; x-axis: Time after all sites have joined (in minutes); annotated events: all sites complete bootstrap, 5 sites fail, sites #1 through #5 rejoin]
Five nodes fail at the 60th minute and rejoin one by one at three-minute intervals
Small disturbances as a result of node join/departure
Saxons-based Overlay Multicast
(52 PlanetLab sites)
[Figure, left panel: Bandwidth for 1.2 Mbps stream; right panel: Bandwidth for 2.4 Mbps stream; x-axis: Rank; y-axis: Bandwidth (in Mbps)]
Curves: Multicast over Random, Multicast over Saxons, Independent direct unicast
Compared with Random, Saxons-based multicast provides small-loss (<5%) data delivery to over 4 times more receivers
Performance close to Independent Direct Unicast
Related Work

Structure-first overlay multicast: Narada [Chu et al 2000]
Utilities/infrastructures for overlay service construction:
  - Topology probing [Nakao et al 2003], MACEDON [Rodriguez et al 2004]
Service-specific link selection:
  - Unicast routing: RON [Andersen et al 2001], [Savage et al 1999]
  - Multicast routing: Narada [Chu et al 2001], Overcast [Jannotti et al 2000], NICE [Banerjee et al 2002]
  - Substrate-aware DHT: Binning [Ratnasamy et al 2002], Brocade [Zhao et al 2002], Pastry [Castro et al 2002]
Related work for various Saxons components
  - Membership management: [Kostić et al 2003], lpbcast [Eugster et al 2003]
  - Bandwidth measurement: [Carter&Crovella 1996], [Paxson 1997], [Lai&Baker 2000]
  - Scalable latency estimation: [Hotz 1994], IDMaps [Francis et al 1999], GNP [Ng&Zhang 2002]
Conclusion and Future Work

Saxons: a common structure management layer supporting scalable overlay service construction
  - simplifies the construction of many overlay services
  - still allows many services (e.g., overlay multicast, Gnutella-style query flooding, DHT) to achieve high performance
Future work:
  - support runtime overhead sharing when overlay nodes host multiple services
  - best effort structure quality → soft structure quality guarantee