Turning Heterogeneity into an Advantage in Overlay Routing
Zhichen Xu, Mallik Mahalingam, Magnus Karlsson
Internet Systems and Storage Lab
Hewlett-Packard Company
INFOCOM 2003

Motivation
• Distributed hash table (DHT) based overlay networks provide a simple abstraction that maps “keys” to “values”
• They are scalable, fault-tolerant, self-organizing and have low maintenance cost
• They can be used in many important applications,
  – E.g., distributed storage, DNS, media streaming, web caching, content-based searching, distributed firewalls, etc. As a result, these applications can benefit from the above properties
• Several proposals: Pastry, Tapestry, Chord, CAN, eCAN, SkipNet ...
  – Provide a homogeneous abstraction to the applications, but vary in their logical structures and flexibility
  – Routing is logical and at the application level

7/26/2016 IEEE INFOCOM 2003 Zhichen Xu page 2

Each logical hop can correspond to multiple physical hops
[Figure: map showing logical hops 1, 2, 3 and the physical paths they traverse. Images downloaded from http://www.mapresources.com/photoshop_maps/]
It is important that the structure of the overlay efficiently uses the underlying physical network!

Related work
• Within the overlay [Castro et al]
  – Proximity routing, e.g., Chord [Stoica et al]
    • Choices limited
  – Geographic layout, e.g., Topologically-aware CAN [Ratnasamy et al]
    • uneven distribution of the nodes and
    • chance of overloading nodes
  – Proximity-neighbor selection, e.g., Pastry [Rowstron et al], eCAN [Xu and Zhang]
    • Routing table entries selected according to proximity metric among nodes that satisfy the constraint
  Performance constrained by the logical structure of the default overlay!
• Auxiliary networks, e.g. Brocade
  – Constructing a secondary overlay network, however current proposal
    • Still use logical routing in the secondary network
    • Pushes the problem to an auxiliary network of a smaller size
    • Dilemma in picking the size of the secondary network
  Still logical routing!

Our contributions
• Decouple the homogeneous overlay abstraction from routing
  – Constructing an unconstrained auxiliary routing network using
    • AS-level topology derived from BGP reports
    • Landmark-numbering scheme
  – Route advertisement using a distance vector algorithm with route summarization to reduce state
• Work with all currently existing overlays
• Simulation results show that our approach can achieve close to optimal routing performance
  – 1.04 to 1.12 times optimal routing for an Internet-like topology
  – Previous approaches 2.5 to 5 times optimal for the same topology

Expressway definitions
[Figure: default overlay (CAN as an example) with ordinary nodes 1 to 4 attached to an expressway of high-speed connections]
• Nodes with good connectivity and availability elect themselves as expressway nodes
• An ordinary node establishes a connection with the expressway node that is closest to it

Expressway challenges
• How does a node (ordinary or expressway) find the close-by expressway nodes?
• How are routes propagated, and how do we control the routing state?
• What can the expressway be used for?
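
As a rough illustration of the attachment rule on the "Expressway definitions" slide, the sketch below picks the expressway node with the smallest measured delay. The function name, node names, and delay values are hypothetical, and the sketch assumes the candidate expressway nodes are already known; discovering them is exactly the first challenge listed above.

```python
from typing import Dict


def pick_expressway(delays_ms: Dict[str, float]) -> str:
    """Return the candidate expressway node with the smallest measured delay.

    delays_ms maps a (hypothetical) expressway node name to the delay, in
    milliseconds, measured from the ordinary node to that expressway node.
    """
    if not delays_ms:
        raise ValueError("no candidate expressway nodes")
    return min(delays_ms, key=delays_ms.get)


# Hypothetical measurements taken by one ordinary node.
candidates = {"exp-a": 42.0, "exp-b": 7.5, "exp-c": 19.0}
print(pick_expressway(candidates))
```

Using measured delay as the notion of "closest" is an assumption here; the slides do not fix a specific metric at this point.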

Landmark clustering
[Figure: landmark space with Landmark1, Landmark2, Landmark3; a node's landmark vector is <d1, d2, d3>, where di is the distance to landmark i]
• Nodes with similar distances to landmarks are likely close to each other
• Related work
  – Landmark ordering [Ratnasamy et al 2002]
  – Coordinate-based [Ng and Zhang 2001]

Locating close-by expressway node
[Figure: nodes a, b, c in the landmark space (Landmark1 to Landmark3) mapped onto the DHT]
• Landmark vector as key to store information of the expressway nodes on the DHT such that distances in the “landmark space” are preserved
• A node uses its landmark vector to search the DHT to find close-by nodes
• Expressway nodes find and connect to physically close-by expressway nodes to form the expressway network

Dimensionality mismatch problem
[Figure: nodes a, b, c in the landmark space mapped onto the DHT through dimension reduction]
But the dimensionality of the landmark space and that of the DHT are usually different

Space Filling Curves: Hilbert Curve
[Figure: Hilbert curve visiting cells numbered 1 to 8]
• Points close to each other in n-d space mapped to points close to each other in 1-d space, and vice versa

Proximity-preserving dimension reduction of landmark vectors: landmark numbering
[Figure: (a) landmark space partitioned into numbered cells (landmark numbers); (b) the default overlay CAN with the corresponding numbered regions]

Route advertisement with summarization
• An expressway node advertises all ordinary nodes that are in its physical proximity to neighboring expressway nodes
  – Given a destination, an expressway node returns the next hop expressway node on the shortest path
  – Uses a distance vector algorithm, except
    • advertise summarization of multiple nodes, and transport address of one representative node
      – Please read the paper for more detail
    • only expressway nodes participate in route advertisement
• Route advertisement messages are controlled with a time-to-live (TTL) expressed as the number of expressway hops

Expressway Usages
[Figure: route from source to dest through expressway nodes 1 to 4, with the direct route marked]
• Direct route:
  – Requires slightly more storage space to keep the route summary and relies on IP routing
• Expressway-node forwarding:
  – If a node leaves the system, it is less expensive to repair
  – May deliver routing performance better than default IP routing [RON 2001, Detour 1999]
  – Natural for multicast
• Ordinary nodes cache addresses of nodes associated with the same expressway node

Experimental evaluation: 2-d eCAN as default overlay
• Compare against
  – eCAN with roughly the same amount of state (50-75% better than basic CAN with similar state)
  – Logical auxiliary: a Brocade-like system, but with performance much better than that of Brocade
• AS topology:
  – 1000 AS from a total of 13,000 active AS
  – Assume 100 ms inter-AS delay and 10 ms intra-AS delay
• Transit-stub graph using GT-ITM:
  – 10,000 nodes, 228 transit domains
  – 100 ms for cross transit links, 20 ms for links inside a transit, 5 ms for links connecting a transit and stub node, and 2 ms for links inside a stub

eCAN represents state-of-the-art
• High-order routing tables are soft-state, therefore it has a lower maintenance cost than that of a high-dimension CAN
• Allows for proximity-neighbor selection
• Neighbor selection based on landmark clustering & controlled data placement
• 1-d eCAN is topology-aware Chord
• The notion of “high order” zones allows for controlled server and data placement for locality preservation

Parameters used
• # of nodes: 512-8K (4K as default)
• Fraction of nodes that are expressway nodes: 1/1-1/64 (1/10 as default)
• stretch = routing delay / shortest-path delay

Comparison of various approaches
[Figure: stretch vs. number of nodes (512, 1K, 2K, 4K, 8K), two panels. AS topology panel: Logical auxiliary, eCAN (w. same state), Logical auxiliary (advertising), Exp (AS), Exp (landmark), Exp (AS+landmark). Transit-stub graph panel: Logical auxiliary, eCAN (w. same state), Logical auxiliary (advertising), Exp (landmark).]
• Our approach: 1.02 to 1.5 times of optimal
• Other approaches: 2.5 to 6.6 times of optimal

Effect of varying the ratio of expressway nodes in the system
[Figure: stretch vs. percentage of expressway nodes (1/64, 1/32, 1/16, 1/8, 1/4, 1/2, 1). Curves: Logical auxiliary, Logical auxiliary (advertising), Exp (landmark), eCAN (w. same state), Exp (landmark, forward).]
• As the percentage of expressway nodes increases, expressway better approximates the underlying physical network
• Whereas a “logical auxiliary” cannot take advantage of this

Conclusions
• Propose generic techniques to construct an auxiliary routing network for DHT-based overlays
• Decouples routing from DHT abstraction to take advantage of the heterogeneity that exists in the system
• Achieves routing performance close to optimal
• The expressway nodes need to be relatively stable

Other projects using a DHT
• eCAN, a hierarchical version of CAN
• Content-based search on DHT [HotNets’02]
• pFilter: global data filtering and dissemination [FTDCS’03]
• Scalable multicast trees [NOSSDAV’03]
• Sedar: semantic, deep archival system [FTDCS’03]
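
Supplementary sketch: the landmark-numbering idea from the "Space Filling Curves" and "landmark numbering" slides can be illustrated as follows. This is a minimal sketch under simplifying assumptions, not the authors' implementation: it uses only two landmarks so that a 2-D Hilbert curve suffices, and the grid size `n`, the bound `max_delay_ms`, and the sample vectors are all hypothetical. Each landmark vector is quantized to a grid cell, and the cell's position along the Hilbert curve serves as the landmark number that could be used as a DHT key.

```python
from typing import Sequence


def hilbert_index(n: int, x: int, y: int) -> int:
    """Position of cell (x, y) along a Hilbert curve over an n x n grid.

    n must be a power of two; this is the standard iterative conversion.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def landmark_number(vector_ms: Sequence[float],
                    max_delay_ms: float = 400.0,  # hypothetical upper bound
                    n: int = 16) -> int:           # hypothetical grid size
    """Quantize a 2-landmark vector onto the grid and number its cell."""
    if len(vector_ms) != 2:
        raise ValueError("this sketch handles exactly two landmarks")
    cells = [min(n - 1, max(0, int(d / max_delay_ms * n))) for d in vector_ms]
    return hilbert_index(n, cells[0], cells[1])


# Hypothetical landmark vectors: a and b are near each other, c is far away.
a, b, c = (20.0, 35.0), (25.0, 30.0), (300.0, 310.0)
for name, vec in (("a", a), ("b", b), ("c", c)):
    print(name, landmark_number(vec))
```

With these sample vectors, nodes a and b receive landmark numbers that are much closer to each other than to that of c, which is the proximity-preserving behavior the slides rely on. A deployment with more landmarks would need a higher-dimensional curve.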