Self-organizing Wide-area Network Caches
Samrat Bhattacharjee, Ken Calvert, Ellen Zegura
Networking and Telecommunications Group, College of Computing
Georgia Institute of Technology, Atlanta, Georgia, USA
http://www.cc.gatech.edu/projects/canes
Sponsors: DARPA, NSF

Outline
- Problem statement
- Caching approaches
- Self-organizing caching schemes
- Results
- Conclusions

Problem Statement
Improve client access latencies by associating caches with routers within the network.
[Figure: a network cache on the path serves request #1 and caches the reply, while request #0 travels all the way to the server]
Goal: develop algorithms to coordinate network caches in order to reduce latency.

Wide-area Caching
Assumptions:
- Request-response paradigm
- Requests can be served by caches
Components:
- Location of caches
- What to cache, and where
- Rules for forwarding requests
- Protocols for communication between caches
- Cache consistency
Examples: Harvest, ICP, Squid, NetCache
Goal: develop algorithms to coordinate caches to reduce latency.

Active Networks and Network Caching
- Active networks provide a programmable network platform: in an active network, active nodes can execute "user" algorithms, so active routers could execute new caching algorithms.
- Use active networking to instantiate self-organizing cache management algorithms within the network.

Caching Model and Topology
- Model: clients, servers, popular items
- Transit-Stub network topologies
[Figure: Transit-Stub topology model - transit domains of transit routers, with stub domains of stub routers and attached hosts]

Approaches towards Caching
Caching by location:
- At transit nodes (backbone routers)
- At border stub nodes (access routers)
[Figure: a request travels toward the server past transit caches and stub-connected-to-transit caches]
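As a concrete illustration of the request-response model above, here is a minimal sketch (hypothetical helper, not part of the original system) of how a request walks the client-server path and is answered by the first on-path cache holding the item:

```python
def serve_request(path, item, caches):
    """Forward a request hop by hop along the client->server path; the
    first on-path cache holding the item answers it, otherwise the
    server at the end of the path does.  Returns (responder, hop_count)."""
    for hops, node in enumerate(path, start=1):
        if item in caches.get(node, set()):
            return node, hops      # cache hit short-circuits the path
    return path[-1], len(path)     # miss everywhere: server responds
```

For example, with `caches = {"r2": {"page"}}`, a request for "page" along `["r1", "r2", "r3", "server"]` is served after 2 hops instead of 4, which is exactly the latency reduction the schemes below aim for.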
Self-organizing Network Caches
- Small caches, placed without regard to location
- Effective use of widely distributed caches is difficult
[Figure: active caches along the request path run self-organizing algorithms]
Challenges: where to cache, and how to forward requests?

Modulo Caching
- Spread an item over the client-server path
- Cache radius -> modulo caching
[Figure: modulo caching with radius 3 - nodes on the server-client path are numbered cyclically 1, 2, 3, 1, 2, 3, ...; the item is cached at every third node]
The item is cached once on the server-client path every radius hops.

Cache Lookaround
- Observation: the size of an item's location (a pointer) is much smaller than the average item size
- Use some item memory to store the locations of nearby items
- Trade cache memory for location information
[Figure: lookaround caching with 1 level of lookaround]
A request may be deflected (even off the shortest path to the server) to caches within the lookaround radius.

Simulation Methodology
Total cache size is held constant across methods.
Example: the total number of cache slots in the network is 12.
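The modulo caching rule described earlier can be sketched as a small helper (hypothetical, assuming hops are counted from the server and the item is stored at every radius-th hop):

```python
def modulo_cache_points(path_len, radius):
    """Hop indices (1-based, counted from the server) at which a reply
    is cached under modulo caching: once every `radius` hops along
    the server-client path."""
    return [hop for hop in range(1, path_len + 1) if hop % radius == 0]
```

With a radius of 3 and a 9-hop path, the item ends up cached at hops 3, 6, and 9, so a later request from any client in the region is at most a few hops from a copy.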
[Figure: dividing the 12 cache slots across the three schemes -
- Self-organizing schemes: cache size at each interface = 1 (nominal cache size = cache size at each interface); average cache size at each node ~ 1.72
- Transit-only caching: cache size at each interface = 6; average cache size at each node = 12
- Stub-connected-to-transit caching: cache size at each interface = 3; average cache size at each node = 6
(S: stub node, T: transit node, C: stub node connected to a transit node; shaded nodes are caching nodes)]
Base topology: 1500 nodes, average degree 3.71.
- The average transit-only cache is 25 times larger than the average active cache; the average stub-connected-to-transit (SCT) cache is 4.25 times larger.

Result: Varying Cache Size
[Figure 1: Low Access Correlation - round trip latency vs. nominal cache size (5-45) for Stub connected to Transit, Transit Only, Modulo Caching, and Modulo w/ 2 hop Lookaround; Server Dist 0.8, Repeat Prob 0.1, Modulo Radius 3; No Cache RTL = 13.23]
[Figure 2: High Access Correlation - same schemes and axes; Server Dist 0.8, Repeat Prob 0.5, Modulo Radius 3; No Cache RTL = 13.24]
- SCT caches are not able to reduce latency as well.
- Self-organizing schemes perform better as cache size increases, and as access correlations increase.

Result: Varying Server Distribution
[Figure: round trip latency vs. server distribution (fraction of total nodes, 0-0.5) for the same four schemes; Cache Size 16, Repeat Prob 0.5, Modulo Radius 3; No Cache RTL = 12.75-13.3]
- Self-organizing schemes are robust against server distribution.

Result: Varying Topology and Access Patterns
[Figure 3: Different Topologies - round trip latency for No Cache, No AN, Stub connected to Transit, Transit Only, Modulo Caching, Modulo w/ Lookaround, and Modulo w/ 2 hop Lookaround across the base, Higher Degree, Lower Degree, Less S-Domain, and More S-Domain topologies; Cache Size 8, Server Dist. 0.8, Repeat Prob. 0.3]
[Figure 4: Spatial Access Pattern - round trip latency (hops) vs. number of preferred servers (2-12) for the same schemes; Cache Size 8, Sdist 0.8, Repeat Prob 0.3, Modulo Radius 3]

Analysis
- Probabilistic analysis of cache performance
- Expressions for the latencies of the various methods
[Figure: comparison of analytic and simulation results for lookaround caches - round trip latency vs. cache size (0-1000); analysis curves with rtl = 12 and rtl = 14, simulation with rtl = 13.1; base topology, 50 location pointers per item, 2 levels of lookaround]

Analysis of Lookaround Benefit
Cache space is partitioned into data and location pointers.
Question: what is the optimal amount of cache space to devote to location pointers?
[Figure: round trip latency as a function of cache size (100-500) and the fraction of the cache devoted to local objects (0.1-0.9)]
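The data/pointer split behind this question can be sketched as follows. The function and its names are hypothetical; the figure of roughly 50 location pointers fitting in one item's worth of memory is taken from the analysis setup above:

```python
def partition_cache(cache_slots, local_fraction, pointers_per_slot=50):
    """Split a node's cache between locally stored items and location
    pointers.  A slot holds one item, or `pointers_per_slot` pointers,
    since a pointer is far smaller than an average item."""
    data_slots = int(cache_slots * local_fraction)
    pointer_entries = (cache_slots - data_slots) * pointers_per_slot
    return data_slots, pointer_entries
```

For example, devoting 20% of a 100-slot cache to pointers sacrifices 20 locally held items but lets the node locate 1000 items at nearby caches; the optimum balances these two effects.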
The valley in the plot is due to the benefits of lookaround caching.

Conclusions and Future Work
- Self-organizing schemes reduce latency using far smaller individual caches.
- Self-organizing schemes are robust against varying access patterns, topologies, and server distributions.
- Self-organizing algorithms are an active network application that is independent of individual users.
- Active network implementation using ANTS.
Future work: increase the activity of in-network processing:
- Utilize per-application, per-user semantics
- Increase per-item state in the network