Packet Caches on Routers: The Implications of Universal Redundant Traffic Elimination
Ashok Anand, Archit Gupta, Aditya Akella (University of Wisconsin-Madison)
Srinivasan Seshan (Carnegie Mellon University)
Scott Shenker (University of California, Berkeley)

Redundant Traffic in the Internet
• Lots of redundant traffic in the Internet
• Redundancy arises due to:
– Identical objects
– Partial content matches (e.g., page banners)
– Application headers
– …
[Figure: the same content traversing the same set of links at time T and again at time T + 5]

Redundancy Elimination
• Object-level caching
– Application-layer approaches such as Web proxy caches
– Store static objects in a local cache
– [Summary Cache: SIGCOMM '98; Cooperative Caching: SOSP '99]
• Packet-level caching
– [Spring et al.: SIGCOMM '00]
– WAN optimization products: Riverbed, Peribit, Packeteer, …
[Figure: an enterprise connected to the Internet over an access link, with a packet cache at each end of the link]
• Packet-level caching is better than object-level caching

Benefits of Redundancy Elimination
– Reduced bandwidth usage cost
– Reduced network congestion at access links
– Higher throughputs
– Reduced transfer completion times

Towards Universal RE
• However, existing RE approaches apply only to point deployments
– e.g., at stub network access links, or between branch offices
• They benefit only the systems to which they are directly connected
• Why not make RE a native network service that everyone can use?

Our Contribution
• Universal redundancy elimination on routers is beneficial
• Redesigning the routing protocol to be redundancy-aware yields further benefits
• Redundancy elimination is practical to implement

Universal Redundancy Elimination at All Routers
[Figure: Wisconsin sending traffic over Internet2 to CMU and Berkeley, with a packet cache at every router]
• Packet cache at every router
• An upstream router removes redundant bytes; the downstream router reconstructs the full packet
• Total packets without RE = 18; with universal RE = 12 (ignoring tiny packets), a 33% reduction

Benefits of Universal Redundancy Elimination
• Subsumes the benefits of point deployments
• Also benefits Internet service providers
– Reduces the total traffic carried, enabling better traffic engineering
– Better responsiveness to sudden overload (e.g., flash crowds)
• Redesigning network protocols with redundancy elimination in mind can further enhance the benefits of universal RE

Redundancy-Aware Routing
[Figure: Wisconsin sending traffic to CMU and Berkeley, with routes chosen to overlap]
• The ISP needs information about traffic similarity between CMU and Berkeley
• The ISP needs to compute redundancy-aware routes
• Total packets with RE = 12; with RE + routing = 10 (a further 20% benefit, 45% overall)

Redundancy-Aware Routing
• Intra-domain routing for an ISP
• Every N minutes:
– Each border router computes a redundancy profile for the first T seconds of the N-minute interval
• Estimates how traffic is replicated across the other border routers
• A high-speed algorithm computes the profiles (a toy estimation sketch follows the route-computation slide below)
– Redundancy-aware routes are computed centrally
• Traffic for the next N minutes is routed on the redundancy-aware routes
• Redundancy elimination is applied hop-by-hop

Redundancy Profile Example
[Figure: Wisconsin sending traffic over Internet2 to CMU (Pittsburgh) and Berkeley]
• Data_unique,Pittsburgh = 30 KB, Data_unique,Berkeley = 30 KB, Data_shared = 20 KB
• Total_CMU = 50 KB, Total_Berkeley = 50 KB

Centralized Route Computation
• Routes are computed on a centralized platform using a linear program (sketched below)
• Objective: minimize the total traffic footprint on ISP links
– The traffic footprint of a link is the link's latency times the total unique content carried by the link
• Compute narrow, deep trees that aggregate redundant traffic as much as possible
• Impose flow conservation and capacity constraints
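To make the objective concrete, the program just described can be sketched in LaTeX as follows. The notation (per-destination flows $f_e^{d}$, demands $s_v^{d}$, latency $\mathit{lat}_e$, capacity $c_e$, footprint $F_e$) is assumed here for illustration and is not claimed to be the paper's exact formulation:

\[
\min \sum_{e \in E} \mathit{lat}_e \cdot F_e
\quad \text{s.t.} \quad
\sum_{e \in \delta^{+}(v)} f_e^{d} - \sum_{e \in \delta^{-}(v)} f_e^{d} = s_v^{d} \;\; \forall v \in V,\ \forall d,
\qquad F_e \le c_e \;\; \forall e \in E,
\]

where $F_e$ is derived from the flows and the redundancy profile so that bytes shared among destinations whose flows overlap on link $e$ are charged only once (hence $F_e \le \sum_d f_e^{d}$). Because overlapping flows are charged once, the objective naturally rewards the narrow, deep trees mentioned above.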
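The redundancy profiles that drive this computation, illustrated two slides back, could be estimated roughly as in the following Python sketch. The function names, the sampling rule, and the use of Python's built-in hash in place of Rabin/CRC fingerprints are all illustrative stand-ins, not the paper's high-speed algorithm:

    from collections import defaultdict

    def chunk_fingerprints(payload, window=64, sample_mod=32):
        """Toy stand-in for Rabin/CRC fingerprinting: hash sliding windows
        of the payload and keep a value-based sample of the hashes."""
        fps = []
        for i in range(max(len(payload) - window, 0) + 1):
            h = hash(payload[i:i + window])
            if h % sample_mod == 0:
                fps.append(h)
        return fps

    def build_profile(packets):
        """packets: iterable of (payload_bytes, egress_router) pairs.
        Returns per-egress unique counts and per-egress-set shared counts,
        in sampled-fingerprint units (scale by bytes per sample for KB)."""
        seen = defaultdict(set)                 # fingerprint -> egresses seen
        for payload, egress in packets:
            for fp in chunk_fingerprints(payload):
                seen[fp].add(egress)
        unique, shared = defaultdict(int), defaultdict(int)
        for egresses in seen.values():
            if len(egresses) == 1:
                unique[next(iter(egresses))] += 1
            else:
                shared[frozenset(egresses)] += 1
        return dict(unique), dict(shared)

On the example profile above, traffic from Wisconsin would yield roughly equal unique counts for Pittsburgh and Berkeley plus a shared component, corresponding to the 30 KB / 30 KB / 20 KB split after scaling.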
Inter-domain Routing
• The ISP selects the neighbor AS and the border router for each destination
• Goal: minimize the impact of inter-domain traffic on intra-domain links and peering links
• Challenges:
– Need to consider AS relationships, peering locations, and route announcements
– Need to compute redundancy profiles across destination ASes
• Details in the paper

Trace-Based Evaluation
• Trace-based study of three schemes:
– RE + Routing: redundancy-aware routing
– RE: shortest-path routing with redundancy elimination
– Baseline: shortest-path routing without redundancy elimination
• Packet traces:
– Collected at the University of Wisconsin access link
– Separately captured the outgoing traffic of a group of high-volume Web servers at the University of Wisconsin
• Represents a moderate-sized data center
• Rocketfuel ISP topologies
• Results below are for intra-domain routing on the Web server trace

Benefits in Total Network Footprint
[Figure: CDF, across AT&T border routers, of the reduction in network footprint; series: RE, RE + Routing]
• The average redundancy of this Web server trace is 50% using a 2 GB cache
• AT&T topology, 2 GB cache per router
• RE gives a reduction of 10-35%
• RE + Routing gives a reduction of 20-45%

When is RE + Routing Beneficial?
• Topology effects
– e.g., multiple multi-hop paths between pairs of border routers
• Redundancy profile
– A lot of duplication across border routers

Synthetic Trace Based Study
• Synthetic traces cover a wide range of situations (a toy generator sketch follows the utilization results below):
– Duplicates striped across border routers in the ISP (inter-flow redundancy)
– Little striping across border routers, but high redundancy within the traffic to a single border router (intra-flow redundancy)
– Helps isolate the topology effect

Benefits in Total Network Footprint
[Figure: reduction in network footprint vs. intra-flow redundancy (0 to 1); series: RE, RE + Routing]
• Synthetic trace, average redundancy = 50%
• AT&T (AS 7018) topology; the trace is assumed to enter at Seattle
• At high intra-flow redundancy, RE + Routing is close to RE, with a 50% benefit
• At zero intra-flow redundancy, RE gives an 8% benefit
• At zero intra-flow redundancy, RE + Routing gives a 26% benefit

Benefits in Max Link Utilization
[Figure: reduction in max link utilization for (overall redundancy, inter-flow redundancy) = (0.2, 1.0) and (0.5, 0.5); series: RE, RE + Routing]
• Link capacities are either 2.5 or 10 Gbps
• Comparison against traditional OSPF-based traffic engineering (SP-MaxLoad); max link utilization is 80% for SP-MaxLoad
• RE offers 1-25% lower maximum link load
• RE + Routing offers 10-37% lower maximum link load
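For concreteness, synthetic traces of the kind used above could be generated along the following lines. This is a toy sketch; synth_trace, its redundancy and p_intra knobs (standing in for average redundancy and intra-flow redundancy), and the uniform egress choice are all assumptions, not the paper's actual generator:

    import os
    import random

    def synth_trace(n_packets, egresses, redundancy=0.5, p_intra=0.5, size=1400):
        """egresses: list of border-router ids. Returns a list of
        (payload, egress) pairs with tunable redundancy structure."""
        per_egress = {e: [] for e in egresses}  # content reused within one egress
        global_pool = []                        # content striped across egresses
        trace = []
        for _ in range(n_packets):
            egress = random.choice(egresses)
            if random.random() < redundancy:
                # Redundant packet: repeat intra-flow or inter-flow content.
                pool = per_egress[egress] if random.random() < p_intra else global_pool
                payload = random.choice(pool) if pool else os.urandom(size)
            else:
                payload = os.urandom(size)      # fresh, unique content
            per_egress[egress].append(payload)
            global_pool.append(payload)
            trace.append((payload, egress))
        return trace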
Evaluation Summary
• RE significantly reduces network footprint
• RE significantly improves traffic engineering objectives
• RE + Routing further enhances these benefits
• Highly beneficial in flash crowd situations
• Highly beneficial for inter-domain traffic engineering

Implementing RE on Routers
[Figure: per-packet fingerprints indexing a fingerprint table that points into a packet store]
• Main operations:
– Fingerprint computation
• Easy; can be done with CRC
– Memory operations: reads and writes to the fingerprint table and the packet store

High Speed Implementation
• Reduced the number of memory operations per packet
– Fixed number of fingerprints (< 10 per packet)
– Lazy invalidation of fingerprints upon packet eviction
– Other optimizations in the paper
• A Click-based software prototype runs at 2.3 Gbps (approximately OC-48 speed)

Summary
• RE at every router is beneficial (10-50%)
• Further benefits (10-25%) come from redesigning the routing protocol to be redundancy-aware
• OC-48 speed is attainable in software

Thank you

Backup

Flash Crowd Simulation
[Figure: max link utilization vs. volume increment factor; series: SP-MaxLoad, SP-RE, RA]
• Flash crowd: volume increases at one of the border routers
– Redundancy: 20% -> 50%
– Inter-redundancy fraction: 0.5 -> 0.75
– Max link utilization without RE is 50%
• Traditional OSPF traffic engineering drives links to 95% utilization at a volume increment factor above 3.5
• SP-RE stays at 85%, and RA lower still at 75%

Impact of Stale Redundancy Profile
[Figure: reduction in network footprint for five high-volume /24 traces; series: SP-RE, RA, RA-stale]
• RA relies on redundancy profiles; how stable are these profiles?
• Used the same profile to compute the reduction in network footprint at later times (within an hour)
• RA-stale is quite close to RA

High Speed Implementation
• Use specialized hardware for fingerprint computation
• Reduced the number of memory operations per packet
– The number of memory operations is a function of the number of fingerprints, so the number of sampled fingerprints is fixed
– Explicitly invalidating fingerprints when evicting a packet requires memory operations, so lazy invalidation is used instead
• A fingerprint pointer is checked for validity as well as existence
• Store the packet table and fingerprint table in DRAM for high speed
– Used a cuckoo hash table, since a simple hash-based fingerprint table is too large to fit in DRAM

Base Implementation Details (Spring et al.)
• Compute fingerprints per packet and sample them
• Insert the packet into the packet store
• Check whether any fingerprint points to an earlier packet, for match detection
• Encode the matched region in the packet
• Insert each fingerprint into the fingerprint table
• As the store becomes full, evict packets in FIFO order
• As a packet is evicted, invalidate its corresponding fingerprint pointers
(A condensed sketch of these steps, with the lazy-invalidation variant, follows.)
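The steps above, combined with the fixed fingerprint budget and lazy invalidation from the high-speed slides, can be condensed into a minimal Python sketch. The class, its field names, the hash-based fingerprinting, and the (pkt_id, old_offset, new_offset, length) match encoding are simplified illustrations, not the prototype's actual code:

    from collections import deque

    WINDOW = 64           # bytes covered by each fingerprint
    MAX_FPS = 10          # fixed number of sampled fingerprints per packet
    STORE_PKTS = 1 << 15  # FIFO packet-store capacity, in packets

    class PacketCache:
        def __init__(self):
            self.store = deque()    # FIFO of (pkt_id, payload)
            self.payloads = {}      # pkt_id -> payload (packet-store index)
            self.fp_table = {}      # fingerprint -> (pkt_id, offset)
            self.next_id = 0

        def _fingerprints(self, payload):
            """Value-based sampling of window hashes, capped at MAX_FPS."""
            fps = []
            for off in range(max(len(payload) - WINDOW, 0) + 1):
                h = hash(payload[off:off + WINDOW])
                if h % 32 == 0:
                    fps.append((h, off))
                    if len(fps) == MAX_FPS:
                        break
            return fps

        def encode(self, payload):
            """Detect matches against cached packets, then cache this one.
            A real encoder would strip matched bytes from the wire format."""
            matches = []
            fps = self._fingerprints(payload)
            for h, off in fps:
                entry = self.fp_table.get(h)
                if entry is None:
                    continue
                pkt_id, old_off = entry
                old = self.payloads.get(pkt_id)
                if old is None:
                    # Lazy invalidation: the pointer outlived its packet,
                    # so it is dropped on use rather than at eviction time.
                    del self.fp_table[h]
                    continue
                if old[old_off:old_off + WINDOW] == payload[off:off + WINDOW]:
                    matches.append((pkt_id, old_off, off, WINDOW))
            pid = self.next_id
            self.next_id += 1
            if len(self.store) == STORE_PKTS:      # FIFO eviction; fingerprints
                old_id, _ = self.store.popleft()   # are NOT eagerly invalidated
                del self.payloads[old_id]
            self.store.append((pid, payload))
            self.payloads[pid] = payload
            for h, off in fps:
                self.fp_table[h] = (pid, off)
            return matches, payload

A real encoder would also expand each match beyond the sampled window in both directions, and the downstream router would reverse the encoding using its own synchronized packet store; a hardware version would replace Python's hash with CRC-style fingerprints and use a cuckoo hash table for fp_table, as described on the slides above.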