REDUNDANCY IN NETWORK TRAFFIC: FINDINGS AND IMPLICATIONS
Ashok Anand, Chitra Muthukrishnan, Aditya Akella (University of Wisconsin-Madison)
Ramachandran Ramjee (Microsoft Research Lab, India)

Slide 2: Redundancy in network traffic
- Popular objects, partial content matches, headers
- Redundancy elimination (RE) for improving network efficiency
- Application layer: Web object caching, proxy caches
- Recent protocol independent RE approaches: WAN optimizers, de-duplication, WAN backups, etc.

Slide 3: Protocol independent RE
- Operates on a WAN link
- Message granularity: packet or object chunk
- Different RE systems operate at different granularities

Slide 4: RE applications
- Enterprises and data centers: accelerate WAN performance
- As a primitive in network architecture: Packet Caches [SIGCOMM 2008], Ditto [MobiCom 2008]

Slide 5: Protocol independent RE in enterprises
- Globalized enterprise dilemma:
  - Centralized servers (data centers): simple management, but a hit on performance
  - Distributed servers (enterprises): direct requests to the closest servers, but complex management
- RE gives the benefits of both worlds: deployed in network middle-boxes (WAN optimizers on ISP links), it accelerates WAN traffic while keeping management simple
- RE also accelerates WAN backup applications

Slide 6: Recent proposals for protocol independent RE
- Packet caches [SIGCOMM 2008]: RE on all routers; deployment on ISP access links improves capacity and reduces the load of web content on access links (ISP, university, enterprises)
- Ditto [MobiCom 2008]: RE on nodes in wireless mesh networks to improve throughput and effective capacity

Slide 7: Understanding protocol independent RE systems
- Currently little insight into these RE systems:
  - How far are these RE techniques from optimal? Are there other, better schemes?
  - When is network RE most effective?
  - Do end-to-end RE approaches offer performance close to network RE?
  - What fundamental redundancy patterns drive the design and bound the effectiveness?
- Important for effective design of current systems as well as future architectures, e.g.
Ditto and packet caches.

Slide 8: Large scale trace-driven study
- First comprehensive study of protocol independent RE
- Performance comparison of different RE algorithms: average bandwidth savings; savings at peak and 95th-percentile utilization; impact on burstiness
- Origins of redundancy: intra-user vs. inter-user; different protocols
- Patterns of redundancy: distribution of match lengths; hit distribution; temporal locality of matches
- Traces from multiple vantage points; focus on packet-level redundancy elimination

Slide 9: Data sets
- Enterprise packet traces (3 TB) with payload: 11 enterprises, small (10-50 IPs), medium (50-100 IPs), and large (100+ IPs); 2 weeks
  - Protocol composition: HTTP 20-55% (vs. 64% in Spring et al.), file sharing 25-70%; reflects centralization of servers
- UW-Madison packet traces (1.6 TB) with payload: 10,000 IPs, collected at the campus border router, plus outgoing web server traffic from a /24; two different periods of 2 days each
  - Protocol composition: incoming HTTP 60%, outgoing HTTP 36%

Slide 10: Evaluation methodology
- Emulate a memory-bound (500 MB - 4 GB) WAN optimizer: the entire cache resides in DRAM (packet-level RE)
- Emulate only redundancy elimination (WAN optimizers perform other optimizations as well)
- Deployment across both ends of access links: enterprise to data center; all traffic from the university to one ISP
- Replay the packet trace and compute bandwidth savings as saved bytes / total bytes, including packet headers in total bytes and the overhead of the shim headers used for encoding

Slide 11: Large scale trace-driven study (outline)
- Performance comparison of different RE algorithms
- Origins of redundancy
- Patterns of redundancy: distribution of match lengths, hit distribution

Slide 12: Redundancy elimination algorithms
- Redundancy suppression across different packets (uses history): MODP (Spring et al.), MAXP (new algorithm)
- Data compression only within packets (no history): GZIP and other variants

Slide 13: MODP — Spring et al.
[SIGCOMM 2000]
- Compute Rabin fingerprints over a sliding window of the packet payload
- Value sampling: sample those fingerprints whose value is 0 mod p
- Look up sampled fingerprints in the fingerprint table; matching content is retrieved from the packet store

Slide 14: MAXP
- Similar to MODP; only the selection criterion changes
- MODP samples fingerprints whose value is 0 mod p, which can leave regions with no fingerprint to represent them (the shaded region on the slide)
- MAXP chooses fingerprints that are local maxima (or minima) over a p-byte region, giving a uniform selection of fingerprints

Slide 15: Optimal
- Approximate upper bound on optimal: store every fingerprint in a Bloom filter and identify a fingerprint match whenever the Bloom filter contains the fingerprint
- Low false-positive rate for the Bloom filter: 0.1%

Slide 16: Comparison of MODP, MAXP and optimal
[Chart: bandwidth savings (%) vs. fingerprint sampling period p (4-128) for MODP, MAXP, and optimal]
- MAXP outperforms MODP by 5-10% in most cases
- MODP loses due to non-uniform clustering of fingerprints; MAXP's uniform sampling avoids this
- MAXP is a new RE algorithm that performs better than the classical MODP

Slide 17: Comparison of different RE algorithms
(Notation: "->" means followed by, e.g. "(10 ms)->GZIP" is 10 ms of buffering followed by GZIP.)
[Chart: bandwidth savings (%) for GZIP, (10 ms)->GZIP, MAXP, and MAXP->(10 ms)->GZIP across Small, Medium, Large, Univ/24, and Univ-out traces]
- GZIP offers 3-15% benefit; 10 ms of buffering before GZIP increases the benefit by up to 5%
- MAXP significantly outperforms GZIP, offering 15-60% bandwidth savings; MAXP->(10 ms)->GZIP further enhances the benefit by up to 8%
- Combinations of RE algorithms can thus be used to enhance bandwidth savings

Slide 18: Large scale trace-driven study (outline)
- Performance study of different RE algorithms
- Origins of redundancy
- Patterns of redundancy: distribution of match lengths, match distribution

Slide 19: Origins of redundancy
- Different users accessing the same content, or the same content being accessed repeatedly by the same user?
- Middle-box deployments can eliminate bytes shared across users
- How much sharing is there across users in practice?
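A minimal sketch of the two fingerprint-selection rules from slides 13-14 (MODP value sampling vs. MAXP local maxima). The 32-byte window is an assumption, and a truncated SHA-1 stands in for the rolling Rabin fingerprints a real RE box would use:

```python
import hashlib

WINDOW = 32  # bytes per fingerprinted window (assumed; the talk does not fix a size here)

def fingerprints(payload: bytes, w: int = WINDOW):
    """One fingerprint per w-byte window of the payload. A real implementation
    computes these incrementally with Rabin fingerprints; a truncated SHA-1
    stands in here for clarity, not speed."""
    return [int.from_bytes(hashlib.sha1(payload[i:i + w]).digest()[:8], "big")
            for i in range(len(payload) - w + 1)]

def modp_select(fps, p):
    """MODP value sampling: keep positions whose fingerprint is 0 mod p.
    Selection is content-driven, so samples can cluster and leave long
    unrepresented stretches (the shaded region on slide 13)."""
    return [i for i, fp in enumerate(fps) if fp % p == 0]

def maxp_select(fps, p):
    """MAXP: keep positions whose fingerprint is the maximum of the
    surrounding p positions on each side, giving a more uniform spread."""
    return [i for i, fp in enumerate(fps)
            if fp == max(fps[max(0, i - p):i + p + 1])]
```

Selected positions are what gets inserted into (and looked up in) the fingerprint table; on a hit, the packet store is consulted and the match is expanded byte-by-byte in both directions.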
Slide 19 (cont.): Match taxonomy
- INTER-USER: sharing across users — (a) INTER-SRC, (b) INTER-DEST, (c) INTER-NODE
- INTRA-USER: redundancy within the same user — (a) INTRA-FLOW, (b) INTER-FLOW
[Figure: flows between enterprise and data-center middleboxes illustrating each category]

Slide 20: Study of composition of redundancy
[Chart: contribution to savings (%) split into inter-user (inter-src, inter-dst, inter-node) and intra-user (intra-flow, inter-flow) for Small, Medium, Large, UIn, UOut, UOut/24]
- For UOut/24, 90% of the savings is across destinations
- For UIn/UOut, 30-40% of the savings is due to intra-user redundancy
- For enterprises, 75-90% of the savings is due to intra-user redundancy

Slide 21: Implication: end-to-end RE as a promising alternative
- End-to-end RE is a compelling design choice: similar savings, and deployment requires just a software upgrade
- Middle-boxes are expensive and may violate end-to-end semantics

Slide 22: Large scale trace-driven study (outline)
- Performance study of different RE algorithms
- End-to-end RE versus network RE
- Patterns of redundancy: distribution of match lengths, hit distribution

Slide 23: Match length analysis
- Do most of the savings come from full-packet matches? If so, the simple technique of indexing full packets would be good enough
- For partial packet matches, what should the minimum window size be?
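The intra-/inter-user taxonomy from slides 19-20 can be sketched as a small classifier. The (src_ip, dst_ip, flow_id) tuple layout and the mapping of each category to endpoint equality are assumptions for illustration, not taken from the talk:

```python
def classify_match(cur, hit):
    """Bucket a redundant match by comparing the endpoints of the current
    packet with those of the cached packet it matched. Each argument is an
    assumed (src_ip, dst_ip, flow_id) tuple."""
    (cs, cd, cf), (hs, hd, hf) = cur, hit
    if cs == hs and cd == hd:                          # same user pair: intra-user
        return "INTRA-FLOW" if cf == hf else "INTER-FLOW"
    if cs == hs:
        return "INTER-DEST"   # one source serving different destinations
    if cd == hd:
        return "INTER-SRC"    # different sources, one destination
    return "INTER-NODE"       # no endpoint shared
```

Under this mapping, the UOut/24 finding (90% of savings across destinations) corresponds to INTER-DEST dominating: one web server repeatedly sending the same content to many clients.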
Slide 24: Match length analysis for enterprise
[Chart: match length distribution and contribution to savings, binned by match length in bytes]
- 70% of the matches are shorter than 150 bytes and contribute 20% of the savings
- 10% of the matches are full-packet matches and contribute 50% of the savings
- Chunks as small as <= 150 bytes must be indexed for maximum benefit

Slide 25: Hit distribution
- What contributes the redundancy: a few pieces of content repeated many times (a small packet store would be sufficient), or many pieces of content repeated a few times (a large packet store is needed)?

Slide 26: Zipf-like distribution for chunk matches
- Chunk ranking: unique chunk matches sorted by their hit counts
- The straight-line fit shows the Zipfian distribution, similar to web page access frequency
- How much do popular chunks contribute to savings?

Slide 27: Savings due to hit distribution
- 80% of the savings comes from 20% of the chunks; the remaining 80% of chunks must be indexed for the last 20% of savings
- Diminishing returns for cache size

Slide 28: Savings vs. cache size
[Chart: savings (%) vs. cache size (0-1500 MB) for Small, Medium, Large]
- Small packet caches (250 MB) provide a significant percentage of the savings
- Diminishing returns for packet cache sizes beyond 250 MB

Slide 29: Conclusion
- First comprehensive study of protocol independent RE systems
- Key results:
  - 15-60% savings using protocol independent RE
  - A new RE algorithm that performs 5-10% better than the approach of Spring et al.
  - Zipfian distribution of chunk hits; small caches are sufficient to extract most of the redundancy
  - End-to-end RE solutions are promising alternatives to memory-bound WAN optimizers for enterprises

Slide 30: Questions? Thank you!
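The 80/20 behavior on slides 26-27 can be illustrated with a toy Zipf population (exponent 1, 100 chunks, both assumed): indexing chunks in popularity order shows how quickly the savings curve saturates.

```python
def cumulative_savings(hit_counts):
    """Fraction of match volume covered as chunks are indexed in popularity
    order. Chunks are assumed equal-sized, so hit counts proxy for saved bytes."""
    hits = sorted(hit_counts, reverse=True)
    total = sum(hits)
    curve, acc = [], 0.0
    for h in hits:
        acc += h
        curve.append(acc / total)
    return curve

# Toy Zipf population: the rank-r chunk is hit proportionally to 1/r.
zipf_hits = [1.0 / r for r in range(1, 101)]
curve = cumulative_savings(zipf_hits)
top20 = curve[19]  # savings fraction from the most popular 20% of chunks (~0.69 here)
```

For this toy population the top 20% of chunks already cover roughly 70% of the savings; the trace-derived curve on slide 27 is even more skewed (80%), which is why a small packet cache captures most of the benefit.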
Slide 31: Backup slides

Slide 32: Peak and 95th percentile savings
[Chart: savings (%) for mean, median, 95th percentile, and peak, over measurement timescales from 1 to 100,000 seconds]

Slide 33: Effect on burstiness
- Wavelet-based multi-resolution analysis; in the energy plot, higher energy means more burstiness; compared against uniform compression
- Results:
  - Enterprise: no reduction in burstiness; peak savings lower than average savings
  - University: reduction in burstiness; positive correlation of link utilization with redundancy

Slide 34: Redundancy across protocols

Large enterprise:
  Protocol        Volume (%)   Redundancy (%)
  HTTP            16.8         29.5
  SMB             45.46        21.4
  LDAP            4.85         44.33
  Src code ctrl   17.96        50.32

University:
  Protocol        Volume (%)   Redundancy (%)
  HTTP            58           12.49
  DNS             0.22         21.39
  RTSP            3.38         2
  FTP             0.04         16.93
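The savings metric from the evaluation methodology (slide 10) can be replayed in a few lines. The 40-byte header and 4-byte shim sizes are assumptions, and the duplicate-suppressing encoder is a toy stand-in for a real RE scheme:

```python
IP_TCP_HEADER = 40   # assumed header bytes counted per packet
SHIM = 4             # assumed shim-header bytes charged per encoded match

def bandwidth_savings(trace, encode):
    """Replay a packet trace and compute savings = saved bytes / total bytes,
    counting packet headers in the total and charging shim overhead to each
    encoded packet. encode(payload) -> (encoded_payload_len, num_shims) is
    whatever RE scheme is under test."""
    total = saved = 0
    for payload in trace:
        wire = IP_TCP_HEADER + len(payload)
        enc_len, shims = encode(payload)
        encoded = IP_TCP_HEADER + enc_len + shims * SHIM
        total += wire
        saved += max(0, wire - encoded)
    return saved / total if total else 0.0

# Toy encoder: suppress exact duplicate payloads, leaving one shim behind.
seen = set()
def dedup(payload):
    if payload in seen:
        return 0, 1
    seen.add(payload)
    return len(payload), 0

savings = bandwidth_savings([b"A" * 960, b"A" * 960, b"B" * 460], dedup)
```

Here the duplicate packet is collapsed to header plus shim (956 of 2500 bytes saved, ~38%), matching the slide-10 definition that both headers and shim overhead count against the savings.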