BUFFALO: Bloom Filter Forwarding Architecture for Large Organizations
Minlan Yu, Princeton University (minlanyu@cs.princeton.edu)
Joint work with Alex Fabrikant and Jennifer Rexford

Challenges in Edge Networks
• Edge networks
  – Enterprise and data center networks
• Large number of hosts
  – Tens or even hundreds of thousands of hosts
• Dynamic hosts
  – Host mobility, virtual machine migration
• Cost conscious
  – Equipment costs, network-management costs
We need an easy-to-manage and scalable network.

Flat Addresses
• Self-configuration
  – Hosts: MAC addresses
  – Switches: self-learning
• Simple host mobility
  – Flat addresses are location-independent
• But: poor scalability and performance
  – Flooding frames to unknown destinations
  – Inefficient delivery of frames over the spanning tree
  – Large forwarding tables (one entry per address)

Improving Performance and Scalability
• Recent proposals improve the control plane
  – TRILL (IETF), SEATTLE (SIGCOMM'08)
• Scaling information dissemination
  – Distribute topology and host information intelligently rather than by flooding
• Improving routing performance
  – Use shortest-path routing rather than a spanning tree
• Scaling with large forwarding tables
  – This talk: BUFFALO

Large-Scale SPAF Networks
• SPAF: Shortest Path on Addresses that are Flat
  – Shortest paths: scalability and performance
  – Flat addresses: self-configuration and mobility
  [Figure: an example SPAF topology of hosts (H) attached to a mesh of switches (S)]
• Data-plane scalability challenges
  – Forwarding-table growth (in the number of hosts and switches)
  – Increasing link speeds

State of the Art
• A hash table in SRAM stores the forwarding table
  – Maps MAC addresses (00:11:22:33:44:55, 00:11:22:33:44:66, ..., aa:11:22:33:44:77) to next hops
  – Hash collisions add extra delay
• Overprovision to avoid running out of memory
  – Performs poorly when out of memory
  – Difficult and expensive to upgrade memory

Bloom Filters
• Bloom filters in fast memory (SRAM)
  – A compact data structure for representing a set of elements
  – Compute s hash functions to store an element x
  – Easy to check membership
  – Reduced memory at the expense of false positives
  [Figure: a bit array V_0 ... V_{m-1}; inserting x sets the bits at positions h_1(x), h_2(x), ..., h_s(x)]

BUFFALO: Bloom Filter Forwarding
• One Bloom filter (BF) per next hop
  – Stores all addresses forwarded to that next hop
• A packet's destination address is queried against every BF; a hit selects the next hop
  [Figure: the packet destination is queried against the Bloom filters for next hops 1 through T; a hit selects the next hop]
A minimal code sketch of this structure follows.
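The sketch below is illustrative only, not the authors' implementation (the actual prototype is kernel-level Click). The salted-SHA-1 hashing and the build_fib helper are assumptions made for the example; BUFFALO additionally sizes each BF via the optimization described next.

```python
# Illustrative sketch: one Bloom filter per next hop, holding all the
# addresses forwarded to that next hop.
import hashlib

class BloomFilter:
    def __init__(self, m, s):
        self.m = m              # number of bits in the array
        self.s = s              # number of hash functions
        self.bits = [0] * m

    def _positions(self, x):
        # Derive s hash positions by salting one hash (an assumption for
        # this sketch). As on the slides, all BFs share the same hash
        # functions to bound CPU calculation time.
        for i in range(self.s):
            digest = hashlib.sha1(f"{i}:{x}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, x):
        for p in self._positions(x):
            self.bits[p] = 1

    def contains(self, x):
        # May return True for an element never added: a false positive.
        return all(self.bits[p] for p in self._positions(x))

def build_fib(routes, m, s):
    """routes: dict mapping MAC address -> next hop.
    Returns one BloomFilter per next hop. Sizes are uniform here for
    simplicity; BUFFALO instead sizes each BF by solving the convex
    optimization described below."""
    fib = {}
    for mac, nexthop in routes.items():
        fib.setdefault(nexthop, BloomFilter(m, s)).add(mac)
    return fib
```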
BUFFALO Challenges
• How to optimize memory usage?
  – Minimize the false-positive rate
• How to handle false positives quickly?
  – No memory or payload overhead
• How to handle routing dynamics?
  – Make it easy and fast to adapt the Bloom filters

Optimize Memory Usage
• Consider a fixed forwarding table
• Goal: minimize the overall false-positive rate
  – The probability that one or more BFs produce a false positive
• Input:
  – Fast memory size M
  – The number of destinations per next hop
  – The maximum number of hash functions
• Output: the size of each Bloom filter
  – Larger BFs for next hops with more destinations

Constraints and Solution
• Constraints
  – Memory: the sum of all BF sizes ≤ the fast memory size M
  – A bound on the number of hash functions, to bound CPU calculation time
  – All Bloom filters share the same hash functions
• Proved to be a convex optimization problem
  – An optimal solution exists
  – Solved with IPOPT (Interior Point OPTimizer)

Minimize False Positives
• Setup: a forwarding table with 200K entries and 10 next hops, 8 hash functions
• The optimization runs in about 50 msec
  [Figure: overall false-positive rate (log scale, 0.0001% to 100%) versus memory size (0 to 1000 KB)]

Comparing with a Hash Table
• BUFFALO saves 65% memory at a 0.1% false-positive rate
• More benefits over a hash table
  – Performance degrades gracefully as tables grow
  – Worst-case workloads are handled well
  [Figure: fast memory size (MB) versus forwarding-table entries (K), comparing a hash table with BFs at fp = 0.01%, 0.1%, and 1%]

False-Positive Detection
• A destination can match in multiple Bloom filters
  – One of the matches is correct
  – The others are caused by false positives
  [Figure: the packet destination is queried against all BFs; multiple next hops hit]

Handling False Positives
• Design goals
  – Do not modify the packet
  – Never go to slow memory
  – Ensure timely packet delivery
• BUFFALO's solution (sketched in code after this section)
  – Exclude the incoming interface: avoids loops in the "one false positive" case
  – Select randomly among the matching next hops: guarantees reachability even with multiple false positives

One False Positive
• The most common case is a single false positive
  – There are multiple matching next hops
  – Avoid sending the packet back out the incoming interface
• Provably at most a two-hop loop
  – Stretch ≤ Latency(A→B) + Latency(B→A)
  [Figure: the packet detours once between neighboring switches A and B before continuing to dst]

Multiple False Positives
• Select randomly among the matching next hops
  – The packet performs a random walk on the shortest-path tree plus a few false-positive links
  – It eventually finds a way to the destination
  [Figure: the shortest-path tree for dst, plus a false-positive link]

Stretch Bound
• Provable expected stretch bound
  – With k false positives, the expected stretch is at most 3^k / 3
  – Proved using random-walk theory
• The exponential bound is not bad in practice
  – False positives are independent
  – The probability of k simultaneous false positives drops exponentially in k
• Tighter bounds hold for special topologies
  – For trees, the expected stretch is polynomial in k

Stretch in a Campus Network
• When fp = 0.5%, 99.9% of the packets have no stretch, and only 0.0002% of packets have a stretch of 6 times the shortest path length
• When fp = 0.001%, only 0.0003% of packets have a stretch equal to the shortest path length
  [Figure: CCDF (10^-6 to 1) of stretch normalized by shortest path length, for fp = 0.5%, 0.1%, and 0.001%]
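A minimal sketch of the false-positive handling just described, under the same assumptions as the earlier sketch (the fib dict of per-next-hop Bloom filters and their contains() method come from that sketch; this is not the prototype's API):

```python
import random

def forward(fib, dst, in_port):
    """Pick an outgoing next hop for dst.
    fib: dict mapping next hop -> a Bloom filter with contains().
    Excluding the incoming interface avoids the loop in the common
    one-false-positive case; choosing uniformly at random among the
    remaining matches turns multiple false positives into a random
    walk that eventually reaches dst."""
    matches = [nh for nh, bf in fib.items() if bf.contains(dst)]
    if not matches:
        # With a complete FIB every reachable destination hits at
        # least one BF; treat a total miss as "no route" here.
        return None
    # Exclude the interface the packet arrived on, unless it is the
    # only match (never drop a packet because of a false positive).
    candidates = [nh for nh in matches if nh != in_port] or matches
    return random.choice(candidates)
```

Note that the lookup never touches slow memory and never modifies the packet, matching the design goals above.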
Back to the BUFFALO challenges: the remaining one is handling routing dynamics, by making it easy and fast to adapt the Bloom filters.

The Problem with Plain Bloom Filters
• Routing changes add and delete entries in the BFs
• A Bloom filter does not allow deleting an element
• Counting Bloom filters (CBFs)
  – Use a counter instead of a bit in each array position
  – Can handle both adding and deleting elements
  – But require more memory than BFs

Updates on Routing Changes
• Keep a CBF in slow memory
  – It assists the BF in handling forwarding-table updates
  – Adding or deleting a forwarding-table entry is easy: update the CBF counters, then set each BF bit to 1 exactly when the corresponding counter is nonzero
  [Figure: deleting a route decrements counters in the slow-memory CBF; the fast-memory BF has a 1 wherever the CBF counter is nonzero]

Occasionally Resize the BF
• Under significant routing changes
  – The number of addresses in the BFs changes significantly
  – Re-optimize the BF sizes
• Use the CBF to assist in resizing the BF
  – Keep a large CBF alongside the small BF
  – A BF is hard to expand directly, but the large CBF is easy to contract to any new BF size (sketched at the end of this deck)
  [Figure: a large CBF is easily contracted to size 4; a smaller BF is hard to expand to size 4]

BUFFALO Switch Architecture
• Prototype implemented in kernel-level Click
  [Figure: the BUFFALO switch architecture]

Prototype Evaluation
• Environment
  – 3.0 GHz 64-bit Intel Xeon
  – 2 MB L2 data cache, used as the fast memory of size M
• Forwarding table
  – 10 next hops
  – 200K entries
• Peak forwarding rate
  – 365 Kpps, 1.9 μs per packet
  – 10% faster than the hash-based EtherSwitch
• Performance with FIB updates
  – 10.7 μs to update a route
  – 0.47 s to reconstruct the BFs from the CBFs, on another core, without disrupting packet lookups
  – Swapping in the new BFs is fast

Conclusion
• BUFFALO
  – Improves data-plane scalability in SPAF networks
  – To appear in CoNEXT 2009
• Three properties
  – Small, bounded memory requirement
  – Stretch grows gracefully as the forwarding table grows
  – Fast reaction to routing updates

Ongoing Work
• More theoretical analysis of stretch
  – Stretch in weighted graphs
  – Stretch under ECMP (Equal-Cost Multi-Path)
  – Analysis of the average stretch
• Efficient access control in SPAF networks
  – More complicated than the forwarding table
  – Scaling access control using OpenFlow
  – Leveraging properties of SPAF networks

Thanks! Questions?
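Backup: a minimal sketch of the CBF-assisted updates and resizing described above. This is illustrative, not the prototype's Click code; the CountingBloomFilter class, the salted-SHA-1 hashing, and the contract helper are assumptions, and the folding trick assumes the small BF size divides the CBF size so hash positions reduce cleanly.

```python
import hashlib

class CountingBloomFilter:
    """Counting Bloom filter kept in slow memory to absorb route
    updates; a plain BF cannot delete an element, a CBF can."""
    def __init__(self, m, s):
        self.m = m              # number of counters
        self.s = s              # number of hash functions
        self.counters = [0] * m

    def _positions(self, x):
        # Same illustrative salted-SHA-1 hashing as the earlier sketches.
        for i in range(self.s):
            digest = hashlib.sha1(f"{i}:{x}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, x):
        for p in self._positions(x):
            self.counters[p] += 1

    def delete(self, x):
        for p in self._positions(x):
            self.counters[p] -= 1

def contract(cbf, m_small):
    """Fold the large CBF down to a small bit vector for fast memory.
    Assumes m_small divides cbf.m, so a lookup position h mod cbf.m
    reduces to (h mod cbf.m) mod m_small. Bit p is set iff any counter
    congruent to p (mod m_small) is nonzero, so re-optimizing a BF size
    is just re-folding the CBF, without rebuilding from the FIB."""
    assert cbf.m % m_small == 0
    bits = [0] * m_small
    for p, c in enumerate(cbf.counters):
        if c > 0:
            bits[p % m_small] = 1
    return bits
```

Rebuilding the fast-memory BFs from the slow-memory CBF in this way is what lets the prototype reconstruct them on another core (0.47 s for 200K entries) and swap them in without disrupting packet lookups.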