CoNEXT’09
BUFFALO: Bloom Filter Forwarding Architecture for Large Organizations
Minlan Yu, Princeton University (minlanyu@cs.princeton.edu)
Joint work with Alex Fabrikant, Jennifer Rexford

Large-scale SPAF Networks
• Large networks using flat addresses
  – Simple to configure and manage
  – Useful for enterprise and data center networks
• SPAF: Shortest Path on Addresses that are Flat
  – Flat addresses (e.g., MAC addresses)
  – Shortest-path routing (link state, distance vector)
[Figure: an example SPAF topology of switches (S) connecting hosts (H).]

Scalability Challenges
• Recent advances in the control plane
  – E.g., TRILL, SEATTLE
  – Topology and host-information dissemination
  – Route computation
• The data plane remains a challenge
  – Forwarding tables grow with the number of hosts and switches
  – Link speeds keep increasing

State of the Art
• A hash table in SRAM stores the forwarding table
  – Maps MAC addresses (00:11:22:33:44:55, 00:11:22:33:44:66, …, aa:11:22:33:44:77) to next hops
  – Hash collisions add extra delay
• Memory is overprovisioned to avoid running out
  – Switches perform poorly when memory runs out
  – Upgrading memory is difficult and expensive

Bloom Filters
• Bloom filters fit in fast memory (SRAM)
  – A compact data structure representing a set of elements
  – Storing an element x sets the bits at s hash positions h1(x), h2(x), …, hs(x)
  – Checking membership is easy: test whether all s bits are set
  – Memory is reduced at the expense of false positives
[Figure: a bit array V0 … Vm-1 with the positions h1(x), h2(x), h3(x), …, hs(x) set to 1 for element x.]

BUFFALO: Bloom Filter Forwarding
• One Bloom filter (BF) per next hop
  – Stores all addresses forwarded to that next hop
  – A lookup queries the packet's destination against the filters for next hops 1 through T; a hit selects the next hop (a code sketch follows the hash-table comparison slide below)

BUFFALO Challenges
• How to optimize memory usage?
  – Minimize the false-positive rate
• How to handle false positives quickly?
  – No memory or payload overhead
• How to handle routing dynamics?
  – Make it easy and fast to adapt the Bloom filters

Optimize Memory Usage
• Consider a fixed forwarding table
• Goal: minimize the overall false-positive rate
  – The probability that one or more BFs have a false positive
• Input:
  – Fast memory size M
  – Number of destinations per next hop
  – The maximum number of hash functions
• Output: the size of each Bloom filter
  – Larger BFs for next hops with more destinations

Constraints and Solution
• Constraints
  – Memory: the sizes of all BFs must sum to at most the fast memory size M
  – A bound on the number of hash functions, to bound CPU time; all Bloom filters share the same hash functions
• Proved to be a convex optimization problem (a numeric sketch follows below)
  – An optimal solution exists
  – Solved with IPOPT (Interior Point OPTimizer)

Minimize False Positives
• Forwarding table with 200K entries, 10 next hops
• 8 hash functions
• The optimization runs in about 50 msec
[Figure: overall false-positive rate, on a log scale from 0.0001% to 100%, versus fast memory size from 0 to 1000 KB.]

Comparing with a Hash Table
• BUFFALO saves 65% memory at a 0.1% false-positive rate
• More benefits over a hash table
  – Performance degrades gracefully as tables grow
  – Worst-case workloads are handled well
[Figure: fast memory size (MB) versus forwarding-table size (0 to 2000K entries) for a hash table and for BUFFALO at fp = 0.01%, 0.1%, and 1%.]
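To make the forwarding slides concrete, here is a minimal Python sketch of the data plane described above: one Bloom filter per next hop, with a lookup that queries every filter. The class names, the double-hashing scheme, and all sizes are illustrative assumptions; the actual prototype is kernel-level Click, not this code.

```python
import hashlib

class BloomFilter:
    """Plain Bloom filter: an m-bit array written by s hash functions."""
    def __init__(self, m, s):
        self.m = m              # number of bits
        self.s = s              # number of hash functions
        self.bits = [0] * m

    def _positions(self, x):
        # Derive s positions from one digest via double hashing; a real
        # switch would use s hardware-friendly hash functions instead.
        d = hashlib.sha256(x.encode()).digest()
        h1 = int.from_bytes(d[:8], "big")
        h2 = int.from_bytes(d[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.s)]

    def add(self, x):
        for p in self._positions(x):
            self.bits[p] = 1

    def query(self, x):
        # True = "possibly stored" (false positives happen);
        # False = "definitely not stored" (no false negatives).
        return all(self.bits[p] for p in self._positions(x))


class BuffaloFIB:
    """One Bloom filter per next hop; a lookup queries all of them."""
    def __init__(self, next_hops, m, s):
        self.filters = {nh: BloomFilter(m, s) for nh in next_hops}

    def install(self, mac, next_hop):
        self.filters[next_hop].add(mac)

    def lookup(self, mac):
        # More than one hit means at least one false positive;
        # how BUFFALO resolves that is covered in the next section.
        return [nh for nh, bf in self.filters.items() if bf.query(mac)]


fib = BuffaloFIB(["nexthop1", "nexthop2"], m=8192, s=8)
fib.install("00:11:22:33:44:55", "nexthop1")
print(fib.lookup("00:11:22:33:44:55"))   # ['nexthop1'], barring false positives
```

The dictionary-of-filters layout mirrors the slide's figure; in hardware the T filters would be probed in parallel rather than in a Python loop.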
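The size-optimization slides can likewise be illustrated numerically. The sketch below evaluates the standard Bloom filter false-positive approximation and the overall rate the talk minimizes; a coarse grid search stands in for the convex solver (the talk uses IPOPT), and the budget and address counts are made-up toy numbers.

```python
import math

def fp_rate(m_bits, n_elems, s_hashes):
    # Standard single-filter approximation: (1 - e^{-s*n/m})^s.
    return (1.0 - math.exp(-s_hashes * n_elems / m_bits)) ** s_hashes

def overall_fp_rate(sizes, counts, s_hashes):
    # Probability that one or more of the T filters false-positives on a
    # lookup, treating the filters as independent (the talk's objective).
    p_none = 1.0
    for m, n in zip(sizes, counts):
        p_none *= 1.0 - fp_rate(m, n, s_hashes)
    return 1.0 - p_none

# Toy instance: two next hops sharing a 1 Mbit budget. Sweeping the split
# shows the optimum gives the larger filter to the next hop with more
# addresses, as the slide's "output" bullet says.
M = 1_000_000
counts = [150_000, 50_000]
best = min((overall_fp_rate([m1, M - m1], counts, 8), m1)
           for m1 in range(100_000, M - 100_000, 10_000))
print(f"best overall fp rate {best[0]:.2%} at split {best[1]}/{M - best[1]}")
```

The toy budget is deliberately tight so the sweep is visible; the operating points on the memory-size chart above sit much further down the curve.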
BUFFALO Challenges (continued)
• Next question: how to handle false positives quickly, with no memory or payload overhead

False Positive Detection
• Multiple matches in the Bloom filters
  – One of the matches is correct
  – The others are caused by false positives
[Figure: a packet's destination hitting the Bloom filters of several of the T next hops at once.]

Handle False Positives
• Design goals
  – Do not modify the packet
  – Never go to slow memory
  – Ensure timely packet delivery
• BUFFALO solutions (sketched in code after the stretch results below)
  – Exclude the incoming interface: avoids loops in the "one false positive" case
  – Random selection among matching next hops: guarantees reachability under multiple false positives

One False Positive
• The most common case: one false positive
  – When there are multiple matching next hops, avoid sending the packet back out its incoming interface
• Provably at most a two-hop loop
  – Stretch <= Latency(A to B) + Latency(B to A)
[Figure: a false positive detours a packet for dst from switch A to switch B and back.]

Multiple False Positives
• Handled by random selection among the matching next hops
  – Equivalent to a random walk on the shortest-path tree plus a few false-positive links
  – The walk eventually finds a way to the destination
[Figure: the shortest-path tree for dst with a few false-positive links added.]

Stretch Bound
• Provable expected stretch bound
  – With k false positives, the expected stretch is proved to be at most 3^k / 3
  – Proved using random-walk theory
• In practice, however, the stretch stays far below this bound
  – False positives are independent
  – So the probability of seeing k false positives drops exponentially in k
• Tighter bounds hold for special topologies
  – For trees, the expected stretch is polynomial in k

Stretch in Campus Network
[Figure: CCDF of packet stretch, normalized by shortest-path length (0 to 7), for fp = 0.5%, 0.1%, and 0.001%. When fp = 0.001%, 99.9% of the packets have no stretch; even when fp = 0.5%, only 0.0003% of packets have a stretch equal to the shortest-path length, and only 0.0002% have a stretch of 6 times the shortest-path length.]
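As a recap of the false-positive handling rules above, here is a hedged sketch of the next-hop selection step. The matches list would come from the per-next-hop Bloom filter lookup sketched earlier; the function and parameter names are again illustrative, not the paper's code.

```python
import random

def pick_next_hop(matches, incoming):
    """matches: next hops whose filters hit (never empty: the true next hop
    always matches, since Bloom filters have no false negatives).
    incoming: interface the packet arrived on (None at the first hop)."""
    if len(matches) == 1:
        return matches[0]                  # common case: no false positive
    # Rule 1: exclude the incoming interface, which bounds the
    # one-false-positive case to at most a two-hop detour. The fallback
    # to matches covers the corner case where it was the only match.
    candidates = [nh for nh in matches if nh != incoming] or matches
    # Rule 2: choose uniformly at random among the rest, turning the
    # multi-false-positive case into a random walk that reaches dst.
    return random.choice(candidates)

print(pick_next_hop(["nexthop1"], incoming=None))            # nexthop1
print(pick_next_hop(["nexthop1", "nexthop3"], "nexthop3"))   # nexthop1
```

Both rules run entirely in fast memory and never touch the packet, matching the design goals on the slide.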
BUFFALO Challenges (continued)
• Last question: how to handle routing dynamics, by making the Bloom filters easy and fast to adapt

Problem of Bloom Filters
• Routing changes add and delete entries in the BFs
• The problem: plain Bloom filters do not allow deleting an element
• Counting Bloom filters (CBFs)
  – Use a counter instead of a bit at each array position
  – Can handle both adding and deleting elements
  – But require more memory than BFs

Update on Routing Change
• Use a CBF in slow memory
  – The CBF assists the BF in handling forwarding-table updates
  – Adding or deleting a forwarding-table entry becomes easy: adjust the CBF's counters, then refresh the BF (a backup code sketch appears at the end of the deck)
[Figure: deleting a route decrements counters in the slow-memory CBF; each bit of the fast-memory BF is set exactly where the corresponding counter is nonzero.]

Occasionally Resize BF
• Under significant routing changes
  – The number of addresses in the BFs changes significantly
  – Re-optimize the BF sizes
• Use the CBF to assist in resizing the BF
  – Keep a large CBF and a small BF
  – A CBF is easy to contract to a smaller size, while a BF is hard to expand
[Figure: a CBF is easily contracted to size 4 by folding its counters, but a small BF cannot be expanded to size 4 without the original elements.]

BUFFALO Switch Architecture
• Prototype implemented in kernel-level Click
[Figure: the BUFFALO switch architecture.]

Prototype Evaluation
• Environment
  – 3.0 GHz 64-bit Intel Xeon
  – 2 MB L2 data cache, used as the fast memory of size M
• Forwarding table
  – 10 next hops
  – 200K entries

Evaluation Results
• Peak forwarding rate
  – 365 Kpps, 1.9 μs per packet
  – 10% faster than the hash-based EtherSwitch
• Performance under FIB updates
  – 10.7 μs to update a route
  – 0.47 s to reconstruct the BFs from the CBFs, on another core and without disrupting packet lookups
  – Swapping in the new BFs is fast

Conclusion
• Three properties of BUFFALO
  – Small, bounded memory requirement
  – Stretch increases gracefully as the forwarding table grows
  – Fast reaction to routing updates
• Key design decisions
  – One Bloom filter per next hop
  – Optimized Bloom filter sizes
  – Forwarding-loop prevention
  – Dynamic updates using counting Bloom filters

Thanks! Questions?
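Backup: a sketch of the CBF-assisted update path from the routing-dynamics slides, reusing the BloomFilter class from the earlier sketch (assumed to be in scope). The fold-based contraction is one standard way to realize the "easy to contract" step on the resize slide; treat it as an assumption about the mechanism, not the paper's exact code.

```python
class CountingBloomFilter:
    """CBF kept in slow memory: counters instead of bits, so entries can be
    deleted; the fast-memory BF is re-derived from it after each update."""
    def __init__(self, m, s):
        self.m, self.s = m, s
        self.counters = [0] * m

    _positions = BloomFilter._positions    # reuse the earlier hash helper

    def add(self, x):
        for p in self._positions(x):
            self.counters[p] += 1

    def delete(self, x):                   # the operation plain BFs lack
        for p in self._positions(x):
            self.counters[p] -= 1

    def to_bf(self):
        # Project down to a plain BF: bit i is set iff counter i > 0.
        bf = BloomFilter(self.m, self.s)
        bf.bits = [1 if c > 0 else 0 for c in self.counters]
        return bf

    def contract(self):
        # Halve the array by folding the top half onto the bottom half.
        # Because positions are taken mod m and the new size divides the
        # old one, membership is preserved; this is why shrinking a CBF
        # is easy while growing a BF back would need the original keys.
        half = self.m // 2
        folded = CountingBloomFilter(half, self.s)
        for i in range(half):
            folded.counters[i] = self.counters[i] + self.counters[i + half]
        return folded


cbf = CountingBloomFilter(m=16384, s=8)    # large CBF in slow memory
cbf.add("00:11:22:33:44:55")
cbf.delete("00:11:22:33:44:55")            # route withdrawn
small_bf = cbf.contract().to_bf()          # small BF for fast memory
```

As on the evaluation slide, reconstruction like this can run on another core while the old BFs keep serving lookups, with the new BFs swapped in afterwards.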