Authors : Baohua Yang, Jeffrey Fong, Weirong Jiang, Yibo Xue, and Jun Li. Publisher : IEEE TRANSACTIONS ON COMPUTERS - 2012 Presenter : Chai-Yi Chu Date : 2012/11/14 1 Introduction Related Work Algorithm Evaluation 2 Dynamic Discrete Bit Selection ◦ 𝐷 2 𝐵𝑆 ◦ Packet classification algorithm 64-byte Ethernet packet and 10K ACL ruleset ◦ 10 Gbps on Cavium OCTEON CN5860 multi-core network processor ◦ 135 Gbps on Xilinx Virtex-5 FPGA 3 Two major types of packet classification algorithm ◦ Searching space partition partition the searching space into smaller subspaces Ex. RFC, HSM 4 ◦ Ruleset partition cut the large ruleset into smaller ones Ex. HiCuts, HyperCuts 5 Utilize dynamic heuristics to split the ruleset efficiently Combine different data structures to optimize both time and space performance 6 Terminology ◦ E-Bits some bits will partition the ruleset more “effectively”. To partition ruleset into smaller sub-rulesets. ◦ M-Vector Definition 3.1 (M-Vector): A M-Vector V is a bit vector that satisfies: V[i] = 1 only if bit i is an E-Bit, otherwise V[i] = 0. 0 1 0 1 7 D-Table ◦ D-Table S-Block Dynamic Indexing Table T. For n bits, T consists of 2𝑛 cells, where each cell stores a pointer respectively. ◦ S-Block memory blocks that are pointed by T’s cell. Each S-Block stores a subset of the ruleset R. Rules are stored orderly from high to low by priority. 8 Preparation Phase ◦ 1. E-Bits selection ◦ 2. M-Vector generation ◦ 3. D-Table construction 9 1. E-Bits selection ◦ Two problems E-Bits choosing Length optimization ◦ Solutions Judging Function(J-Function) Performance Function(P-Function) 10 J-Function ◦ To judge which bits are E-Bits minimizing the maximum size of all subsets 𝑒 = E-Bits set 𝑁𝑢𝑚𝑅𝑢𝑙𝑒 𝑅𝑖 = 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑟𝑢𝑙𝑒𝑠 𝑖𝑛 𝑅𝑖 minimizing the size of sub searching space 𝑁𝑢𝑚𝑆𝑢𝑏𝑠𝑝𝑎𝑐𝑒 𝑅𝑖 = 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑠𝑢𝑏𝑠𝑝𝑎𝑐𝑒𝑠 𝑔𝑒𝑛𝑒𝑟𝑎𝑡𝑒𝑑 𝑏𝑦 𝑅𝑖 11 P-Function ◦ To decide the number of E-Bits. 𝑖. 𝑒. , to optimize the length of 𝑛. 𝑀𝑒𝑚 𝑉 = 𝑚𝑒𝑚𝑜𝑟𝑦 𝑢𝑠𝑒𝑑 𝑏𝑦 M-Vector 12 2. M-Vector generation ◦ Built with the process of E-Bits choosing and length optimization Fast-Growth Intelli-Swap(I-Swap) 13 3. D-Table construction ◦ Set up T with length 2𝑛 , each cell T[i] stores a pointer to an empty S-Block. ◦ Use V to mask r at the E-Bits positions, which results in a bit-string i of length n. ◦ Insert r into the bottom of the S-Block pointed by T[i]. 14 15 Classification Phase H V T 16 Complexity Analysis ◦ ruleset R consists of N rules, and n E-Bits are selected. 2 3 ◦ Time complexity is 𝑂(( )𝑛 ∗ 𝑁) ◦ Storage complexity is 4 𝑛 𝑂(( ) 3 ∗ 𝑁) 17 Suppose the length of the D-Table is 2𝑛 . The length of the largest S-Blocks is L. ◦ Time complexity is 𝑂(𝐿), ◦ Storage complexity is 𝑂(2𝑛 ∗ 𝐿). The size ratio between the subsets and the original ruleset is ρ. Then the size of the partitioned subsets will be ρ · N. Thus with n E-Bits, ◦ 𝑂(𝐿) = 𝑂(ρ𝑛 · 𝑁). 18 IA based Implementation 19 ◦ Memory storage usage 20 ◦ Scalability The ratio of Memory Per Rule (MPR) 21 Multi-core NP Implementation ◦ Cavium OCTEON CN5860, ACL10K 22 FPGA based Implementation ◦ Xilinx Virte-5 FPGA(XC5VSX240T), 2048 Kb of Block RAM 23 24