A Study on d-left Counting Bloom Filter for Dynamic Packets Filtering

Peizhen Lin, Feng Wang*, Weiliang Tan, Hui Deng
Yunnan Computer Technology Application Key Lab
Kunming University of Science and Technology
Kunming, China, 650051
lpz@cnlab.net, wf@cnlab.net, twl@cnlab.net, dh@kmust.edu.cn

Abstract

In previous studies of dynamic packet filtering, the list-based Counting Bloom Filter (CBF) algorithm was adopted to add and delete filtering rules dynamically. However, list-based CBF has obvious drawbacks: low memory utilization, limited rule capacity and a high false positive rate (FP rate). Its time efficiency also needs improvement. In this paper, the d-left Counting Bloom Filter (d-left CBF) algorithm is used to improve the performance of dynamic packet filtering. Compared with list-based CBF, it saves memory space by a factor of approximately 55 or more. In addition, as the memory allocation increases, the false positive rate of d-left CBF decreases more significantly than that of list-based CBF. Compared with list-based CBF, the improved d-left CBF enhances time efficiency by 16.61% to 39.55% when querying data that matches the filter rules, with the number of existing rules ranging from 1000 to 40000, and by 0.91% to 18.12% when querying data that does not match the filter rules, with the number of existing rules ranging from 1000 to 35000. The experimental results and corresponding analyses indicate that d-left CBF is feasible and highly efficient for dynamic packet filtering.

Key Words: dynamic packet filter; d-left Counting Bloom Filter; false positive rate

1. Introduction

Network packet capturing techniques have been widely used in network applications, e.g., protocol analysis, firewalls and intrusion detection systems.
With the advent of high-speed (more than 1 Gbit/s) networks, capturing performance cannot meet the requirements of real-time data processing because of high CPU utilization and high packet loss. To address these problems, many prior studies [1-7] have discussed hardware-based and software-based approaches in recent years. Modern applications such as VoIP (Voice over IP) and P2P traffic monitoring require dynamic packet filtering based on simple VLAN/IP address/port number criteria. Popular packet filtering facilities such as BPF (Berkeley Packet Filter) [6] and router-based ASIC (Application Specific Integrated Circuit) filtering are not adequate for these applications. Depending on its purpose, each application needs only part of the network stream. If non-essential packets pass through the kernel, they consume CPU processing time unnecessarily and inevitably degrade the performance of the whole system.

Luca Deri [7] proposed a pure software approach that adopted list-based CBF [3, 7, 8] to implement dynamic packet filtering. Although list-based CBF solves the problem of adding and deleting elements and is superior to the traditional static filtering mechanism, its drawbacks, i.e., low memory utilization, limited rule capacity, high FP rate and limited time efficiency, still need to be addressed.

The d-left CBF algorithm was devised by Flavio Bonomi [9] and is a pure software approach. The aim of this study is to apply it to dynamic packet filtering. Compared with list-based CBF, d-left CBF increases memory utilization and decreases the FP rate dramatically, and the improved d-left CBF also increases time efficiency.

2. Related Work

2.1 Bloom Filters and Counting Bloom Filters

A Bloom filter [10] uses a bit array to determine whether an element belongs to a given set. When an element is added, each of the k hashing functions is applied to the key x, yielding k locations P1, P2, …, Pk.
Then all the bits that correspond to these locations are set to one. Checking whether an element belongs to the set is simple. The k hashing functions are first applied to the element. If all the bits that correspond to the results of the hashing functions are set to one, the element is judged to belong to the set. This judgment may be wrong, because the element may not actually belong to the set. We call this a “false positive”.

Bloom filters support adding elements but cannot delete elements from a set. Counting Bloom filters [7] solve this problem by replacing each bit with a counter. The counter is incremented by one when an element is added and decremented by one when an element is removed.

2.2 List-based CBF

Although CBF seems to have several excellent features and few limitations, it is rather costly when implemented in software. First, the k hashing functions can be computed quickly in parallel hardware, but not in software: if the value of k is large, the efficiency of filtering declines. Furthermore, in order to add and delete elements, we have to use CBF. Unfortunately, CBF takes up too much memory.

Fig. 1. List-based CBF.

Luca Deri addresses these problems as follows. First, let k=2, so that in the worst case only two hashing functions are applied when checking whether a packet should be dropped. Second, CBF is necessary, but its drawbacks should be avoided. Deri builds a hashing list as shown in Fig. 1. It not only removes hashing conflicts, but also resolves the problems of adding, deleting and large memory allocation.

2.3 d-left CBF

d-left CBF derives from d-left hashing [11]. The d-left CBF uses d hashing functions. The entire hashing table is divided into d adjacent sub-tables from left to right. Assuming that there are n buckets in total, each sub-table contains n/d buckets, and each bucket contains m cells for storing fingerprints.
Each cell owns a counter in case two identical fingerprints exist. When a new element is added, each hashing function is applied to it. This gives d positions, and the element is added to the bucket with the lightest load. If more than one bucket has the lightest load, the element is added to the leftmost one. Similarly, a query needs to search d different positions. A deletion searches for the hash value in the d corresponding locations. If the counter is not zero, one is subtracted from the counter; otherwise the fingerprint is deleted.

3. A Comparison between List-based CBF and d-left CBF

3.1 False Positive Rate and Memory Utilization Comparison

Equation (3.1) gives the FP rate of list-based CBF:

$$P = \left(1 - \left(1 - \frac{1}{m}\right)^{kn}\right)^{k} \approx \left(1 - e^{-kn/m}\right)^{k} \tag{3.1}$$

In this formula, m is the length of the array, k is the number of hashing functions (k=2 in Luca's method) and n is the number of elements in the set.

In d-left CBF, the number of sub-tables is 4 (d=4). The average load of each bucket is 6, r bits are used for the fingerprint and 2 bits are used for the counter. So the FP rate is $24 \times 2^{-r}$. If there are n elements waiting to be inserted, $m = \frac{4}{3} n (r+2)$ [9]. Based on the above analyses, the FP rate of list-based CBF depends only on the values of m and n, and has nothing to do with r. The FP rate of d-left CBF depends on the value of r.

Fig. 2 shows the FP rates of list-based CBF and d-left CBF when both are allocated the same memory space. At first, the FP rate of list-based CBF is lower than that of d-left CBF. However, the curve of list-based CBF is much flatter than that of d-left CBF. When the memory allocation exceeds 2.16 KB, the FP rate of list-based CBF is higher than that of d-left CBF. The FP rate of d-left CBF decreases dramatically as the memory allocation increases, and it is close to zero when the memory allocation reaches 3.4 KB.

Fig. 2. The contrast of false positive rate.

Supposing m and p are constant, we can derive n (i.e.
n1) from formula (3.1) for list-based CBF:

$$n_1 = -\frac{m}{2} \ln\left(1 - 10^{\frac{\log_{10}(p)}{2}}\right) \tag{3.2}$$

The n of d-left CBF is n2:

$$n_2 = \frac{3m}{4\left(2 - \log_2\left(\frac{p}{24}\right)\right)} \tag{3.3}$$

The ratio n2/n1 is:

$$\frac{n_2}{n_1} = -\frac{3}{2 \ln\left(1 - 10^{\frac{\log_{10}(p)}{2}}\right)\left(2 - \log_2\left(\frac{p}{24}\right)\right)} \tag{3.4}$$

Fig. 3. The rule capacity ratio of d-left CBF to list-based CBF.

The rule capacity of d-left CBF is approximately 55 times larger than that of list-based CBF under the same memory allocation (when p=0.000001 and m is constant, n1 ≈ 5.0×10^-4 m and n2 ≈ 2.8×10^-2 m, so n2/n1 = 56). Fig. 3 indicates that the lower the FP rate, the larger n2/n1 is. This means that d-left CBF can hold more rules when the FP rate and the memory allocation are constant, and this advantage becomes more and more obvious as the FP rate decreases. These analyses also lead to the conclusion that d-left CBF is more suitable for dynamic packet filtering.

3.2 Time Efficiency Comparison

If only the execution time of the hashing functions is considered, d-left CBF is faster than list-based CBF, because d-left CBF needs to hash only once while list-based CBF needs to hash twice. In practice, d-left CBF is faster when the queried data matches the filter rules, and list-based CBF is faster when the queried data does not match the filter rules. The reason is that when the data matches the filter rules, list-based CBF always needs two hashing operations while d-left CBF needs one. When the data does not match the filter rules, the query function of list-based CBF returns directly as soon as the result of the first hashing function fails to match. The situation is quite different for d-left CBF: it always hashes once, but it must then check four sub-tables for an identical fingerprint, which is time-consuming.

[Figure: query time (s) versus rule quantity (1000 to 40000) for list-based CBF and d-left CBF]
Fig. 4.
Query time contrast of list-based CBF and d-left CBF (49152 items match the filter rules).

[Figure: query time (s) versus rule quantity (1000 to 40000) for list-based CBF and d-left CBF]
Fig. 5. Query time contrast of list-based CBF and d-left CBF (49152 items don’t match the filter rules).

In order to obtain a fair experimental result, we make sure that the two algorithms have the same maximum rule capacity of 65536. In line with the theoretical analysis, the data is divided into two parts. One part has 49152 items which all match the filter rules, and the other part has 49152 items which do not match the filter rules. The experimental results show that if the data matches the filter rules, d-left CBF has an obvious advantage in query execution time. This advantage holds for rule quantities from 1 up to some point between 30000 and 32500 (as shown in Fig. 4). As the number of rules increases, the number of fingerprint matches increases correspondingly. If the data does not match the filter rules, list-based CBF is always faster than d-left CBF (as shown in Fig. 5).

4. Improvements of d-left CBF

4.1 Decreasing the Hashing Time

The execution of the hashing function takes up too much time in the whole searching process. It is well known that multiplication is slower than addition on a CPU. The d-left CBF source code performs hashing with many multiplications. Here we turn the whole hashing process into a summation process instead of using multiplications.

The experiment performs 500 thousand hashing operations to compare list-based CBF and d-left CBF using the new hashing function. With the new hashing function, which uses summations instead of multiplications, the time consumption of d-left CBF is about 0.00297971 second, while the time consumption of d-left CBF with a hashing function based on multiplications is about 0.004067108 second. This saves almost 26.7% of the time consumption.
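As a rough illustration of how a multiplication in a hashing loop can be replaced by a shift and additions, consider the minimal Python sketch below. It is not the hashing function used in the experiment above, which is not listed here: the function names, the multiplier 33 and the 32-bit mask are all assumptions chosen for illustration only.

```python
MASK32 = 0xFFFFFFFF  # keep values within 32 bits, as a C unsigned int would


def hash_mul(key: bytes) -> int:
    """Hypothetical baseline: rolling hash using h = h * 33 + byte."""
    h = 0
    for b in key:
        h = (h * 33 + b) & MASK32
    return h


def hash_add(key: bytes) -> int:
    """Same hash value, with the multiplication replaced by a shift and additions."""
    h = 0
    for b in key:
        # h * 33 == (h << 5) + h, so no multiplication is needed
        h = ((h << 5) + h + b) & MASK32
    return h
```

Both functions return the same value for any key, so in this sketch the rewrite changes only the cost of each step, not the distribution of the hash values. Whether the saving is measurable depends on the CPU and the compiler.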
4.2 Decreasing the Match Times

Fingerprint matching also takes up a lot of time in the whole searching process. In Flavio Bonomi's paper [9], d-left CBF is proposed to solve the problem that the counters of CBF take up too much space, and it also addresses the load balance issue. According to the description of d-left CBF, a four sub-table model is adopted to make sure that the load of the whole table is stable and suitable for querying. If d=1, the structure is no longer a d-left CBF, because this defeats the whole idea of the original algorithm. If d=8, there are too many sub-tables. Although the load balance is better in this case, the query process spends a lot of time matching fingerprints. For example, with a bucket height of 9, it needs eight additional linear random permutations and, what is worse, seventy-two match operations in the worst case. In Flavio's paper, d-left CBF uses d=4.

In dynamic packet filtering, time efficiency is the most important factor. In this paper, we adopt a new sub-table model which uses two sub-tables to decrease the number of matches. The experiment compares the time efficiency of two kinds of table structures, 4096×2×8 and 2048×4×8. In this experiment, each bucket has eight cells, and 4096 and 2048 are the numbers of buckets that each sub-table contains. The bucket number is the high 12 bits of the hashing value in the two sub-table model and the high 11 bits of the hashing value in the four sub-table model. The fingerprint is always the low 14 bits of the hashing value.

Fig. 6. Inserting, removing and querying time contrast of two table models.

Rule inserting and removing operations are each performed 49152 times. Querying is performed 98304 times, half for items which match the filter rules and half for items which do not match the filter rules.
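The two sub-table layout described above (4096 buckets per sub-table, eight cells per bucket, a 12-bit bucket number and a 14-bit fingerprint) can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the code used in the experiment: the 26-bit hash width, the use of `hashlib.blake2b` as the base hash, the odd multipliers used to derive one permuted value per sub-table, and all class and function names are hypothetical.

```python
import hashlib

SUB_TABLES = 2                       # d, the two sub-table model
BUCKETS = 4096                       # buckets per sub-table (12-bit bucket number)
CELLS = 8                            # cells per bucket
FP_BITS = 14                         # fingerprint width in bits
HASH_BITS = 12 + FP_BITS             # assumed width of the hashing value
MULTIPLIERS = (0x2545F49, 0x19660D)  # hypothetical odd constants, one per sub-table


def base_hash(key: bytes) -> int:
    """26-bit hash of the key (blake2b is a stand-in, not the paper's hash)."""
    digest = hashlib.blake2b(key, digest_size=4).digest()
    return int.from_bytes(digest, "big") >> (32 - HASH_BITS)


class DLeftCBF:
    def __init__(self):
        # tables[i][b] maps fingerprint -> counter for the cells in use
        self.tables = [[{} for _ in range(BUCKETS)] for _ in range(SUB_TABLES)]

    def _slots(self, key: bytes):
        """Yield (bucket, fingerprint) for each sub-table, left to right."""
        h = base_hash(key)
        for i, a in enumerate(MULTIPLIERS):
            p = (a * h) & ((1 << HASH_BITS) - 1)  # odd multiplier: a permutation
            yield self.tables[i][p >> FP_BITS], p & ((1 << FP_BITS) - 1)

    def insert(self, key: bytes) -> None:
        slots = list(self._slots(key))
        for bucket, fp in slots:
            if fp in bucket:             # identical fingerprint: bump its counter
                bucket[fp] += 1
                return
        # lightest bucket wins; min() keeps the leftmost one on ties
        bucket, fp = min(slots, key=lambda s: len(s[0]))
        if len(bucket) >= CELLS:
            raise OverflowError("all candidate buckets are full")
        bucket[fp] = 1

    def query(self, key: bytes) -> bool:
        return any(fp in bucket for bucket, fp in self._slots(key))

    def remove(self, key: bytes) -> None:
        for bucket, fp in self._slots(key):
            if fp in bucket:
                bucket[fp] -= 1
                if bucket[fp] == 0:      # last copy gone: free the cell
                    del bucket[fp]
                return
```

In this sketch a query inspects at most 2×8 cells, against 4×8 for a four sub-table variant with the same bucket height, which is the reduction in match operations that this section aims for. The counter here stores the number of copies of a fingerprint, so the cell is freed when it reaches zero.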
With the two sub-table structure, the rule inserting time is about 0.01379124s, while with the four sub-table model it is about 0.01973846s. The removing time is 0.01346165s for the two sub-table model and 0.01678702s for the four sub-table model. The querying time is 0.02543709s for the two sub-table model and 0.03648067s for the four sub-table model. The experimental results show that the two sub-table model spends less time than the four sub-table model, as shown in Fig. 6.

As shown in Fig. 7, we also ran a load balance test on these two table models. We use five different random data sets, each with 49152 items, and insert each data set into both table structures to observe the load balance. The experimental results show that the four sub-table model is more balanced than the two sub-table model. For the two sub-table model, 524 buckets have a load of 8, i.e., about 105 per data set on average, which corresponds to an overflow rate of 1.3%. However, there is no need to worry about overflow, because the rule quantity is limited in dynamic packet filtering and 5000 rules is already a large number.

Fig. 7. The load contrast test of the two kinds of table models.

4.3 A Querying Time Efficiency Comparison between List-based CBF and Improved d-left CBF

After improving d-left CBF, we compare list-based CBF with the improved d-left CBF. The experiment performs 49152 queries to observe the difference in time efficiency between the two algorithms. In order to obtain a fair experimental result, we make sure that both algorithms have a maximum rule capacity of 65536. The data set to be checked is divided into two parts. One part has 49152 items which all match the filter rules, and the other part has 49152 items which do not match the filter rules. The experimental results show that when the rule quantity is below 35000, the improved d-left CBF is faster than list-based CBF when querying data which matches the filter rules.
When querying data which does not match the rules, the improved d-left CBF always runs faster than list-based CBF. Fig. 8 shows the detailed improvement as a percentage chart.

[Figure: percentage of time improvement (0% to 45%) versus rule quantity (1000 to 35000), for querying data which match the filter rules and querying data which doesn't match the filter rules]
Fig. 8. Percentage improvement of the improved d-left CBF compared with list-based CBF (querying 49152 items).

This chart indicates that, compared with list-based CBF, the improved d-left CBF enhances time efficiency by 16.61% to 39.55% when querying data which matches the filter rules, with the number of existing rules ranging from 1000 to 40000, and by 0.91% to 18.12% when querying data which does not match the filter rules, with the number of existing rules ranging from 1000 to 35000.

5. Conclusions

In this paper, we adopt the d-left CBF algorithm for dynamic packet filtering. The experimental results and analyses show that d-left CBF is feasible and highly efficient for dynamic packet filtering. It saves memory by a factor of approximately 56 or more compared with CBF. Compared with CBF, it also decreases the FP rate more significantly. In order to improve the time efficiency of d-left CBF, this paper decreases the hashing time with a new hashing function which uses summations instead of multiplications, and decreases the number of matches with a two sub-table model.

References

[1] Begel, A., McCanne, S. and Graham, S.L., BPF+: exploiting global data-flow optimization in a generalized packet filter architecture. SIGCOMM Comput. Commun. Rev., 29 (1999), 123-134.
[2] Cho, Y.H. and Mangione-Smith, W.H., Deep network packet filter design for reconfigurable devices. ACM Trans. Embed. Comput. Syst., 7 (2008), 1-26.
[3] Dharmapurikar, S., Krishnamurthy, P., Sproull, T.S. and Lockwood, J.W., Deep Packet Inspection using Parallel Bloom Filters. IEEE Micro, 24 (2004), 52-61.
[4] Engler, D.R. and Kaashoek, M.F., DPF: fast, flexible message demultiplexing using dynamic code generation. SIGCOMM Comput. Commun. Rev., 26 (1996), 53-59.
[5] Bos, H., de Bruijn, W., Cristea, M., Nguyen, T. and Portokalidis, G., FFPF: fairly fast packet filters. Proceedings of the 6th conference on Symposium on Operating Systems Design & Implementation, USENIX Association, 6 (2004), 24.
[6] McCanne, S. and Jacobson, V., The BSD packet filter: a new architecture for user-level packet capture. Proceedings of the USENIX Winter 1993 Conference, USENIX Association, 1993, 2.
[7] Deri, L., High-Speed Dynamic Packet Filtering. J. Netw. Syst. Manage., 15 (2007), 401-415.
[8] Fan, L., Cao, P., Almeida, J. and Broder, A.Z., Summary cache: a scalable wide-area web cache sharing protocol. IEEE/ACM Trans. Netw., 8 (2000), 281-293.
[9] Bonomi, F., Mitzenmacher, M., Panigrahy, R., Singh, S. and Varghese, G., An improved construction for counting bloom filters. Proceedings of the 14th conference on Annual European Symposium, Springer-Verlag, 14 (2006), 684-695.
[10] Bloom, B.H., Space/time trade-offs in hash coding with allowable errors. Communications of the ACM, 13 (1970), 422-426.
[11] Vöcking, B., How asymmetry helps load balancing. J. ACM, 50 (2003), 568-589.

*Corresponding author: Feng Wang
Yunnan Computer Technology Application Key Lab
Kunming University of Science and Technology, Kunming, Yunnan, China, 650051
E-mail: wf@kmust.edu.cn