Advanced Hash Algorithms with Key Bits Duplication for IP Address Lookup
Author: Christopher Martinez and Wei-Ming Lin
Publisher: 2009 Fifth International Conference on Networking and Services
Presenter: Han-Chen Chen
Date: 2009/10/7

Outline
  Introduction
  Some Non-Duplication XOR Folding Algorithms
  Bit-Duplication XOR Hashing Approaches
  Minimal IDC Duplication
  Performance

Introduction
  Preprocess and improve XOR hashing algorithms
  Reduce MSL (Maximum Search Length) and ASL (Average Search Length)
  [Figure: data passes through the hash function, which produces m bits and selects one of 2^m bucket entries; colliding data items share a bucket]

MSL & ASL
  [Figure: data items pass through the hash function into buckets 0 to 7]
  MSL = 3
  ASL = (2+1+3+1+1)/6

XOR Hashing (Group XOR)
  Random XORing process (m = 3)

  DB Entry   Bit Position
             7 6 5 4 3 2 1 0
  #0         1 0 1 0 1 0 1 0
  #1         0 0 1 0 0 1 1 1
  #2         0 1 1 0 0 1 0 0
  #3         0 0 1 0 0 1 0 1
  #4         0 0 1 0 0 1 0 1
  #5         1 0 1 1 0 1 0 1
  #6         0 1 0 1 1 1 0 0
  #7         0 1 1 0 0 0 1 1

  [Figure: bit positions 7 to 0 are XORed (⊕) into the three hash bits A, B, C]

Non-Duplication XOR Hashing (In-order XOR Hashing)

  DB Entry   Bit Position
             7 6 5 4 3 2 1 0
  #0         1 0 1 0 1 0 1 0
  #1         0 0 1 0 0 1 1 1
  #2         0 1 1 0 0 1 0 0
  #3         0 0 1 0 0 1 0 1
  #4         0 0 1 0 0 1 0 1
  #5         1 0 1 1 0 1 0 1
  #6         0 1 0 1 1 1 0 0
  #7         0 1 1 0 0 0 1 1
  d =        4 2 6 4 4 4 2 2

  Sort by d value:
  Bit Position   6 1 0 7 4 3 2 5
  d value        2 2 2 4 4 4 4 6

  [Figure: the sorted bit positions are XORed (⊕) into the hash bits A, B, C]

Three Simple Bit-Duplication XOR Hashing Approaches
  Self-Duplication
  Exchange-Duplication
  Cycle-Duplication

Self-Duplication
  Bucket size: m = 4
  [Figure: bit rows A B C D and W X Y Z, each shown with its duplicate, ordered by d value from small to big; figure labels: A A A A]
  Deficiency: Nullification (A ⊕ A = 0)

Exchange-Duplication
  Bucket size: m = 4
  [Figure: bit rows A B C D and W X Y Z, each shown with its duplicate, ordered by d value from small to big; figure labels: B A A A]
  Deficiency: Downgrade (A ⊕ B = B ⊕ A)

Cycle-Duplication
  Bucket size: m = 4
  [Figure: bit rows A B C D and W X Y Z, each shown with its duplicate, ordered by d value from small to big; figure labels: D A B C]

MSL & ASL on Randomly Generated Data Sets
  [Plots: Maximum Search Length versus m; Average Search Length versus m]
  Self-duplication gives a reduction of 15% in both MSL and ASL.
  Cycle-duplication gives a reduction of 50% in MSL and 27% in ASL.

Minimal IDC Duplication (1/4)
  IDC: Induced Duplication Correlation
  Bucket size: m = 4
  Duplication times: X = 2

             bit0 bit1 bit2 bit3    bit0 bit1 bit2 bit3    bit0 bit1 bit2 bit3
  Original    A    B    C    D       A    B    C    D       A    B    C    D
  Copy 1      D    A    B    C       D    A    B    C       B    C    D    A
  Copy 2      C    D    A    B       B    C    D    A       D    A    B    C

  Given m, how many times can the key bits be duplicated without causing the downgrading problem?
  Or: to duplicate X times without the downgrading problem, what is the minimal m required?

Minimal IDC Duplication (2/4)
  m = 7, X = 2

  bit0 bit1 bit2 bit3 bit4 bit5 bit6
   A    B    C    D    E    F    G
   G    A    B    C    D    E    F
   E    F    G    A    B    C    D

  [Figure labels: X1, X2, X3, X4, X5, X6 around A]
  Proof: m ≥ (X + 1)^2 - (X + 1) + 1 = X^2 + X + 1
         m ≥ X * (X + 1) + 1

Minimal IDC Duplication (3/4)
  m = 7, X = 2

  bit0 bit1 bit2 bit3 bit4 bit5 bit6
   A    B    C    D    E    F    G
   G    A    B    C    D    E    F
   F    G    A    B    C    D    E

  D_ij = min( (s_i - s_j) mod m , (s_j - s_i) mod m )
  D_ij ≠ D_kl, ∀ i, j, k, l, 0 ≤ i, j, k, l ≤ m, and (i, j) ≠ (k, l).

Minimal IDC Duplication (4/4)
  m = 13, X = 3
  13 ≥ 3 * (3 + 1) + 1, ok!
  D_01 = min((0-1) mod 13, (1-0) mod 13) = 1
  D_13 = min((1-3) mod 13, (3-1) mod 13) = 2
  D_03 = min((0-3) mod 13, (3-0) mod 13) = 3
  D_09 = min((0-9) mod 13, (9-0) mod 13) = 4
  D_19 = min((1-9) mod 13, (9-1) mod 13) = 5
  D_39 = min((3-9) mod 13, (9-3) mod 13) = 6

Performance
  [Plots: MSL & ASL on Randomly Generated Data Sets; search length versus m]
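
The d values and the sorted bit order on the In-order XOR Hashing slide can be reproduced with a short sketch. The slides do not define d; the code assumes d is the absolute difference between the number of ones and zeros in a bit column, which matches the numbers shown. The names COLUMNS and d_value are illustrative only.

```python
# Bit columns copied from the slide: each string lists DB entries #0..#7
# for one bit position.
COLUMNS = {
    7: "10000100",
    6: "00100011",
    5: "11111101",
    4: "00000110",
    3: "10000010",
    2: "01111110",
    1: "11000001",
    0: "01011101",
}


def d_value(column):
    """Assumed definition: |number of ones - number of zeros| in the column."""
    ones = column.count("1")
    zeros = len(column) - ones
    return abs(ones - zeros)


# Walk the bit positions from 7 down to 0, as on the slide.
positions = sorted(COLUMNS, reverse=True)
d = {p: d_value(COLUMNS[p]) for p in positions}

# Python's sort is stable, so ties keep the 7-to-0 order.
order = sorted(positions, key=lambda p: d[p])

print("d (bits 7..0):", [d[p] for p in positions])
print("sorted bit positions:", order)
```

Under that assumption a small d means a column that is close to evenly split between ones and zeros, so sorting by d puts the most balanced bit positions first.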
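
As a rough illustration of how MSL and ASL depend on the hash function, the sketch below folds the eight 8-bit entries from the XOR Hashing slide into m = 3 bits and measures the resulting buckets. Two things are assumptions rather than the authors' method: the fold XORs plain contiguous m-bit chunks (not the random grouping on the slide), and ASL is taken as the mean length over non-empty buckets. The names xor_fold and search_lengths are hypothetical.

```python
from collections import Counter


def xor_fold(key, key_bits, m):
    """Fold a key_bits-wide key into m bits by XORing contiguous m-bit chunks."""
    mask = (1 << m) - 1
    h = 0
    for shift in range(0, key_bits, m):
        h ^= (key >> shift) & mask
    return h


def search_lengths(keys, key_bits, m):
    """Return (MSL, ASL) for the buckets produced by xor_fold.

    MSL is the longest bucket; ASL is assumed here to be the mean
    length over non-empty buckets.
    """
    counts = Counter(xor_fold(k, key_bits, m) for k in keys)
    msl = max(counts.values())
    asl = sum(counts.values()) / len(counts)
    return msl, asl


# DB entries #0..#7 read row-wise from the slide's table (bit 7 first).
KEYS = [
    0b10101010, 0b00100111, 0b01100100, 0b00100101,
    0b00100101, 0b10110101, 0b01011100, 0b01100011,
]

msl, asl = search_lengths(KEYS, key_bits=8, m=3)
print("MSL =", msl, "ASL =", asl)
```

Swapping in a different bit grouping inside xor_fold is enough to compare hashing schemes on the same key set.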
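
The bound and the distance condition from the Minimal IDC Duplication slides can be checked numerically. This is a minimal sketch that assumes each s_i is the cyclic shift applied to one copy of the key bits (0 for the original) and that the downgrade problem is avoided when all pairwise distances D_ij differ. The function names are illustrative.

```python
from itertools import combinations


def min_bucket_bits(x):
    """Smallest m allowed by the bound m >= X * (X + 1) + 1."""
    return x * (x + 1) + 1


def cyclic_distances(shifts, m):
    """D_ij = min((s_i - s_j) mod m, (s_j - s_i) mod m) for every pair."""
    return {
        (si, sj): min((si - sj) % m, (sj - si) % m)
        for si, sj in combinations(shifts, 2)
    }


def all_distances_distinct(shifts, m):
    """True when no two pairs of shifts share the same cyclic distance."""
    values = list(cyclic_distances(shifts, m).values())
    return len(values) == len(set(values))


# The m = 13, X = 3 example: shifts 0, 1, 3 and 9.
m, shifts = 13, [0, 1, 3, 9]
print(min_bucket_bits(3) <= m)
print(cyclic_distances(shifts, m))
print(all_distances_distinct(shifts, m))
```

For the example shifts the six distances come out as 1 through 6, one per pair, which is exactly the number of distinct values available when m = 13.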