Variations of Forward-SBNDM
Hannu Peltola
Jorma Tarhio
Aalto University
Finland

Aims
Tuning algorithms for exact string matching.
Studying the effect of simultaneous 2-byte read.
Aug. 29, 2011

SBNDM
Simple Backward Nondeterministic DAWG Matching
SBNDM [18] is a simplification of BNDM [17]. Both are bit-parallel algorithms.
Text T = t_1...t_n, pattern P = p_1...p_m.
At each alignment window of P in T, scan T from right to left until the suffix of the window is not a factor of P or an occurrence of P is found.

Shift of SBNDM
No factor: m
P found: 1
Else: next alignment starts at the last factor

SBNDM, example
P = banana, T = antanabadbanana...
alignment:       [antana]badbanana    suffixes a, na, ana are factors of P
not a factor:    tana
next alignment:  ant[anabad]banana
not a factor:    d
next alignment:  antanabad[banana]

SBNDMq
SBNDMq [6] is a tuned version of SBNDM.
Processing of an alignment starts with checking a q-gram.
Let q = 4. Consider an alignment at antana. Instead of testing four suffixes a, na, ana, tana, only tana is tested.
Testing is done in a fast loop.

Forward-SBNDM
Forward-SBNDM (FSB for short) by Faro & Lecroq [7] is a lookahead version of SBNDM2.
Both FSB and SBNDM2 read a 2-gram x1x2 before a factor test.
x1x2 is matched with the end of P in SBNDM2.
Only x1 is matched with the end of P in FSB, and x2 is a lookahead character following the current alignment.
FSB is faster than SBNDM2 for large alphabets.

Generalization of FSB: FSB(q,f)
FSB(q,f) (= Forward-SBNDM(q,f)) is SBNDMq with f lookahead characters, f = 0, 1, ..., q-1.
FSB(2,1) = FSB and FSB(q,0) = SBNDMq.
Motivation: SBNDMq works well on modern processors also for q > 2.

FSB(q,f)
Let UV be a q-gram, where |V| = f. After reading UV there are 3 alternatives:
i. If U is a suffix of P, reading continues leftwards.
ii. Else if UV is a factor of P, reading continues leftwards.
iii. Else the state vector is zero and P is shifted m-q+f+1 positions (f positions more than in SBNDMq).

Occurrence vectors in FSB(q,2)
Example: P = banana
SBNDMq:
           banana
  B[n] = 00001010
FSB(q,2):
  B[n] = 00101011
  B[a] = 01010111
  B[x] = 00000011
The two lowest bits of each FSB(q,2) vector are the extra bits.

State vectors in FSB(q,2) for q=4
4-gram nanx:
  x   00000011
  n   00101011
  a   01010111
  n   00101011
  state vector: 00001000
nanx is not a factor

4-gram   State vector   Conclusion
nanx     00001000       na is a suffix of P
xana     00000000       not a factor
anan     01000000       factor of P

Benefits / drawbacks of lookahead characters and extra bits
Benefits
• Longer shifts => more speed
• Combined suffix/factor test
Drawback
• More q-grams accepted => less speed

Greedy skip loop for SBNDM2 (GSB2 = Greedy-SBNDM2)
Factor tests of two 2-grams are done in one round.
Let B2[x,y] denote the combined occurrence vector of characters x and y.
B2[x,y] = B[x] & (B[y]<<1)
next: D ← B2[t_i, t_{i+1}]
      if D = 0 then
          if B2[t_{i+m-1}, t_{i+m}] = 0 then i ← i+2*m-2
          goto next

2-byte read
Read two characters (= 2 bytes = 16 bits) in one instruction (in a skip loop).
Well suited to q-gram algorithms with even q.
For experiments we made two versions of the algorithms:
• Standard (1-byte read)
• b-version using 2-byte read

2-byte read (cont.)
Advantage: a part of the computation can be moved to the preprocessing phase
• Example: B2[x,y] = B[x] & (B[y]<<1)
Speed-up factor even more than 2
Drawback: extra 0.1 ms for preprocessing.

4-byte read?
Many border crossings happen => slowdown
Tables with 2^32 entries are too big in practice

Experimental results/KJV Bible
In the recent comparison by S. Faro and T. Lecroq, "The Exact String Matching Problem: a Comprehensive Experimental Evaluation" (2010), the algorithms EBOM and Hash3 were the fastest on the Bible text for m = 4, ..., 20.

m        4      8      16
Hash3    14.6   5.42   2.79
EBOM     6.53   3.87   2.91

KJV: EBOM & Hash3 (on ThinkPad X61s)
[Chart: speed in GB/s (0 to 4) against pattern length m = 4, 8, 12, 16, 20. Series: EBOM, Hash3.]

KJV: EBOMb & Hash3b (with 2-byte read) added
[Chart: same axes. Series: EBOM, EBOMb, Hash3, Hash3b.]

KJV: SBNDM2b = FSB(2,0)b added
[Chart: same axes. Series: EBOM, EBOMb, Hash3, Hash3b, FSB(2,0)b.]

KJV: GSB2b added
[Chart: same axes. Series: EBOM, EBOMb, Hash3, Hash3b, FSB(2,0)b, GSB2b.]

KJV: FSB(4,i)b added, i = 0,1,2
[Chart: same axes. Series: EBOM, EBOMb, Hash3, Hash3b, FSB(2,0)b, GSB2b, FSB(4,0)b, FSB(4,1)b, FSB(4,2)b.]

KJV: Speed-up factors of 2-byte read
Algorithm   Speed-up
GSB2        1.32
FSB(2,0)    1.34
FSB(2,1)    1.24
FSB(4,0)    1.72
FSB(4,1)    2.15
FSB(4,2)    2.03
Hash3       1.05
EBOM        1.17

Other experiments
DNA and binary data were also tested.
• Gain of lookahead characters or the greedy loop was smaller than with the Bible data.
Gain of 2-byte read was smaller with 64-bit code than with 32-bit code.

Conclusions
Two new algorithms were presented:
• FSB(q,f)
• GSB2
The new algorithms are faster than earlier algorithms on English data:
• GSB2 for m = 4, ..., 8
• FSB(q,f) for m = 8, ..., 20
2-byte read makes most string algorithms faster.

Web site for practical speed comparison
cse.aalto.fi/stringmatching
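
Appendix sketch (not from the talk)
The SBNDM scan and shift rule described in the slides, and the combined occurrence vector B2[x,y] = B[x] & (B[y]<<1), can be illustrated with a small Python sketch. This is a minimal illustration under stated assumptions, not the authors' tuned C code: the function names occurrence_vectors, combined_vectors and sbndm_search are hypothetical, a dictionary stands in for the lookup tables, and Python integers stand in for machine words. It covers plain SBNDM only, without the q-gram, lookahead or 2-byte-read tuning.

```python
def occurrence_vectors(pattern):
    """B[c] has bit (m-1-i) set when pattern[i] == c (hypothetical helper)."""
    m = len(pattern)
    B = {}
    for i, c in enumerate(pattern):
        B[c] = B.get(c, 0) | (1 << (m - 1 - i))
    return B


def combined_vectors(B):
    """B2[x, y] = B[x] & (B[y] << 1) for every pair of pattern characters.

    A pair containing a character outside the pattern has vector 0, so it is
    simply absent from the dictionary.
    """
    return {(x, y): B[x] & (B[y] << 1) for x in B for y in B}


def sbndm_search(pattern, text):
    """Return (occurrences, alignments) as lists of 0-based text positions."""
    m, n = len(pattern), len(text)
    occurrences, alignments = [], []
    if m == 0 or m > n:
        return occurrences, alignments
    B = occurrence_vectors(pattern)
    s = 0                                  # start of the current window
    while s <= n - m:
        alignments.append(s)
        j = m - 1                          # scan the window right to left
        D = B.get(text[s + j], 0)
        while D != 0 and j > 0:
            j -= 1
            D = (D << 1) & B.get(text[s + j], 0)
        if D != 0:                         # all m characters read: P found
            occurrences.append(s)
            s += 1                         # shift 1 after an occurrence
        else:                              # suffix starting at j is no factor
            s += j + 1                     # next window starts at last factor
    return occurrences, alignments


print(sbndm_search("banana", "antanabadbanana"))  # ([9], [0, 3, 9])
```

The alignments 0, 3 and 9 are the windows antana, anabad and banana of the SBNDM example. In a b-version, a table like the one built by combined_vectors would be indexed directly by the 16-bit value read from the text; the dictionary keyed by character pairs is only a stand-in for that table.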