New method in information processing for maintaining an efficient dynamic ordered set

XIN ShiQing1 & WANG GuoJin1,2†

1 Institute of Computer Graphics and Image Processing, Zhejiang University, Hangzhou 310027, China;
2 State Key Laboratory of CAD & CG, Zhejiang University, Hangzhou 310027, China

Abstract: This paper investigates how to maintain an efficient dynamic ordered set of bit strings, an important problem in information search and information processing. Generally, a dynamic ordered set is required to support 5 essential operations: search, insertion, deletion, max-value retrieval and next-larger-value retrieval. Building on previous research, we present an advanced data structure named the rich binary tree (RBT), which follows both the binary-search-tree property and the digital-search-tree property. In addition, every key K keeps the most significant difference bit (MSDB) between itself and the next larger value among its ancestors, as well as that between itself and the next smaller value among its ancestors. With the new data structure, we can maintain a dynamic ordered set in O(L) time per operation, where L is the word length. Since computers represent objects in binary, our method has great potential in applications. In fact, the RBT can be viewed as a general-purpose data structure for problems concerning order, such as search, sorting and maintaining a priority queue. For example, when the RBT is applied to sorting, we get an algorithm whose running time is linear in the number of keys and whose performance is far better than that of quick-sort; moreover, unlike quick-sort, the RBT supports dynamic insertion and deletion whose cost is independent of the number of keys.

Keywords: information processing, dynamic ordered set, algorithms and data structures, rich binary tree

There are numerous circumstances where one needs to maintain a dynamic fully-ordered set[1] so that items, each with a key, can be handled according to their priorities[2,3].
Generally, an elaborate data structure is required to facilitate the frequent operations including 1) searching for a given key; 2) inserting a new key; 3) deleting a given key; 4) retrieving the next-larger key; and 5) returning the maximum key[4,5]. The dynamic ordered set problem thus covers such a wide range that search[6−11], sorting[1,6,12−17,20−23] and maintaining a priority queue[17−19,24] are only special sub-problems, which have mostly been studied separately[1,14].

In this paper, we devise a rich binary tree (RBT) for maintaining an efficient dynamic ordered set in O(L) time, where L is the word length of the keys. The RBT is required to follow both the binary-search-tree property and the digital-search-tree property. Furthermore, the RBT keeps the most significant difference bit (MSDB) between each key K and its direct parent key K̄, as well as that between K and its indirect parent key K̃, where the indirect parent is defined as the last cornered key on the root-to-leaf path to K itself. We will see that an RBT defined this way has nice properties. For example, we have K ∈ (min(K̄, K̃), max(K̄, K̃)), and these intervals form a sequence of nested intervals as K moves along any root-to-leaf path. Therefore, we can employ this data structure to solve problems about order, e.g., search, sorting and maintaining a priority queue. This can greatly raise the efficiency of information processing, and play an important role in computer science.

Received May 15, 2007; accepted August 20, 2008
doi: 10.1007/s11432-009-0074-0
† Corresponding author (email: wanggj@zju.edu.cn)
Supported by the National Natural Science Foundation of China (Grant No. 60873111), and the National Basic Research Program of China (Grant No. 2004CB719400)
Citation: Xin S Q, Wang G J. New method in information processing for maintaining an efficient dynamic ordered set. Sci China Ser F-Inf Sci, 2009, 52(8): 1292–1301, doi: 10.1007/s11432-009-0074-0
Section 1 defines the RBT and section 2 gives 5 essential RBT-based algorithms to support a dynamic ordered set. We provide some application examples in section 3 and conclude in section 4.

1 Data Structure: RBT

Trees, especially binary trees, are an important data structure in computer science and have numerous applications in search and sorting[1,14]. In this section, we propose a new data structure named rich binary tree (RBT), which is defined by three requirements. We assume that all keys have the same word length L and are pairwise distinct; the conclusions extend easily to general cases. We define the RBT as follows.

Definition 1. Suppose T is a binary tree with n keys. We say T is an RBT if it satisfies:
1) For each key K of the tree T with left child key Kl and right child key Kr, we have Kl < K < Kr.
2) Along the root-to-leaf path to K, we obtain a binary string S by writing down a 0 when moving to a left child and a 1 when moving to a right child. Then S must be a prefix of K.
3) Besides the pointers to its children, each key K keeps two further pointers: one to its direct parent key K̄, and one to its indirect parent key K̃, which is defined as the last cornered key on the root-to-leaf path to K itself. At the same time, the most significant difference bits (MSDBs) between K and K̄, and between K and K̃, are kept.

Figure 1(a) gives an example of an RBT. It contains 10 keys: 1, 2, 3, 4, 5, 6, 7, 12, 13 and 15. We can easily verify that it meets the above three requirements. For example, the root-to-leaf path from H to G has one left turn and two right turns, which induces the binary string 011, exactly a prefix of the key 7 (binary code: 0111). In addition, the solid-line arrows point to direct parents, while the dash-line arrows point to indirect parents.
Figures 1(b)–1(d) show the building process of an RBT: 1) create a complete binary tree whose depth equals the word length; 2) arrange all the keys on the bottom level according to the digital-search-tree property (see Figure 1(b)); and 3) in a bubble-like fashion, repeatedly fill each empty node with the maximum key of its left subtree or the minimum key of its right subtree (see Figures 1(c) and (d)). Obviously, an RBT always exists but is generally not unique.

An RBT has more constraints than a general binary tree, following both the binary-search-tree property and the digital-search-tree property. Furthermore, it keeps the internal relation between keys. Therefore nice properties can be expected.

Theorem 1. Let T be an RBT and L be the word length. Then we have
1) T has depth no more than L + 1.
2) In-order traversal gives a monotonically increasing sequence of keys.
3) Let K be a key of T, and let K̄ and K̃ be respectively its direct parent key and its indirect parent key. Then K ∈ (min(K̄, K̃), max(K̄, K̃)).
4) Assume that the keys K1, K2, ..., Km are distributed along a root-to-leaf path. Then their respective direct parent keys and indirect parent keys induce a sequence of nested intervals. That is,
(min(K̄1, K̃1), max(K̄1, K̃1)) ⊃ (min(K̄2, K̃2), max(K̄2, K̃2)) ⊃ ... ⊃ (min(K̄m, K̃m), max(K̄m, K̃m)),
where, by convention, K̄ = 0 if K has no direct parent, and
K̃ = 0 if K̄ > K, K̃ = ∞ otherwise,
if K has no indirect parent.

Figure 1 An RBT built from the keys 1, 2, 3, 4, 5, 6, 7, 12, 13 and 15. (a) An RBT: solid-line arrows point to direct parents, while dash-line arrows point to indirect parents; (b)–(d) in a bubble-like fashion, we can obtain an RBT different from (a). The rule is to fill each blank node with the minimum key of its right subtree or the maximum key of its left subtree.

In fact, the proof of Theorem 1 is straightforward.
The first proposition of Theorem 1 follows from the digital-search-tree property, while the other three depend on the binary-search-tree property. Taking Figure 1(a) as an example, we observe that along the path H→D→E→G→F, the sequence of nested intervals is
(0, ∞) ⊃ (0, 1100) ⊃ (0100, 1100) ⊃ (0101, 1100) ⊃ (0101, 0111).
This shows that the nearer to the bottom level the keys are located, the closer they are. More precisely, we have the following lemma.

Lemma 1. The further to the right the most significant difference bit of two keys lies, the closer the two keys are; and vice versa. In detail,
1) if three keys K1, K2, K3 satisfy K1 < K2 < K3 or K1 > K2 > K3, then d(K1, K2) > d(K1, K3), where d(·, ·) denotes the MSDB position, being 1 at the leftmost bit, 2 at the next bit, and so on;
2) if d(K1, K2) > d(K1, K3), then K1 > K3 (resp., K1 < K3) implies K2 > K3 (resp., K2 < K3); if d(K1, K2) = d(K1, K3), then K1 > K3 (resp., K1 < K3) implies K1 > K2 (resp., K1 < K2).

Combining Lemma 1 with the last conclusion of Theorem 1, we get the following corollary.

Corollary 1. Suppose that K1, K2, ..., Km is the sequence of keys along a root-to-leaf path. Then the MSDB sequence {max(d(Ki, K̄i), d(Ki, K̃i))} is monotonically non-decreasing with respect to the subscript i.

Corollary 1 reveals that, in the sense of MSDB, the keys of an RBT are well arranged. As we know, maintaining a dynamic ordered set requires frequent comparisons between keys. Each comparison usually begins at the leftmost bit and costs repeated bit operations. With the MSDB information kept by the RBT, however, we can greatly reduce the amount of computation per comparison. This is the crux of improving the efficiency of maintaining an ordered set.
2 RBT-based implementation of dynamic ordered sets

In this section, we discuss the detailed implementation of RBT-based dynamic ordered sets, so that the following essential tasks can be done in O(L) time: 1) retrieving the maximum/minimum key; 2) returning the next-larger/next-smaller key; 3) searching for a given key; 4) inserting a new key; and 5) deleting a given key. We will see that these operations have a time bound that depends linearly on the word length and is independent of the number of keys.

For the purpose of describing the algorithms, we represent the node structure of an RBT in a C++ fashion:

    struct RBTNode {
        bitset<L> key;            // L is the word length
        RBTNode* leftChild;       // the left child
        RBTNode* rightChild;      // the right child
        RBTNode* directParent;    // the direct parent
        RBTNode* indirectParent;  // the indirect parent
        int directCursor;         // MSDB from its direct parent
        int indirectCursor;       // MSDB from its indirect parent
    };

We note that the variables directCursor and indirectCursor behave differently from general integers: they only support 1-bit movement to the right, so an implementation mechanism with good performance may exist for them.

2.1 Retrieving the minimum key

Suppose that the RBT already exists; we consider how to retrieve the minimum (maximum) key. If the RBT is empty, we do nothing. Otherwise the minimum key is bound to exist, and we find it by taking the left-child branch as long as possible. Taking Figures 1(a) and (d) as an example, node A holds the minimum key 1, while node J holds the maximum key 15. A pseudo-code algorithm follows; of course, Algorithm 1 runs in time O(L).

Algorithm 1 Retrieving the minimum key

    RBTNode* nodeOfMinKey = root;  // initialized to be the root node
    While (nodeOfMinKey->leftChild != NULL)
        nodeOfMinKey = nodeOfMinKey->leftChild;
    Return nodeOfMinKey;

2.2 Returning the next-larger key

Theorem 1 tells us that an in-order traversal induces a monotonically increasing sequence of keys, so the problem of finding the next-larger key reduces to finding the in-order successor. For a given key, if its right subtree is not empty, then the minimum key of the right subtree is the next-larger key. Otherwise, if the key itself is a left child, its direct parent key is the answer; and if it is a right child, its indirect parent key is exactly what we want. The pseudo-code algorithm is as follows.

Algorithm 2 Returning the next-larger key

    RBTNode* nodeOfNextKey = curNode->rightChild;  // initialized to be the right child
    If (nodeOfNextKey == NULL)
    {
        If (curNode->directParent == NULL)
            Return NULL;
        If (curNode->directParent->leftChild == curNode)
            Return curNode->directParent;
        Return curNode->indirectParent;
    }
    Else
    {
        While (nodeOfNextKey->leftChild != NULL)
            nodeOfNextKey = nodeOfNextKey->leftChild;
        Return nodeOfNextKey;
    }

2.3 Searching for a given key

Ordinary binary search trees, such as red-black trees, search for a key with a match check word by word from root to leaf. For example, to search for the key 0110 in Figure 1(a), we compare it with the root key. Since 0110 < 1100, the comparison goes to the left, and we compare it with nodes D, E, G, F in turn. At this point, node F contains the key 0110, and the search process is over.
Unlike general binary search trees, an RBT also keeps MSDB information, so we can reduce the number of bit comparisons and improve search efficiency. Assume that a, b, c are three keys of word length L, and d(·, ·) denotes the MSDB of two keys. Lemma 1 implies
d(a, c) = min(d(a, b), d(b, c)), if d(a, b) ≠ d(b, c);
d(a, c) > d(a, b), otherwise.
That is to say, if d(a, b) and d(b, c) are different, then d(a, c) is already determined; otherwise, the final cursor position of d(a, c) lies to the right of min(d(a, b), d(b, c)). So we may take min(d(a, b), d(b, c)) as the starting cursor position and slide the cursor to the right until we find the MSDB between a and c. Since the RBT already keeps MSDB information, this improves search efficiency.

As we know, an RBT is definitely a binary search tree. So when searching for a key c, we first compare it with the root key: if they are equal, return; if c is smaller, continue with the left child key; otherwise, continue with the right child key. We repeat this process until key c reaches the bottom level. For an in-depth study, assume the search path of key c is as Figure 2(a) shows. We number the keys as follows: denote the left-turn keys by a1, a2, ..., am1, and the right-turn keys by b1, b2, ..., bm2. Then we have
a1 < a2 < ... < am1 < c < bm2 < bm2−1 < ... < b1.
Before searching for key c, we create a node Nc for it to keep the MSDB information between c and the current key N.key on the search path. Since N already keeps the MSDB between N.key and N̄.key, we intend to compute the MSDB between c and N.key with the MSDB between c and N̄.key being known. This is not difficult using the cursor-sliding technique just discussed. Note that the ordering of the keys along the root-to-leaf path implies
d(c, a1) < d(c, a2) < ... < d(c, am1) and d(c, b1) < d(c, b2) < ... < d(c, bm2).
The two inequalities show that the cursor slides as Figure 2(b) shows, where the solid-line arrows show the real cursor sliding path, while the dash-line arrows show the changes of the starting cursor position. Obviously, using the field directCursor reduces the number of bit comparisons and therefore improves search efficiency.

Figure 2 Search for the key c. (a) The search path; (b) using only the field directCursor to optimize the search process, where solid arrows show the real cursor sliding path, while dash-line arrows show the skips of the cursor; (c) using both fields directCursor and indirectCursor to further optimize the search process.

However, the cursor may also skip backward rather than slide one way, so we cannot yet bound the total sliding distance by O(L). To improve the search algorithm to O(L), we have to use the other field, indirectCursor. Let the current node on the root-to-leaf path be N, and let its direct parent and indirect parent be N̄ and Ñ respectively. Then both key c and N.key lie between N̄.key and Ñ.key. But we still do not know whether key c lies between N̄.key and N.key, or between N.key and Ñ.key. For this purpose, we need to determine which of the two cursors Nc.directCursor and Nc.indirectCursor lies further to the right. If Nc.directCursor is the rightmost one, we take min(Nc.directCursor, N.directCursor) as the starting sliding position when comparing c and N.key; otherwise, we take min(Nc.indirectCursor, N.indirectCursor) as the starting position. This can also be proved from Lemma 1. Taking d(c, a1) < d(c, a2) < ... < d(c, am1) and d(c, b1) < d(c, b2) < ... < d(c, bm2) into account, we can prove that the total sliding distance is bounded by L. In addition, the digital-search-tree property is also very important: once the search path fails to match the prefix of key c, we can infer that the search key does not exist.
The detailed pseudo-code algorithm is as follows.

Algorithm 3 Searching for a given key

    RBTNode Nc;                       // create a node for key c
    Nc.key = c;
    Nc.directCursor = Nc.indirectCursor = 0;
    int depthCursor = 0;              // checks whether the search path matches the prefix of c
    bool fGoRight(true);              // keeps track of the indirect parent of node Nc
    RBTNode* N = root;                // the current node on the root-to-leaf path
    While (N != NULL)
    {
        bool fGoRightOld(fGoRight);   // save a copy of fGoRight
        int diffPos;                  // the MSDB between key c and the current key on the path
        // compute the starting sliding position
        If (Nc.directCursor > Nc.indirectCursor)
            diffPos = min(Nc.directCursor, N.directCursor);
        Else
            diffPos = min(Nc.indirectCursor, N.indirectCursor);
        // if the cursor needs a backward skip, diffPos already gives the MSDB
        While (Nc.key.at(diffPos) != -1 && Nc.key.at(diffPos) == N.key.at(diffPos))
            diffPos++;                // slide rightward to find the MSDB between c and N.key
                                      // (at(·) returns -1 past the last bit, i.e., the keys are equal)
        If (Nc.key.at(diffPos) == -1)
            Break;                    // N.key is exactly the key we want
        If (Nc.key.at(depthCursor) == Nc.key.at(diffPos))
        // This condition has two meanings:
        //   if Nc.key.at(diffPos) == 1, then c is larger, so we continue down the right child;
        //   by the digital-search-tree property, key c must match the search path.
        {
            fGoRight = Nc.key.at(diffPos);            // if the bit is 1, go down the right child
            If (fGoRight != fGoRightOld)
                Nc.indirectCursor = Nc.directCursor;  // at a turning corner, update the indirect parent
            Nc.directCursor = diffPos;                // N becomes the direct parent of Nc in the next step
            If (fGoRight) N = N.rightChild;
            Else N = N.leftChild;
        }
        Else
        {
            N = NULL;                 // the tree does not contain key c; terminate
        }
        depthCursor++;                // go one level deeper
    }
    Return N;  // if N is not NULL, we have found key c; otherwise the tree does not contain c.
It is easy to see that all the properties of the RBT are used in Algorithm 3. The fields directCursor and indirectCursor play an important role in obtaining an O(L)-time algorithm. As Figure 2(c) shows, every slide begins at a position no smaller than before, while every backward skip directly yields the MSDB. This ensures that Algorithm 3 runs in time O(L).

2.4 Inserting a new key

In this section, we discuss how to insert a new key c without losing the properties of the RBT. Naturally, we use Algorithm 3 to locate key c, then push the affected keys down, and finally rebuild the fields directCursor and indirectCursor. The process is shown in Figure 3, where the first subfigure shows the simplest case and the others show a general insertion process.

First, we use Algorithm 3 to find the location where key c should be inserted. Three cases may occur. If key c already exists in the tree, we stop. If the intended location is at the bottom level, we insert c as a leaf (see Figure 3(a)). Otherwise, there must be a node that blocks the way when pushing c down; this case results from a mismatch between the search path and the prefix of c. Taking Figure 3(b) as an example, assume c < a1. To keep the binary-search-tree property, key c should go to the left; however, this would destroy the digital-search-tree property. At this point, we have to insert key c at the location of node a1 and push a1 down, such that a1 becomes the minimum key in the right subtree of c. Similarly, if a1 is blocked by another node, we proceed in the same way, as Figures 3(c) and (d) show. Of course, we must maintain the fields directParent, indirectParent, directCursor and indirectCursor along the way. The pseudo-code algorithm is as follows. Note that Algorithm 4 only sketches the insertion process; in practice, our implementation of this function takes about 167 lines of code.
To analyze the time complexity, we observe that for each affected node, at most one of the fields directCursor and indirectCursor (correspondingly, directParent and indirectParent) needs changing, and the resulting MSDBs are monotonically increasing along the root-to-leaf path. For example, in Figures 3(b)–(d) we have
d(c, a2) ≤ d(c, a3) ≤ d(a1, a3) ≤ d(a1, a5) ≤ d(a1, a6) ≤ d(a4, a6).
According to the discussion in the previous section, the total cursor sliding distance for recomputing the affected MSDBs is bounded by L. Therefore, Algorithm 4 runs in time O(L).

Figure 3 Inserting a new key c. (a) The simplest case: insert the key as a leaf; (b) the key c is intended to be inserted at the location of the branch node a1; (c) push the node a1 down until the node a4 blocks the way; (d) the insertion ends with the node a4 being inserted as a leaf.

Algorithm 4 Inserting a new key

    RBTNode Nc;  // create a node for the new key c
    Nc.key = c;
    Nc.directCursor = Nc.indirectCursor = 0;
    // use Algorithm 3 to locate key c
    Case 1: key c already exists; exit.
    Case 2: Nc can be inserted as a leaf, as Figure 3(a) shows; insert it and exit.
    Case 3: Nc is blocked by another node Nb; then do as Figures 3(b)–(d) show:
        Step 1. Replace Nb with Nc, and push Nb down.
        Step 2. If Nb can be inserted as a leaf, insert it and exit; otherwise, go to Step 1.

2.5 Deleting a given key

Deleting a given key is the inverse of inserting a new key. It has two steps, namely searching for the given key and filling in the vacated node, as shown in Figure 4. The easiest case is deleting a leaf node, as Figure 4(a) shows; we only need to update its parent's fields leftChild and rightChild. Otherwise, the to-be-deleted node is a branch node, at least one of whose subtrees is not empty.
Without loss of generality, suppose the to-be-deleted node a1 has a non-empty right subtree, as Figure 4(b) shows. After node a1 is removed, we fill the blank node with the minimum key of its right subtree, so that the binary-search-tree property still holds. This may create another blank node, and we repeat the process until the blank node is a leaf. In our implementation, we first collect the to-be-moved nodes into a stack. The pseudo-code algorithm is as follows.

Algorithm 5 Deleting a given key

    Step 1. Search for the given key.
    // Suppose that the to-be-deleted node is a branch node.
    Step 2. Following the rule of filling each blank node with either the minimum key of its right subtree or the maximum key of its left subtree, collect the to-be-moved keys into a stack.
    Step 3. Repeatedly pop the top element and put it into the corresponding blank node.
    Step 4. Recompute the affected fields directCursor and indirectCursor.

As with insertion, the affected fields directCursor and indirectCursor are monotonically increasing. Taking Figure 4(c) as an example, we observe that
d(a4, a2) ≤ d(a4, a3) ≤ d(a6, a3) ≤ d(a6, a5) ≤ d(a8, a5) ≤ d(a8, a7).
So Algorithm 5 is also an O(L)-time algorithm.

We have now given the 5 essential algorithms for maintaining a dynamic ordered set, all running in time O(L); hence RBT-based dynamic ordered sets have O(L) time complexity.

Figure 4 Deleting a given key. (a) The easiest case: delete a leaf; (b) delete a branch node; (c) in a bubble-like style, we fill each blank node with the minimum key of its right subtree or the maximum key of its left subtree.

Figure 5 Comparison between quick-sort and RBT-based sorting.
The keys are respectively 64-bit and 128-bit long, and the numbers of keys are 1000, 2000, 4000, 8000, 16000, 32000, 64000 and 128000.

3 Applications

In section 2, we described the essential algorithms for maintaining a dynamic ordered set. In fact, many classic problems, such as searching, sorting and maintaining a priority queue, can be solved with these algorithms. Here we test the sorting problem to show that our algorithm outperforms traditional sorting algorithms. As far as average performance is concerned, quick-sort[12] is considered the best sorting algorithm, so we compare the performance of our algorithm against quick-sort. The test is made on a computer with a 3.00 GHz Pentium(R) 4 CPU and 2.0 GB RAM. The keys are respectively 64-bit and 128-bit long, and the numbers of keys are 1000, 2000, 4000, 8000, 16000, 32000, 64000 and 128000. The experimental results are shown in Figure 5. For example, when the keys are 128 bits long and 128000 in number, quick-sort costs 3140 ms, while RBT-based sorting costs only 1376 ms. Furthermore, the more keys there are, the bigger the advantage of RBT-based sorting, and its running time increases linearly with the number of keys. We also believe that data structures such as the sets and maps of programming languages can be implemented based on the RBT.

4 Conclusion

This paper presents an important data structure named RBT, which has both the binary-search-tree property and the digital-search-tree property, and moreover keeps the internal relation between keys. We then gave 5 essential algorithms for maintaining a dynamic ordered set, i.e. 1) retrieving the maximum/minimum key; 2) returning the next-larger/next-smaller key; 3) searching for a given key; 4) inserting a new key; and 5) deleting the given key.
They all run in time O(L), where L is the word length. Experimental results show that the RBT-based algorithms perform well. We believe that the new data structure RBT and the RBT-based algorithms will enable us to solve problems concerning order with high efficiency.

References

1 Cormen T H, Leiserson C E, Rivest R L, et al. Introduction to Algorithms. 2nd ed. Cambridge: MIT Press, 1990. 123–320
2 Li X M, Garzarán M J, Padua D. A dynamically tuned sorting library. In: CGO '04: Proceedings of the International Symposium on Code Generation and Optimization, Palo Alto, 2004. 111
3 Graefe G. Implementing sorting in database systems. ACM Comput Surv, 2006, 38(3): 10
4 Andersson A, Thorup M. Dynamic ordered sets with exponential search trees. J ACM, 2007, 54(3): 13
5 Blandford D K, Blelloch G E. Compact representations of ordered sets. In: SODA '04: Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, New Orleans, 2004. 11–19
6 Sedgewick R. Algorithms. 2nd ed. Massachusetts: Addison-Wesley, 1983. 91–170
7 Andersson A, Hagerup T, Håstad J, et al. The complexity of searching a sorted array of strings. In: STOC '94: Proceedings of the Twenty-sixth Annual ACM Symposium on Theory of Computing, Québec, 1994. 317–325
8 Bentley J L, Sedgewick R. Fast algorithms for sorting and searching strings. In: SODA '97: Proceedings of the Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, New Orleans, 1997. 360–369
9 Siegel D E. All searches are divided into three parts: string searches using ternary trees. In: APL '98: Proceedings of the APL98 Conference on Array Processing Language, Rome, 1998. 57–68
10 Brodal G S. Finger search trees with constant insertion time. In: SODA '98: Proceedings of the Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, San Francisco, 1998. 540–549
11 Andersson A, Thorup M. Dynamic string searching. In: SODA '01: Proceedings of the Twelfth Annual ACM-SIAM Symposium on Discrete Algorithms, Washington, 2001. 307–308
12 Hoare C A R. Algorithm 64: Quicksort. Commun ACM, 1961, 4(7): 321
13 Williams J W J. Algorithm 232: Heapsort. Commun ACM, 1964, 7: 347–348
14 Knuth D E. Fundamental Algorithms. 3rd ed. Massachusetts: Addison-Wesley, 1997. 1–650
15 Penttonen M, Katajainen J. Notes on the complexity of sorting in abstract machines. BIT, 1985, 25(4): 611–622
16 Han Y J. Deterministic sorting in O(n log log n) time and linear space. In: STOC '02: Proceedings of the Thirty-fourth Annual ACM Symposium on Theory of Computing, Québec, 2002. 602–608
17 Thorup M. Integer priority queues with decrease key in constant time and the single source shortest paths problem. J Comput Syst Sci, 2004, 69(3): 330–353
18 Thorup M. On RAM priority queues. SIAM J Comput, 2000, 30(1): 86–109
19 Arge L, Bender M A, Demaine E D, et al. Cache-oblivious priority queue and graph algorithm applications. In: STOC '02: Proceedings of the Thirty-fourth Annual ACM Symposium on Theory of Computing, Québec, 2002. 268–276
20 Yang L, Song T. The array-based bucket sort algorithm. J Comput Res Devel, 2007, 44(2): 341–347
21 Yang J W, Liu J. Quick page sorting algorithm based on quick sorting. Comput Eng, 2005, 31(4): 82–84
22 Zhong H, Chen Q H, Liu G S. A byte-quick sorting algorithm. Comput Eng, 2002, 28(12): 39–40
23 Huo H W, Xu J. A study on quicksort algorithm. Microelectr Comput, 2002, 19(6): 6–9
24 Tang W T, Mong Goh R S, Thng I L. Ladder queue: An O(1) priority queue structure for large-scale discrete event simulation. ACM Trans Model Comput Simul, 2005, 15(3): 175–204