Tries

Standard Tries: a standard trie is an ordered tree with the following properties:
- Each node of the tree, except the root, is labelled with a character.
- The children of an internal node are ordered by the canonical order of the alphabet.
- The path from the root to a leaf node yields a string of the set.

Strings = {bear, bell, bid, bull, buy, sell, stock, stop}

[Figure: standard trie for the strings above, one character per node]

- For s strings in the set, there are s leaf nodes (provided no string is a prefix of another).
- The height of the tree equals the length of the longest string.

Compressed Tries: a compressed trie is similar to a standard trie, but each internal node has at least two children. Chains of single-child nodes are merged into one node labelled with a substring.

Strings = {bear, bell, bid, bull, buy, sell, stock, stop}

[Figure: compressed trie for the strings above]
- root: children b, s
- b: children e, id, u
- e: children ar, ll
- u: children ll, y
- s: children ell, to
- to: children ck, p

- For s strings in the set, there are still s leaf nodes.

Compressed Tries, node representation: the strings are kept in an array S, and each node stores indices into S instead of its label.

         0 1 2 3 4
  S[0] = b e a r
  S[1] = b e l l
  S[2] = b i d
  S[3] = b u l l
  S[4] = b u y
  S[5] = s e l l
  S[6] = s t o c k
  S[7] = s t o p

Each node is a tuple (i, j, k): the first element is the index of a string in S, the second is the start index of the label within that string, and the third is its end index (inclusive), so the label is S[i][j..k].

  b   = (0,0,0)    e   = (0,1,1)    ar = (0,2,3)    ll = (1,2,3)
  id  = (2,1,2)    u   = (3,1,1)    ll = (3,2,3)    y  = (4,2,2)
  s   = (5,0,0)    ell = (5,1,3)    to = (6,1,2)    ck = (6,3,4)
  p   = (7,3,3)
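
The standard and compressed tries described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation behind the slides: it assumes nested dictionaries as nodes and that no string is a prefix of another, and the helper names (build_trie, contains, count_leaves, height, compress) are made up for this example.

```python
def build_trie(words):
    """Standard trie as nested dicts: each key is one character (assumed layout)."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})  # reuse the child if it already exists
    return root


def contains(trie, word):
    """True if `word` ends at a leaf; assumes no stored string is a prefix of another."""
    node = trie
    for ch in word:
        if ch not in node:
            return False
        node = node[ch]
    return not node  # an empty dict is a leaf


def count_leaves(node):
    """Number of leaves below (and including) `node`."""
    if not node:
        return 1
    return sum(count_leaves(child) for child in node.values())


def height(node):
    """Number of edges on the longest root-to-leaf path."""
    if not node:
        return 0
    return 1 + max(height(child) for child in node.values())


def compress(node):
    """Merge chains of single-child nodes so every internal node has >= 2 children."""
    compressed = {}
    for label, child in node.items():
        while len(child) == 1:  # follow the chain and extend the label
            next_label, next_child = next(iter(child.items()))
            label += next_label
            child = next_child
        compressed[label] = compress(child)
    return compressed


WORDS = ["bear", "bell", "bid", "bull", "buy", "sell", "stock", "stop"]
trie = build_trie(WORDS)
compressed = compress(trie)

print(count_leaves(trie))       # 8, one leaf per string
print(height(trie))             # 5, the length of "stock"
print(sorted(compressed["b"]))  # ['e', 'id', 'u']
```

Compression leaves the number of leaves unchanged; only the chains of single-child nodes disappear.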
Suffix Tries: a suffix trie is a trie whose string collection consists of all suffixes of a single string.

String = "minimize"
Suffixes = {e, ze, ize, mize, imize, nimize, inimize, minimize}

[Figure: standard suffix trie for "minimize", one character per node]

[Figure: compressed suffix trie for "minimize"]
- root: children e, i, mi, nimize, ze
- i: children mize, nimize, ze
- mi: children nimize, ze

Suffix Tries, node representation: each node stores indices into the string instead of its label.

  0 1 2 3 4 5 6 7
  m i n i m i z e

Each node is a tuple (start, end): the first element is the start index of the label in the string and the second is its end index (inclusive).

  e = (7,7)    i = (1,1)    mi = (0,1)    mize = (4,7)    nimize = (2,7)    ze = (6,7)
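
As a rough sketch of the suffix trie above, the Python below inserts every suffix of "minimize" into a nested-dictionary trie, uses it to test whether a pattern occurs in the string, and then builds the compressed form with (start, end) index pairs as node labels. The function names are hypothetical, and the code assumes no suffix is a prefix of another (true for "minimize"; other strings would typically need an end marker).

```python
def build_suffix_trie(text):
    """Standard suffix trie as nested dicts, one character per edge (assumed layout)."""
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.setdefault(ch, {})
    return root


def is_substring(trie, pattern):
    """A pattern occurs in the text iff it labels a path starting at the root."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


def compact(node, text):
    """Compressed suffix trie whose keys are (start, end) index pairs into `text`."""
    result = {}
    for label, child in node.items():
        while len(child) == 1:  # merge single-child chains into one label
            next_label, next_child = next(iter(child.items()))
            label += next_label
            child = next_child
        start = text.find(label)  # any occurrence would do; take the first
        result[(start, start + len(label) - 1)] = compact(child, text)
    return result


TEXT = "minimize"
suffix_trie = build_suffix_trie(TEXT)
compact_trie = compact(suffix_trie, TEXT)

print(is_substring(suffix_trie, "nimi"))  # True
print(is_substring(suffix_trie, "zen"))   # False
print(sorted(compact_trie))               # [(0, 1), (1, 1), (2, 7), (6, 7), (7, 7)]
```

Storing index pairs keeps each node at constant size regardless of how long its label is.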
- The World Wide Web contains a huge collection of text documents.
- Information about these documents is gathered by a web crawler.
- A search engine lets users retrieve relevant information by keyword.
- The information stored by a search engine is a dictionary called an inverted index.
- The words in the dictionary are called index terms.
- An array stores the occurrence list of each term (the documents in which the term appears).
- A compressed trie is used for the set of index terms.
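
The inverted index outlined above might look roughly like the following Python sketch. It rests on several assumptions that are not in the slides: the class and method names are hypothetical, documents are identified by integers and added one at a time, the trie is left uncompressed for brevity, and each term's final trie node stores the array position of its occurrence list.

```python
SLOT = ""  # hypothetical key for the array position; cannot clash with 1-char keys


class InvertedIndex:
    def __init__(self):
        self.trie = {}         # index terms, one character per edge (uncompressed here)
        self.occurrences = []  # array of occurrence lists, one per index term

    def add_document(self, doc_id, text):
        """Record every whitespace-separated word of `text` under `doc_id`."""
        for word in text.lower().split():
            node = self.trie
            for ch in word:
                node = node.setdefault(ch, {})
            if SLOT not in node:  # first time this term is seen
                node[SLOT] = len(self.occurrences)
                self.occurrences.append([])
            docs = self.occurrences[node[SLOT]]
            if not docs or docs[-1] != doc_id:  # avoid duplicates within one document
                docs.append(doc_id)

    def lookup(self, word):
        """Return the occurrence list for `word`, or [] if it is not an index term."""
        node = self.trie
        for ch in word.lower():
            if ch not in node:
                return []
            node = node[ch]
        slot = node.get(SLOT)
        return [] if slot is None else self.occurrences[slot]


index = InvertedIndex()
index.add_document(0, "stock bull buy")
index.add_document(1, "bear stock sell stock")

print(index.lookup("stock"))  # [0, 1]
print(index.lookup("bear"))   # [1]
print(index.lookup("sto"))    # [] because "sto" is only a prefix, not an index term
```

Keeping the occurrence lists in a separate array means the trie itself only holds the terms and small integer positions.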