Data Structures for Databases
He Tan, hetan@ida.liu.se, IISLAB, IDA
Feb, 2008 (slide footer: september 2008)

Overview
[Figure: the real world is captured in a model and stored in databases; queries go in and answers come out. The DBMS handles the processing of queries and updates and the access to stored data in the physical database.]

Assume a data file EMPLOYEE with 1,000,000 records of size 100 bytes and a block size of 4,096 bytes. Access a record under a certain query condition.
• Blocking factor: $bfr = \lfloor 4096 / 100 \rfloor = 40$ records per block
• Number of blocks: $b = \lceil 1000000 / 40 \rceil = 25000$ (25,000 blocks)

select * from EMPLOYEE where SSN = '1234567890';
• If the data file is ordered on the key field SSN, do a binary search on the file: $\lceil \log_2 b \rceil = \lceil \log_2 25000 \rceil = 15$ block accesses.
• If a primary index is specified on the ordering field SSN (index entry size 10 bytes), do a binary search on the index: $bfr_i = \lfloor 4096 / 10 \rfloor = 409$, $b_i = \lceil 25000 / 409 \rceil = 62$, $\lceil \log_2 b_i \rceil = \lceil \log_2 62 \rceil = 6$, and $6 + 1 = 7$ block accesses including the data block.

select * from EMPLOYEE where NAME = 'Wood, Donald';
• The file is not ordered on the key field NAME (we assume that every employee has a unique name), so do a linear search: on average $b / 2 = 25000 / 2 = 12500$ block accesses.
• If a secondary index is specified on the non-ordering field NAME (index entry size 10 bytes), do a binary search on the index: $bfr_i = \lfloor 4096 / 10 \rfloor = 409$, $b_i = \lceil 1000000 / 409 \rceil = 2445$, $\lceil \log_2 b_i \rceil = \lceil \log_2 2445 \rceil = 12$, and $12 + 1 = 13$ block accesses including the data block.
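The block-access counts for the two EMPLOYEE queries can be reproduced with a short script. This is a minimal sketch: the function name `access_costs` and its parameter names are illustrative and not taken from the slides, and it assumes the same rounding as the slide arithmetic (blocking factors rounded down, block counts and logarithms rounded up).

```python
from math import ceil, log2


def access_costs(records=1_000_000, record_size=100,
                 block_size=4096, index_entry_size=10):
    """Block accesses for the EMPLOYEE example (all names are illustrative)."""
    bfr = block_size // record_size            # records per data block
    b = -(-records // bfr)                     # data blocks, rounded up
    bfr_i = block_size // index_entry_size     # index entries per index block
    b_primary = -(-b // bfr_i)                 # primary index: one entry per data block
    b_secondary = -(-records // bfr_i)         # secondary index: one entry per record
    return {
        "bfr": bfr,
        "b": b,
        "binary_search_on_file": ceil(log2(b)),
        "linear_search_average": b // 2,
        "primary_index": ceil(log2(b_primary)) + 1,      # +1 for the data block
        "secondary_index": ceil(log2(b_secondary)) + 1,  # +1 for the data block
    }


costs = access_costs()
print(costs)
```

Changing `index_entry_size` or `block_size` shows how strongly the index blocking factor drives the number of block accesses.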
What do we learn?
• How to make more efficient kinds of indexes: multilevel indexing, index on multiple keys, hashing techniques
TDDB38/TDDI60 - HT 2004

Multilevel Indexes
• "Index on the index".
• The first (base) level index can be a primary, clustering or secondary index, as long as the base level index has a distinct index value for every entry.
• Every higher level, i.e. the i-th level index with i > 1, is a primary index.
• A multilevel index reduces the search space of the index by the blocking factor of the index at each level. This blocking factor is called the fan-out (fo): $fo_i = \lfloor B / R_i \rfloor$, where $B$ is the block size and $R_i$ is the size of an index entry.
• How many levels? Until all entries of the top level fit in one block, which takes about $\lceil \log_{fo} b \rceil$ levels.

Multilevel index: example
Assume again the data file EMPLOYEE with 1,000,000 records of size 100 bytes and a block size of 4,096 bytes ($bfr = 40$, $b = 25000$ blocks).
select * from EMPLOYEE where SSN = '1234567890';
• If a primary index is specified on the ordering field SSN (index entry size 10 bytes), a binary search on the single-level index costs: $bfr_i = \lfloor 4096 / 10 \rfloor = 409$, $b_i = \lceil 25000 / 409 \rceil = 62$, $\lceil \log_2 62 \rceil = 6$, and $6 + 1 = 7$ block accesses.
• With a multilevel index: $\lceil \log_{409} 25000 \rceil = 2$ index levels, and $2 + 1 = 3$ block accesses.

Problems with Multilevel Indexes
• Addition, deletion and update may be very expensive, since all levels are ordered files.
• Solutions: overflow file/markers + periodic reorganization, or a dynamic multilevel index structure, implemented by the B-tree data structure.

Search Tree
• A search tree is a tree that is used to guide the search for a record.
• An ordinary search tree of order p consists of nodes that have at most p-1 search values and p pointers.
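How a search tree guides a lookup can be sketched in a few lines. This is a minimal illustration, not code from the slides: the class `SearchTreeNode`, the function `find` and the small example tree are hypothetical, and the sketch assumes the usual convention that the i-th pointer of a node leads to values lying between the search values to its left and right.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SearchTreeNode:
    keys: List[int]                                            # at most p-1 search values
    children: Optional[List[Optional["SearchTreeNode"]]] = None  # p pointers; None = all null


def find(root, x, p=3):
    """Return True if x is stored in the search tree of order p rooted at root."""
    node = root
    while node is not None:
        if len(node.keys) > p - 1:
            raise ValueError("node holds more than p-1 search values")
        i = bisect_left(node.keys, x)          # number of search values smaller than x
        if i < len(node.keys) and node.keys[i] == x:
            return True
        # Follow pointer i: it leads to values between keys[i-1] and keys[i].
        node = node.children[i] if node.children else None
    return False


# Hypothetical tree of order p=3 (it does not have to be balanced).
root = SearchTreeNode([5], [
    SearchTreeNode([3], [SearchTreeNode([1]), None]),
    SearchTreeNode([6, 9], [None, SearchTreeNode([7, 8]), SearchTreeNode([12])]),
])
print(find(root, 8), find(root, 4))
```

Unlike the B-tree introduced later, nothing here keeps the leaves on the same level, so the search cost depends on the shape of the tree.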
Each node of the search tree has the form
$\langle P_1, K_1, P_2, K_2, \ldots, P_{q-1}, K_{q-1}, P_q \rangle$
where $q \le p$ and each $P_i$ is a pointer to a child node (or a null pointer).
1. Within each node, $K_1 < K_2 < \ldots < K_{q-1}$.
2. For all values $X$ in the subtree pointed at by $P_i$: $X < K_1$ if $i = 1$; $K_{i-1} < X < K_i$ if $1 < i < q$; $K_{q-1} < X$ if $i = q$.

Search Tree: Example, order p=3
[Figure: example search tree of order p=3.]

B-Tree
• B-tree (balanced tree): all leaves are on the same level, so the depth of the tree is as small as possible.
• Each search value $K_i$ has an associated pointer $Pr_i$ to the data record with that search value, in addition to the node pointer $P_i$.
• All nodes except the root and the leaves have at most $p$ pointers and at least $\lceil p/2 \rceil$ pointers.

B-Tree: Example, order p=3
[Figure: example B-tree of order p=3.]

B-tree: Order
One node must fit in one block:
$p \cdot P_{block} + (p - 1) \cdot (P_{record} + K) \le B \Rightarrow p \le \frac{B + P_{record} + K}{P_{block} + P_{record} + K}$
where
  B         block size
  p         order, number of block pointer entries in a node
  P_block   size of a block pointer
  P_record  size of a record pointer
  K         size of a search key field

B-tree: Number of entries
Given: B = 4096 bytes, P_record = 16 bytes, P_block = 8 bytes, K = 64 bytes, fill percentage = 70%, so $p \le 47$.

  Level     Nodes     Pointers              Entries
  Root      1         0.7 * 47 ≈ 33         33 - 1 = 32
  Level 1   33        33 * 33 = 1,089       33 * 32 = 1,056
  Level 2   1,089     33^3 = 35,937         33^2 * 32 = 34,848
  Level 3   35,937    33^4 = 1,185,921      33^3 * 32 = 1,149,984

The number of entries that can be held in a 3-level B-tree: 1,149,984 (the entries of level 3; summing the entries of all levels in the table gives 1,185,920).

B+-tree
• A variation of the B-tree. Most commonly used.
• Data pointers are stored only in leaf nodes.
• The leaf nodes are usually linked to provide ordered access.

B+-Tree: Example, order p=3, pleaf=2
Order of insertion: 8, 5, 1, 7, 3, 12, 9, 6
[Figure: the resulting tree. Root: 5. Internal nodes: 3 | 7 8. Leaves: 1 3 | 5 | 6 7 | 8 | 9 12.]

B+-trees of order p: Internal nodes
1. Each internal node is of the form $\langle P_1, K_1, P_2, K_2, \ldots, P_{q-1}, K_{q-1}, P_q \rangle$.
2. Within each internal node, $K_1 < K_2 < \ldots < K_{q-1}$.
3. For all search field values $X$ in the subtree pointed at by $P_i$: $X \le K_i$ for $i = 1$; $K_{i-1} < X \le K_i$ for $1 < i < q$; $K_{i-1} < X$ for $i = q$.
4. Each internal node has at most $p$ tree pointers.
5. Each internal node, except the root, has at least $\lceil p/2 \rceil$ tree pointers. The root node has at least two tree pointers if it is an internal node.
6. An internal node with $q$ pointers ($q \le p$) has $q - 1$ search field values.

B+-trees of order p: Leaf nodes
1. Each leaf node is of the form $\langle \langle K_1, P_1 \rangle, \langle K_2, P_2 \rangle, \ldots, \langle K_{q-1}, P_{q-1} \rangle, P_{next} \rangle$.
2. Within each leaf node, $K_1 < K_2 < \ldots < K_{q-1}$.
3. Each entry contains a pointer to the record whose search field value corresponds to the entry.
4. Each leaf node has at least $\lceil p_{leaf}/2 \rceil$ values.
5. All leaf nodes are at the same level.
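The internal-node rule ($K_{i-1} < X \le K_i$) and the linked leaves can be illustrated with a small lookup sketch. The class and function names (`Leaf`, `Internal`, `find_leaf`, `contains`, `range_scan`, `example_tree`) are hypothetical and not from the slides; the tree built in `example_tree` is the one that results from the insertion order 8, 5, 1, 7, 3, 12, 9, 6 with p=3 and pleaf=2, and record pointers are left out for brevity.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Leaf:
    keys: List[int]
    next: Optional["Leaf"] = None          # P_next: link to the following leaf


@dataclass
class Internal:
    keys: List[int]
    children: List[Union["Internal", Leaf]]


def find_leaf(node, x):
    """Descend to the leaf that would hold x (pointer i covers K_{i-1} < X <= K_i)."""
    while isinstance(node, Internal):
        node = node.children[bisect_left(node.keys, x)]
    return node


def contains(root, x):
    return x in find_leaf(root, x).keys


def range_scan(root, lo, hi):
    """Collect all search values in [lo, hi] by following the leaf links."""
    result = []
    leaf = find_leaf(root, lo)
    while leaf is not None:
        for k in leaf.keys:
            if k > hi:
                return result
            if k >= lo:
                result.append(k)
        leaf = leaf.next
    return result


def example_tree():
    leaves = [Leaf([1, 3]), Leaf([5]), Leaf([6, 7]), Leaf([8]), Leaf([9, 12])]
    for a, b in zip(leaves, leaves[1:]):
        a.next = b
    left = Internal([3], leaves[:2])
    right = Internal([7, 8], leaves[2:])
    return Internal([5], [left, right])


tree = example_tree()
print(contains(tree, 8), range_scan(tree, 4, 9))
```

The range scan descends the tree only once and then walks the leaf chain, which is the ordered access that the linked leaves provide.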
B+-tree Order
One internal node must fit in one block:
$p \cdot P_{block} + (p - 1) \cdot K \le B \Rightarrow p \le \frac{B + K}{P_{block} + K}$
One leaf node must fit in one block:
$p_{leaf} \cdot (P_{record} + K) + P_{block} \le B \Rightarrow p_{leaf} \le \frac{B - P_{block}}{P_{record} + K}$
where
  B         block size
  p         order, number of pointer entries in an internal node
  p_leaf    number of record pointer entries in a leaf node
  P_block   size of a block pointer
  K         size of a search key field
  P_record  size of a record pointer

B+-trees
Given: B = 4096 bytes, P_record = 16 bytes, P_block = 8 bytes, K = 64 bytes, fill percentage = 70%, so $p \le 57$ and $p_{leaf} \le 51$.

  Level        Nodes     Pointers              Entries
  Root         1         0.7 * 57 ≈ 40         40 - 1 = 39
  Level 1      40        40 * 40 = 1,600       40 * 39 = 1,560
  Level 2      1,600     40^3 = 64,000         40^2 * 39 = 62,400
  Leaf level   64,000    record pointers: 64,000 * 0.7 * 51 = 2,284,800

For comparison, the number of entries that can be held in the 3-level B-tree: 1,149,984.

B+-trees Search
• Very fast searching in the index structure: $\log_{p \cdot f} N$, where
  N   number of search values
  p   order, number of block pointers per node
  f   fill factor, $0 \le f \le 1$

B+-Tree Search
Search: 8
[Figure: search path for 8 in the example tree. Root: 5. Internal nodes: 3 | 7 8. Leaves: 1 3 | 5 | 6 7 | 8 | 9 12.]

B+-trees Insertion and Deletion
• Insertion and deletion can be expensive.

B+-tree: Insertion
When an entry is inserted into a full leaf node, the node overflows:
• The first $\lceil (p_{leaf} + 1)/2 \rceil$ entries in the node are kept there; the remaining entries move to a new leaf.
• The search value of the $\lceil (p_{leaf} + 1)/2 \rceil$-th entry moves up (as a copy) to the parent.
• If the parent is full, it will overflow. The resulting split can propagate all the way up to the root.
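The leaf overflow rule can be written as a small function. This is a sketch under stated assumptions: `split_leaf` is a hypothetical name, a leaf is modelled as a sorted list of search values (record pointers are omitted), and the list passed in already contains the newly inserted value, so it holds pleaf + 1 entries.

```python
def split_leaf(entries, pleaf):
    """Split an overflowing leaf (pleaf + 1 sorted entries).

    Returns (kept entries, entries for the new leaf, search value copied to the parent).
    """
    if len(entries) != pleaf + 1:
        raise ValueError("a leaf overflows only when it holds pleaf + 1 entries")
    j = (pleaf + 2) // 2               # integer form of ceil((pleaf + 1) / 2)
    kept, moved = entries[:j], entries[j:]
    return kept, moved, kept[-1]       # the j-th search value goes up to the parent


# With pleaf = 2: inserting 1 into the full leaf [5, 8] gives [1, 5, 8].
print(split_leaf([1, 5, 8], 2))
```

The copied-up value stays in the left leaf as well, which is consistent with the rule that the subtree to the left of $K_i$ holds values $X \le K_i$.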
When an entry is added to a full internal node, the node overflows. With $j = \lfloor (p + 1)/2 \rfloor$:
• $\langle P_1, K_1, P_2, \ldots, P_{j-1}, K_{j-1}, P_j \rangle$ are kept there;
• $\langle P_{j+1}, K_{j+1}, \ldots, P_p, K_p, P_{p+1} \rangle$ move to a new internal node;
• $K_j$ moves up to the parent level (a new level if the root was split).

B+-Tree (p=3, pleaf=2): insertion example
The tree after each insertion (nodes on the same level are separated by |):
• Insert 8. Leaf: 8
• Insert 5. Leaf: 5 8
• Insert 1: overflow, create a new level. Root: 5. Leaves: 1 5 | 8
• Insert 7. Root: 5. Leaves: 1 5 | 7 8
• Insert 3: overflow, split. Root: 3 5. Leaves: 1 3 | 5 | 7 8
• Insert 12: overflow, split, propagates to a new level. Root: 5. Internal nodes: 3 | 8. Leaves: 1 3 | 5 | 7 8 | 12
• Insert 9. Root: 5. Internal nodes: 3 | 8. Leaves: 1 3 | 5 | 7 8 | 9 12
• Insert 6: overflow, split. Root: 5. Internal nodes: 3 | 7 8. Leaves: 1 3 | 5 | 6 7 | 8 | 9 12 (resulting B+-tree)

B+-tree: Deletion
• When a leaf node is less than half full, it causes an underflow.
• Redistribute entries with a sibling, or merge with a sibling.
• The resulting merging can also propagate to internal nodes.

B+-Tree (p=3, pleaf=2): deletion example
[Figures: the tree before each deletion.] Steps shown:
• Delete 5
• Delete 12: underflow, redistribute
• Delete 9: underflow, merge with the left sibling; the merge propagates and reduces the number of tree levels
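The choice between redistributing and merging on a leaf underflow can be sketched as follows. This is only an illustration under assumptions that the slides do not spell out: `fix_leaf_underflow` is a hypothetical name, leaves are plain sorted lists of search values, only the left sibling is considered, and a sibling may lend an entry only if it stays at or above the minimum of $\lceil p_{leaf}/2 \rceil$ values. The leaf contents in the usage lines are illustrative.

```python
def fix_leaf_underflow(left, right, pleaf):
    """Repair the leaf `right` using its left sibling `left`.

    Returns (left, right, separator); right and separator are None after a merge.
    """
    min_entries = (pleaf + 1) // 2             # integer form of ceil(pleaf / 2)
    if len(right) >= min_entries:
        return left, right, left[-1]           # no underflow, nothing to do
    if len(left) > min_entries:
        # Redistribute: the largest entry of the left sibling moves to the right leaf.
        right = [left[-1]] + right
        left = left[:-1]
        return left, right, left[-1]           # new separator = largest value on the left
    # Merge: the left sibling absorbs the remaining entries; the parent loses a pointer.
    return left + right, None, None


# pleaf = 2, so a leaf underflows when it becomes empty.
print(fix_leaf_underflow([8, 9], [], 2))   # sibling can spare an entry: redistribute
print(fix_leaf_underflow([8], [], 2))      # sibling is at the minimum: merge
```

After a merge the parent has one pointer fewer, which is how an underflow can propagate to internal nodes and, in the end, reduce the height of the tree.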
What do we learn?
• How to make more efficient kinds of indexes: multilevel indexing, index on multiple keys, hashing

Indexes on Multiple Keys
select * from EMPLOYEE where DEPT = 'CS' and AGE = '40'
• Possible strategies for processing this query using indexes on single attributes:
  1. Use the index on DEPT to find employees with DEPT = 'CS', then test them individually to see if AGE = '40'.
  2. Use the index on AGE to find employees with AGE = '40', then test them individually to see if DEPT = 'CS'.
  3. Use the DEPT index to find pointers to all records of the CS department, use the AGE index similarly, then take the intersection of both sets of pointers.
• Alternatively, use an ordered index on multiple attributes and treat the composite as a single value.
• If the set of records that matches each condition is large, but the combination is not, an index on the composite may be useful.
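The difference between intersecting two single-attribute indexes and using one ordered composite index can be shown on a toy table. This is a minimal in-memory sketch: the rows, the function names and the use of integer ages are assumptions made for the illustration, and sorted Python lists stand in for the ordered index files.

```python
from bisect import bisect_left, bisect_right

# Hypothetical EMPLOYEE rows: (record id, DEPT, AGE)
EMPLOYEES = [
    (0, "CS", 40), (1, "CS", 25), (2, "EE", 40),
    (3, "CS", 40), (4, "ME", 31), (5, "EE", 52),
]


def build_single_index(rows, column):
    """Map each value of one attribute to the set of matching record ids."""
    index = {}
    for row in rows:
        index.setdefault(row[column], set()).add(row[0])
    return index


def lookup_by_intersection(dept_index, age_index, dept, age):
    # Strategy 3: fetch both pointer sets, then intersect them.
    return dept_index.get(dept, set()) & age_index.get(age, set())


def build_composite_index(rows):
    """Ordered index on (DEPT, AGE), with the pair treated as a single value."""
    entries = sorted(((dept, age), rid) for rid, dept, age in rows)
    keys = [key for key, _ in entries]
    rids = [rid for _, rid in entries]
    return keys, rids


def lookup_composite(index, dept, age):
    keys, rids = index
    lo = bisect_left(keys, (dept, age))     # first entry equal to the composite value
    hi = bisect_right(keys, (dept, age))    # one past the last such entry
    return set(rids[lo:hi])


dept_index = build_single_index(EMPLOYEES, 1)
age_index = build_single_index(EMPLOYEES, 2)
composite = build_composite_index(EMPLOYEES)
print(lookup_by_intersection(dept_index, age_index, "CS", 40))
print(lookup_composite(composite, "CS", 40))
```

Both lookups return the same records, but the intersection touches every pointer for DEPT = 'CS' and for AGE = 40, while the composite lookup touches only the entries that match both conditions.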
What do we learn?
• How to make more efficient kinds of indexes: multilevel indexing, index on multiple keys, hashing techniques

External Hashing
[Figure: a hashing function maps the search key field to a bucket.]
• Overflow for buckets is handled by chaining.

Extendible Hashing
• Additional access structure: a directory of pointers to buckets.
• In the figures, d is the depth of the directory (global depth) and d' is the depth of a bucket (local depth).
[Figure: Insert 14. Directory with d=2 (entries 00, 01, 10, 11). Before: bucket A (4*, 24*, 12*, 16*) with d'=1, bucket B (1*, 5*) with d'=2, bucket C (15*, 7*, 19*) with d'=2. After: bucket A (4*, 24*, 12*, 16*) with d'=2 and a new bucket A' (14*) with d'=2.]
[Figure: Insert 28. The directory doubles from d=2 to d=3 (entries 000 to 111). Bucket A splits into A (24*, 16*) and A'' (4*, 12*, 28*), both with d'=3; buckets A', B and C keep d'=2.]

Extendible Hashing
• Extend: double the directory. Before the insert, the local depth of the bucket = the global depth; the insert causes the local depth to become > the global depth.
• Shrink: halve the directory. If removal of a data entry makes a bucket empty; if each directory element points to the same bucket.
• Gain: no performance degradation due to collisions.
• At the cost of: 2 block accesses per record (directory + data), space for the directory, and bucket reorganization.

Summary
• Search trees, B+-trees
• Index on multiple keys
• Hashing

Extendible Hashing - example
  h(K)   h(K) in binary
  1      00001
  4      00100
  7      00111
  12     01100
  14     01110
  15     01111
  16     10000
  19     10011
  20     10100
  24     11000
  28     11100
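The directory doubling and bucket splitting shown in the extendible hashing figures can be traced with a small simulation. This is a sketch under assumptions read off the figures rather than stated in the slides: the hash value is the key itself, the directory is indexed by the d least significant bits, and a bucket holds at most 4 entries. The class and method names are hypothetical.

```python
class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth      # d' in the figures
        self.keys = []


class ExtendibleHash:
    def __init__(self, capacity=4):
        self.capacity = capacity            # assumed bucket size
        self.global_depth = 0               # d in the figures
        self.directory = [Bucket(0)]

    def _slot(self, key):
        # Directory entry = the d least significant bits of h(K) = K.
        return key & ((1 << self.global_depth) - 1)

    def insert(self, key):
        while True:
            bucket = self.directory[self._slot(key)]
            if len(bucket.keys) < self.capacity:
                bucket.keys.append(key)
                return
            self._split(bucket)             # make room, then retry

    def _split(self, bucket):
        if bucket.local_depth == self.global_depth:
            # Extend: double the directory; each new entry mirrors an old one.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bit = 1 << bucket.local_depth       # the bit that now tells the two halves apart
        bucket.local_depth += 1
        image = Bucket(bucket.local_depth)
        old_keys = bucket.keys
        bucket.keys = [k for k in old_keys if not k & bit]
        image.keys = [k for k in old_keys if k & bit]
        for i, b in enumerate(self.directory):
            if b is bucket and i & bit:
                self.directory[i] = image


table = ExtendibleHash()
for key in [4, 24, 12, 16, 1, 5, 15, 7, 19, 14, 28]:
    table.insert(key)
print(table.global_depth, [sorted(b.keys) for b in table.directory])
```

Inserting 14 splits a bucket whose local depth is below the global depth, so the directory keeps its size; inserting 28 hits a bucket whose local depth equals the global depth, so the directory doubles first.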