CS232A: Database System Principles INDEXING 1 Indexing Given condition on attribute find qualified records Qualified records value Attr = value ? value value Condition may also be • Attr>value • Attr>=value 2 Indexing • Data Stuctures used for quickly locating tuples that meet a specific type of condition – Equality condition: find Movie tuples where Director=X – Other conditions possible, eg, range conditions: find Employee tuples where Salary>40 AND Salary<50 • Many types of indexes. Evaluate them on – – – – Access time Insertion time Deletion time Disk Space needed (esp. as it effects access time) Topics • Conventional indexes • B-trees • Hashing schemes 4 Terms and Distinctions • Primary index – the index on the attribute (a.k.a. search key) that determines the sequencing of the table • Secondary index – index on any other attribute • Dense index – every value of the indexed attribute appears in the index • Sparse index – many values do not appear A Dense Primary Index 10 20 30 40 50 70 80 90 100 120 140 150 10 20 30 40 50 70 80 90 100 120 Sequential File Dense and Sparse Primary Indexes Dense Primary Index 10 20 30 40 50 70 80 90 100 120 140 150 10 20 30 40 50 70 80 90 100 120 + can tell if a value exists without accessing file (consider projection) + better access to overflow records Sparse Primary Index 10 30 50 80 100 140 160 200 10 20 30 40 50 70 80 90 100 120 Find the index record with largest value that is less or equal to the value we are looking. + less index space more + and - in a while Sparse vs. Dense Tradeoff • Sparse: Less index space per record can keep more of index in memory • Dense: Can tell if any record exists without accessing file (Later: – sparse better for insertions – dense needed for secondary indexes) 7 Multi-Level Indexes • Treat the index as a file and build an index on it • “Two levels are usually sufficient. More than three levels are rare.” • Q: Can we build a dense second level index for a dense index ? 10 100 250 400 600 750 920 1000 10 30 50 80 100 140 160 200 250 270 300 350 400 460 500 550 10 20 30 40 50 70 80 90 100 120 A Note on Pointers • Record pointers consist of block pointer and position of record in the block • Using the block pointer only saves space at no extra disk accesses cost Representation of Duplicate Values in Primary Indexes • Index may point to first instance of each value only 10 40 70 100 10 10 10 40 40 70 70 70 100 120 Deletion from Dense Index Delete 40, 80 • Deletion from dense primary index file with no duplicate values is handled in the same way with deletion from a sequential file • Q: What about deletion from dense primary index with duplicates Header Header 10 20 30 50 70 90 10 20 30 50 70 90 100 120 Lists of available entries Deletion from Sparse Index Delete 40 • if the deleted entry does not appear in the index do nothing 10 30 50 80 100 140 160 200 Header 10 20 30 50 70 80 90 100 120 Deletion from Sparse Index (cont’d) Delete 30 • if the deleted entry does not appear in the index do nothing • if the deleted entry appears in the index replace it with the next search-key value – comment: we could leave the deleted value in the index assuming that no part of the system may assume it still exists without checking the block 10 40 50 80 100 140 160 200 Header 10 20 40 50 70 80 90 100 120 Deletion from Sparse Index (cont’d) Delete 40, then 30 • if the deleted entry does not appear in the index do nothing • if the deleted entry appears in the index replace it with the next search-key value • unless the next search key value has its own index entry. In this case delete the entry Header 10 50 80 100 140 160 200 Header 10 20 50 70 80 90 100 120 Insertion in Sparse Index Insert 35 • if no new block is created then do nothing 10 30 50 80 100 140 160 200 Header 10 20 30 35 50 70 80 90 100 120 Insertion in Sparse Index Insert 15 • if no new block is created then do nothing • else create overflow record – Reorganize periodically – Could we claim space of next block? – How often do we reorganize and how much expensive it is? – B-trees offer convincing answers 10 30 50 80 100 140 160 200 Header 10 20 30 50 70 80 90 100 120 16 Secondary indexes File not sorted on secondary search key Sequence field 30 50 20 70 80 40 100 10 90 60 17 Secondary indexes Sequence field • Sparse index 30 20 80 100 90 ... 30 50 20 70 80 40 100 10 does not make sense! 90 60 18 Secondary indexes Sequence field • Dense index 10 50 90 ... sparse high level 10 20 30 40 30 50 50 60 70 ... 80 40 First level has to be dense, next levels are sparse (as usual) 20 70 100 10 90 60 19 Duplicate values & secondary indexes 20 10 20 40 10 40 10 40 30 40 20 Duplicate values & secondary indexes one option... Problem: excess overhead! • disk space • search time 10 10 10 20 20 30 40 40 40 40 ... 20 10 20 40 10 40 10 40 30 40 21 Duplicate values & secondary indexes another option: lists of pointers Problem: variable size records in index! 10 20 10 20 20 40 30 40 10 40 10 40 30 40 22 Duplicate values & secondary indexes 10 20 30 40 50 60 ... Yet another idea : Chain records with same key? 20 10 20 40 10 40 10 40 30 40 Problems: • Need to add fields to records, messes up maintenance • Need to follow chain to know records 23 Duplicate values & secondary indexes 20 10 10 20 30 40 20 40 10 40 50 60 ... 10 40 30 40 buckets 24 Why “bucket” idea is useful • Enables the processing of queries working with pointers only. • Very common technique in Information Retrieval Indexes Name: primary Dept: secondary Year: secondary Records EMP (name,dept,year,...) 25 Advantage of Buckets: Process Queries Using Pointers Only Find employees of the Toys dept with 4 years in the company SELECT Name FROM Employee WHERE Dept=“Toys” AND Year=4 Year Index Dept Index Toys PCs Pens Suits Aaron Helen Suits Pens 4 3 Jack Jim PCs Toys 4 4 Joe Nick Toys PCs 3 2 Walt Yannis Toys Pens 5 1 Intersect toy bucket and 2nd Floor bucket to get set of matching EMP’s 1 2 3 4 This idea used in text information retrieval cat dog Buckets known as Documents ...the cat is fat ... ...my cat and my dog like each other... ...Fido the dog ... Inverted lists 27 Information Retrieval (IR) Queries • Find articles with “cat” and “dog” – Intersect inverted lists • Find articles with “cat” or “dog” – Union inverted lists • Find articles with “cat” and not “dog” – Subtract list of dog pointers from list of cat pointers • Find articles with “cat” in title • Find articles with “cat” and “dog” within 5 words 28 Common technique: more info in inverted list cat Title 5 Author 10 Abstract 57 dog d1 Title Title d3 d2 100 12 29 Posting: an entry in inverted list. Represents occurrence of term in article Size of a list: 1 Rare words or mis-spellings 106 Common words (in postings) Size of a posting: 10-15 bits (compressed) 30 Vector space model w1 w2 w3 w4 w5 w6 w7 … DOC = <1 0 0 1 1 0 0 …> Query= <0 PRODUCT = 0 1 1 0 0 0 …> 1 + ……. = score 31 • Tricks to weigh scores + normalize e.g.: Match on common word not as useful as match on rare words... 32 Summary of Indexing So Far • Basic topics in conventional indexes – multiple levels – sparse/dense – duplicate keys and buckets – deletion/insertion similar to sequential files • Advantages – simple algorithms – index is sequential file • Disadvantages – eventually sequentiality is lost because of overflows, reorganizations are needed Example continuous Index 10 20 30 33 40 50 60 free space 70 80 90 (sequential) 39 31 35 36 32 38 34 overflow area (not sequential) 34 Outline: • Conventional indexes • B-Trees NEXT • Hashing schemes 35 • NEXT: Another type of index – Give up on sequentiality of index – Try to get “balance” 36 180 200 150 156 179 120 130 100 101 110 30 35 3 5 11 120 150 180 30 100 B+Tree Example n=3 Root 37 95 81 57 Sample non-leaf to keys to keys to keys < 57 57 k<81 81k<95 to keys 95 38 57 81 95 To record with key 57 To record with key 81 To record with key 85 Sample leaf node: From non-leaf node to next leaf in sequence 39 In textbook’s notation n=3 30 35 Leaf: 30 35 Non-leaf: 30 30 40 Size of nodes: n+1 pointers (fixed) n keys 41 Non-root nodes have to be at least half-full • Use at least Non-leaf: (n+1)/2 pointers Leaf: (n+1)/2 pointers to data 42 n=3 Non-leaf 120 150 180 30 Leaf 30 35 min. node 3 5 11 Full node 43 B+tree rules tree of order n (1) All leaves at same lowest level (balanced tree) (2) Pointers in leaves point to records except for “sequence pointer” 44 (3) Number of pointers/keys for B+tree Max Max Min ptrs keys ptrsdata Non-leaf (non-root) Leaf (non-root) Root Min keys n+1 n (n+1)/2 (n+1)/2- 1 n+1 n (n+1)/2 (n+1)/2 n+1 n 1 1 45 Insert into B+tree (a) simple case – space available in leaf (b) leaf overflow (c) non-leaf overflow (d) new root 46 n=3 30 31 32 3 5 11 30 100 (a) Insert key = 32 47 (a) Insert key = 7 30 30 31 3 57 11 3 5 7 100 n=3 48 180 200 160 179 150 156 179 180 120 150 180 160 100 (c) Insert key = 160 n=3 49 (d) New root, insert 45 40 45 40 30 32 40 20 25 10 12 1 2 3 10 20 30 30 new root n=3 50 Deletion from B+tree (a) Simple case - no example (b) Coalesce with neighbor (sibling) (c) Re-distribute keys (d) Cases (b) or (c) at non-leaf 51 (b) Coalesce with sibling n=4 40 50 10 20 30 40 10 40 100 – Delete 50 52 (c) Redistribute keys n=4 35 40 50 10 20 30 35 10 40 35 100 – Delete 50 53 (d) Non-leaf coalese n=4 25 – Delete 37 40 45 30 37 30 40 25 26 30 20 22 10 14 1 3 10 20 25 40 new root 54 B+tree deletions in practice – Often, coalescing is not implemented – Too hard and not worth it! 55 Comparison: B-trees vs. static indexed sequential file Ref #1: Held & Stonebraker “B-Trees Re-examined” CACM, Feb. 1978 56 Ref # 1 claims: - Concurrency control harder in B-Trees - B-tree consumes more space For their comparison: block = 512 bytes key = pointer = 4 bytes 4 data records per block 57 Example: 1 block static index k1 k1 127 keys 1 data block k2 k2 k3 (127+1)4 = 512 Bytes -> pointers in index implicit! k3 up to 127 contigous blocks 58 Example: 1 block B-tree k1 k1 1 data block k2 63 keys k2 ... k63 k3 next 63x(4+4)+8 = 512 Bytes -> pointers needed in B-tree blocks because index and data are not contiguous up to 63 blocks 59 Size comparison Static Index # data blocks height 2 -> 127 128 -> 16,129 16,130 -> 2,048,383 2 3 4 Ref. #1 B-tree # data blocks height 2 -> 63 64 -> 3968 3969 -> 250,047 250,048 -> 15,752,961 2 3 4 5 60 Ref. #1 analysis claims • For an 8,000 block file, after 32,000 inserts after 16,000 lookups Static index saves enough accesses to allow for reorganization Ref. #1 conclusion Static index better!! 61 Ref #2: M. Stonebraker, “Retrospective on a database system,” TODS, June 1980 Ref. #2 conclusion B-trees better!! 62 Ref. #2 conclusion B-trees better!! • DBA does not know when to reorganize – Self-administration is important target • DBA does not know how full to load pages of new index 63 Ref. #2 conclusion B-trees better!! • Buffering – B-tree: has fixed buffer requirements – Static index: large & variable size buffers needed due to overflow 64 • Speaking of buffering… Is LRU a good policy for B+tree buffers? Of course not! Should try to keep root in memory at all times (and perhaps some nodes from second level) 65 Interesting problem: For B+tree, how large should n be? … n is number of keys / node 66 Assumptions • You have the right to set the disk page size for the disk where a B-tree will reside. • Compute the optimum page size n assuming that – The items are 4 bytes long and the pointers are also 4 bytes long. – Time to read a node from disk is 12+.003n – Time to process a block in memory is unimportant – B+tree is full (I.e., every page has the maximum number of items and pointers Can get: f(n) = time to find a record f(n) nopt n 68 FIND nopt by f’(n) = 0 Answer should be nopt = “few hundred” What happens to nopt as • Disk gets faster? • CPU get faster? 69 Variation on B+tree: B-tree (no +) • Idea: – Avoid duplicate keys – Have record pointers in non-leaf nodes 70 K1 P1 to keys < K1 K2 P2 K3 P3 to record to record to record with K1 with K2 with K3 to keys to keys K1<x<K2 K2<x<k3 to keys >k3 71 170 180 150 160 145 165 110 120 90 100 70 80 50 60 30 40 10 20 25 45 85 105 (but keep space for simplicity) 130 140 • sequence pointers not useful now! n=2 65 125 B-tree example 72 So, for B-trees: MAX Non-leaf non-root Leaf non-root Root non-leaf Root Leaf MIN Tree Ptrs Rec Keys Ptrs Tree Ptrs Rec Ptrs Keys n+1 n n 1 n n 1 n+1 n n 2 1 1 1 n n 1 1 1 (n+1)/2 (n+1)/2-1 (n+1)/2-1 (n+1)/2 (n+1)/2 73 Tradeoffs: B-trees have marginally faster average lookup than B+trees (assuming the height does not change) in B-tree, non-leaf & leaf different sizes Smaller fan-out in B-tree, deletion more complicated B+trees preferred! 74 Example: - Pointers - Keys - Blocks - Look at full 4 bytes 4 bytes 100 bytes (just example) 2 level tree 75 B-tree: Root has 8 keys + 8 record pointers + 9 son pointers = 8x4 + 8x4 + 9x4 = 100 bytes Each of 9 sons: 12 rec. pointers (+12 keys) = 12x(4+4) + 4 = 100 bytes 2-level B-tree, Max # records = 12x9 + 8 = 116 76 B+tree: Root has 12 keys + 13 son pointers = 12x4 + 13x4 = 100 bytes Each of 13 sons: 12 rec. ptrs (+12 keys) = 12x(4 +4) + 4 = 100 bytes 2-level B+tree, Max # records = 13x12 = 156 77 So... 8 records B+ B ooooooooooooo ooooooooo 156 records 108 records Total = 116 • Conclusion: – For fixed block size, – B+ tree is better because it is bushier 78 Outline/summary • Conventional Indexes • Sparse vs. dense • Primary vs. secondary • B trees • B+trees vs. B-trees • B+trees vs. indexed sequential • Hashing schemes --> Next 79 Hashing Records • hash function h(key) returns address of bucket or record • for secondary index buckets are required • if the keys for a specific hash value do not fit into one page the bucket is a linked list of pages key h(key) Buckets key h(key) key Records Example hash function • Key = ‘x1 x2 … xn’ n byte character string • Have b buckets • h: add x1 + x2 + ….. xn – compute sum modulo b 81 This may not be best function … Read Knuth Vol. 3 if you really need to select a good function. Good hash function: Expected number of keys/bucket is the same for all buckets 82 Within a bucket: • Do we keep keys sorted? • Yes, if CPU time critical & Inserts/Deletes not too frequent 83 Next: example to illustrate inserts, overflows, deletes h(K) 84 EXAMPLE 2 records/bucket INSERT: h(a) = 1 h(b) = 2 h(c) = 1 h(d) = 0 0 d 1 a c 2 b e 3 h(e) = 1 85 EXAMPLE: deletion Delete: e f c 0 a 1 b c d e 2 3 f g d maybe move “g” up 86 Rule of thumb: • Try to keep space utilization between 50% and 80% Utilization = # keys used total # keys that fit • If < 50%, wasting space • If > 80%, overflows significant depends on how good hash function is & on # keys/bucket 87 How do we cope with growth? • Overflows and reorganizations • Dynamic hashing • Extensible • Linear 88 Extensible hashing: two ideas (a) Use i of b bits output by hash function b h(K) 00110101 use i grows over time…. 89 (b) Use directory h(K)[0-i ] . . . to bucket . . . 90 Example: h(k) is 4 bits; 2 keys/bucket i= 1 1 0001 00 01 1 2 1001 1010 1100 Insert 1010 i=2 1 2 1100 10 11 New directory 91 Example continued i= 2 00 01 10 11 Insert: 0111 2 0000 0001 1 2 0001 0111 0111 2 1001 1010 2 1100 0000 92 Example continued 0000 2 i= 2 0001 00 0111 2 01 10 11 Insert: 1001 1001 3 1001 1010 1001 2 3 1010 1100 2 i=3 000 001 010 011 100 101 110 111 93 Extensible hashing: deletion • No merging of blocks • Merge blocks and cut directory if possible (Reverse insert procedure) 94 Deletion example: • Run thru insert example in reverse! 95 Summary Extensible hashing + Can handle growing files - with less wasted space - with no full reorganizations - Indirection (Not bad if directory in memory) - Directory doubles in size (Now it fits, now it does not) 96 Linear hashing • Another dynamic hashing scheme Two ideas: (a) Use i low order bits of hash b 01110101 grows i (b) File grows linearly 97 Example b=4 bits, 0101 0000 1010 00 i =2, 2 keys/bucket • insert 0101 • can have overflow chains! 0101 1111 Future growth buckets 01 10 11 m = 01 (max used block) Rule If h(k)[i ] m, then look at bucket h(k)[i ] else, look at bucket h(k)[i ] - 2i -1 98 Example b=4 bits, 0101 0000 1010 00 0101 0101 1111 i =2, 2 keys/bucket • insert 0101 1010 1111 Future growth buckets 01 10 11 m = 01 (max used block) 10 11 99 Example Continued: How to grow beyond this? i=2 3 0000 0 00 100 0101 0101 0 01 101 1010 010 110 0101 0101 1111 0 11 111 100 101 ... m = 11 (max used block) 100 101 100 When do we expand file? • Keep track of: # used slots total # of slots =U • If U > threshold then increase m (and maybe i ) 101 Summary Linear Hashing + Can handle growing files - with less wasted space - with no full reorganizations + No indirection like extensible hashing - Can still have overflow chains 102 Example: BAD CASE Very full Very empty Need to move m here… Would waste space... 103 Summary Hashing - How it works - Dynamic hashing - Extensible - Linear 104 Next: • Indexing vs Hashing • Index definition in SQL • Multiple key access 105 Indexing vs Hashing • Hashing good for probes given key e.g., SELECT … FROM R WHERE R.A = 5 106 Indexing vs Hashing • INDEXING (Including B Trees) good for Range Searches: e.g., SELECT FROM R WHERE R.A > 5 107 Index definition in SQL • Create index name on rel (attr) • Create unique index name on rel (attr) defines candidate key • Drop INDEX name 108 Note CANNOT SPECIFY TYPE OF INDEX (e.g. B-tree, Hashing, …) OR PARAMETERS (e.g. Load Factor, Size of Hash,...) ... at least in SQL... 109 Note ATTRIBUTE LIST MULTIKEY INDEX (next) e.g., CREATE INDEX foo ON R(A,B,C) 110 Multi-key Index Motivation: Find records where DEPT = “Toy” AND SAL > 50k 111 Strategy I: • Use one index, say Dept. • Get all Dept = “Toy” records and check their salary I1 112 Strategy II: • Use 2 Indexes; Manipulate Pointers Toy Sal > 50k 113 Strategy III: • Multiple Key Index I2 One idea: I1 I3 114 Example Art Sales Toy Dept Index 10k 15k 17k 21k 12k 15k 15k 19k Example Record Name=Joe DEPT=Sales SAL=15k Salary Index 115 For which queries is this index good? Find Find Find Find RECs RECs RECs RECs Dept = “Sales” Dept = “Sales” Dept = “Sales” SAL = 20k SAL=20k SAL > 20k 116 Interesting application: • Geographic Data y DATA: ... x <X1,Y1, Attributes> <X2,Y2, Attributes> 117 Queries: • What city is at <Xi,Yi>? • What is within 5 miles from <Xi,Yi>? • Which is closest point to <Xi,Yi>? 118 Example 40 10 25 20 20 10 e h 30 20 15 35 i n o j k m 10 5 h i f 15 15 j k g l m d e c b f l a d c g 20 a b • Search points near f • Search points near b n o 119 Queries • • • • Find Find Find Find points points points points with Yi > 20 with Xi < 5 “close” to i = <12,38> “close” to b = <7,24> 120 • Many types of geographic index structures have been suggested • Quad Trees • R Trees 121 Two more types of multi key indexes • Grid • Partitioned hash 122 Grid Index Key 2 V1 V2 X1 X2 …… Xn Key 1 Vn To records with key1=V3, key2=X2 123 CLAIM • Can quickly find records with – key 1 = Vi Key 2 = Xj – key 1 = Vi – key 2 = Xj • And also ranges…. – E.g., key 1 Vi key 2 < Xj 124 But there is a catch with Grid Indexes! V2 V3 X1 X2 X3 X4 X1 X2 X3 X4 Like Array... V1 X1 X2 X3 X4 • How is Grid Index stored on disk? Problem: • Need regularity so we can compute position of <Vi,Xj> entry 125 Solution: Use Indirection V1 V2 V3 V4 X1 X2 X3 Buckets ---- ------------- Buckets *Grid only contains pointers to buckets 126 With indirection: • Grid can be regular without wasting space • We do have price of indirection 127 Can also index grid on value ranges Salary Grid 1 2 50K- 3 8 0-20K 20K-50K 1 Linear Scale 2 3 Toy Sales Personnel 128 Grid files + - Good for multiple-key search Space, management overhead (nothing is free) - Need partitioning ranges that evenly split keys 129 Partitioned hash function Idea: Key1 010110 1110010 h1 h2 Key2 130 EX: h1(toy) h1(sales) h1(art) . . h2(10k) h2(20k) h2(30k) h2(40k) . . Insert =0 =1 =1 =01 =11 =01 =00 000 001 010 011 100 101 110 111 <Fred> <Joe><Sally> <Fred,toy,10k>,<Joe,sales,10k> <Sally,art,30k> 131 h1(toy) =0 000 <Fred> h1(sales) =1 001 <Joe><Jan> h1(art) =1 010 <Mary> . 011 . <Sally> h2(10k) =01 100 h2(20k) =11 101 <Tom><Bill> h2(30k) =01 110 <Andy> h2(40k) =00 111 . . • Find Emp. with Dept. = Sales Sal=40k 132 h1(toy) =0 000 h1(sales) =1 001 h1(art) =1 010 . 011 . h2(10k) =01 100 h2(20k) =11 101 h2(30k) =01 110 h2(40k) =00 111 . . • Find Emp. with Sal=30k <Fred> <Joe><Jan> <Mary> <Sally> <Tom><Bill> <Andy> look here 133 h1(toy) =0 000 <Fred> h1(sales) =1 001 <Joe><Jan> h1(art) =1 010 <Mary> . 011 . <Sally> h2(10k) =01 100 h2(20k) =11 101 <Tom><Bill> h2(30k) =01 110 <Andy> h2(40k) =00 111 . . • Find Emp. with Dept. = Sales look here 134 Summary Post hashing discussion: - Indexing vs. Hashing - SQL Index Definition - Multiple Key Access - Multi Key Index Variations: Grid, Geo Data - Partitioned Hash 135 Reading Chapter 5 • Skim the following sections: – 5.3.6, 5.3.7, 5.3.8 – 5.4.2, 5.4.3, 5.4.4 • Read the rest 136 The BIG picture…. • Chapters 2 & 3: Storage, records, blocks... • Chapter 4 & 5: Access Mechanisms - Indexes - B trees - Hashing - Multi key • Chapter 6 & 7: Query Processing NEXT 137