Goal of Indexing Given condition(s) on attribute(s) find qualified records CSE 562 Database Systems Qualified records Indexing Attr = value Some slides are based or modified from originals by Database Systems: The Complete Book, Pearson Prentice Hall 2nd Edition ©2008 Garcia-Molina, Ullman, and Widom UB CSE UB CSE562 562 ? value value value Condition may also be • Attr>value • Attr>=value 1 Indexes (or Indices) 2 UB CSE 562 Topics • Data Structures used for quickly locating tuples that meet a specific type of condition • Conventional indexes • B-Trees • Hashing schemes – Equality condition: Find Movie tuples where Director=X – Range conditions: Find Employee tuples where Salary>40 AND Salary<50 • Many types of indexes. Evaluate them on: – Access time – Insertion/Deletion time – Condition types – Disk Space needed UB CSE 562 3 UB CSE 562 4 1 Sequential File 10 20 Tuples are sorted by their primary key Dense Index Index file needs much fewer blocks than the data file, hence easier to fit in memory Block 30 40 50 60 70 80 For a given key K, only log2n, out of n, index blocks need to be accessed 90 100 5 UB CSE 562 Sparse Index Typically, only one key per data block Find the index record with largest value that is less or equal to the value we are looking UB CSE 562 10 30 50 70 90 110 130 150 170 190 210 230 30 40 50 60 70 80 50 60 90 100 110 120 70 80 90 100 6 Sparse 2nd level 10 20 10 90 170 250 30 40 50 60 330 410 490 570 70 80 • Two levels are usually sufficient • More than three levels are rare 7 10 20 10 20 30 40 UB CSE 562 Sequential File 90 100 Sequential File UB CSE 562 Treat the index as a file and build an index on it 10 30 50 70 90 110 130 150 170 190 210 230 Sequential File 10 20 30 40 50 60 70 80 90 100 8 2 • Comment: {FILE,INDEX} may be contiguous or not (blocks chained) Question: • Can we build a dense, 2nd level index for a dense index? 9 UB CSE 562 Notes on Pointers UB CSE 562 10 Notes on Pointers • Record pointers consist of block pointer and position of record in the block • Using the block pointer only saves space at no extra disk accesses cost • Block pointer (sparse index) can be smaller than record pointer • If file is contiguous, then we can omit pointers (i.e., compute them) BP RP UB CSE 562 11 UB CSE 562 12 3 Sparse vs. Dense Tradeoff R1 K1 R2 K2 K3 R3 • Sparse: Less index space per record can keep more of index in memory • Dense: Can tell if any record exists without accessing file say: 1024 B per block K4 R4 (Later: – sparse better for insertions – dense needed for secondary indexes) • if we want K3 block: get it at offset (3-1)1024 = 2048 bytes 13 UB CSE 562 Terms 14 UB CSE 562 Next • Index sequential file • Search key ( ≠ primary key) • Primary index (on Sequencing field) • Duplicate keys – The index on the attribute (a.k.a. search key) that determines the sequencing of the table • Deletion/Insertion • Secondary index – Index on any other attribute • Dense index (all Search Key values in) • Sparse index • Multi-level index UB CSE 562 • Secondary indexes 15 UB CSE 562 16 4 Duplicate Keys Duplicate Keys Dense index, one way to implement? 10 10 10 10 10 20 10 20 20 30 20 30 30 30 30 30 40 45 Duplicate Keys 30 30 18 UB CSE 562 Sparse index, one way? careful if looking for 20 or 30! 10 10 10 20 20 30 30 30 40 45 UB CSE 562 20 30 Duplicate Keys Dense index, better way? 10 20 30 40 10 20 40 45 17 UB CSE 562 10 10 10 10 20 30 10 10 10 20 20 30 30 30 40 45 19 UB CSE 562 20 5 Duplicate Keys Summary Sparse index, another way? – place first new key from block should this be 40? 10 20 30 30 Duplicate values, primary index • Index may point to first instance of each value only File Index a 10 10 10 20 20 30 a 30 30 . . b 40 45 21 UB CSE 562 a Deletion from sparse index 22 UB CSE 562 Deletion from sparse index – delete record 40 10 30 50 70 90 110 130 150 UB CSE 562 10 20 10 30 50 70 30 40 50 60 90 If the deleted 110 entry does not130 appear in the 150 index do nothing 70 80 23 UB CSE 562 10 20 30 40 50 60 70 80 24 6 Deletion from sparse index Deletion from sparse index – delete record 30 40 10 30 50 70 90 If the deleted 110 entry appears 130 in the index replace 150 it with the next search-key value – delete records 30 & 40 10 20 30 40 40 50 60 30 40 50 60 90 If the next search 110 key value has 130 its own index entry, 150 then delete the entry 70 80 25 UB CSE 562 10 20 10 50 30 70 50 70 Deletion from dense index 70 80 26 UB CSE 562 Deletion from dense index – delete record 30 10 20 30 40 50 60 70 80 UB CSE 562 10 20 30 40 40 50 60 10 20 30 40 50 Deletion from 60 dense primary index 70 file is handled in the80 same way with deletion from a sequential file 70 80 27 UB CSE 562 10 20 30 40 40 50 60 70 80 Q: What about deletion from dense primary index with duplicates? 28 7 Insertion, sparse index case Insertion, sparse index case – insert record 34 10 30 40 60 10 20 10 20 10 30 40 60 30 30 34 40 50 40 50 60 • our lucky day! we have free space where we need it! 29 UB CSE 562 Insertion, sparse index case 20 30 UB CSE 562 Insertion, sparse index case – insert record 15 10 30 40 60 – insert record 25 10 20 15 10 30 40 60 30 20 30 10 20 30 40 50 40 50 • Illustrated: Immediate reorganization 60 • Variation: 60 – insert new block (chained file) – update index UB CSE 562 60 25 overflow blocks (reorganize later...) • How often do we reorganize and how expensive is it? B-Trees offer convincing answers 31 UB CSE 562 32 8 Secondary indexes Insertion, dense index case Sequence field 30 50 • Similar File not sorted on secondary search key • Often more expensive . . . 20 70 80 40 100 10 90 60 33 UB CSE 562 Secondary indexes 30 20 80 100 90 ... Secondary indexes Sequence field • Sparse index UB CSE 562 Sequence field • Dense index 30 50 10 50 90 ... 20 70 80 40 sparse high level 100 10 does not make sense! 34 UB CSE 562 90 60 35 UB CSE 562 10 20 30 40 30 50 50 60 70 ... 80 40 20 70 100 10 90 60 36 9 Duplicate values & secondary indexes With secondary indexes: 20 10 • Lowest level is dense • Other levels are sparse 20 40 10 40 Also: Pointers are record pointers 10 40 (not block pointers; not computed) 30 40 37 UB CSE 562 Duplicate values & secondary indexes one option... Problem: excess overhead! • disk space • search time UB CSE 562 10 10 10 20 20 30 40 40 40 40 ... 38 UB CSE 562 Duplicate values & secondary indexes another option... 20 10 20 40 Problem: variable size records in index! 10 40 10 40 30 40 10 20 10 20 20 40 30 40 10 40 10 40 30 40 39 UB CSE 562 40 10 Duplicate values & secondary indexes 10 20 30 40 Another idea: Chain records with same 50 key? 60 ... 20 10 Duplicate values & secondary indexes λ 20 40 10 40 10 40 30 40 20 40 10 40 50 60 ... λ 10 40 λ 30 40 λ Problems: • Need to add fields to records, messes up maintenance • Need to follow chain to know records buckets 41 UB CSE 562 42 UB CSE 562 Advantage of Buckets: Process Queries Using Pointers Only Why “bucket” idea is useful Indexes Name: primary Dept: secondary Floor: secondary 20 10 10 20 30 40 Find employees in Toys dept on the 4th floor: SELECT Name FROM Employee WHERE Dept=“Toys” AND Floor=4 Records EMP (name, dept, floor, ...) Dept Index Floor Index • Enables the processing of queries working with pointers only • Very common technique in Information Retrieval Intersect Toys bucket and 4th floor bucket to get set of matching EMP’s UB CSE 562 43 UB CSE 562 44 11 This idea used in text information retrieval IR QUERIES • Find articles with “cat” and “dog” Documents cat – Intersect inverted lists • Find articles with “cat” or “dog” ...the cat is fat ... dog – Union inverted lists • Find articles with “cat” and not “dog” ...was raining cats and dogs... – Subtract list of dog pointers from list of cat pointers • Find articles with “cat” in title • Find articles with “cat” and “dog” within 5 words ...Fido the dog ... Buckets known as Inverted lists 45 UB CSE 562 Posting: an entry in inverted list. Represents occurrence of term in article Common technique: more info in inverted list cat n n itio atio e s p c o ty p lo Abstract 57 dog UB CSE 562 Title 100 Title 12 Size of a list: d1 Title 5 Author 10 d3 46 UB CSE 562 1 Rare words or miss-spellings 106 Common words (in postings) d2 Size of a posting: 10-15 bits 47 UB CSE 562 (compressed) 48 12 IR DISCUSSION Vector space model • Stop words • Truncation • Thesaurus • Full text vs. Abstracts • Vector model w1 w2 w3 w4 w5 w6 w7 … DOC = <1 0 0 1 1 0 0 …> UB CSE 562 Query= <0 0 1 PRODUCT = 49 1 0 0 0 …> 1 + ……. = score 50 UB CSE 562 Summary of Indexing So Far • Tricks to weigh scores + normalize • Conventional index – Basic Ideas: sparse, dense, multi-level… – Duplicate Keys – Deletion/Insertion – Secondary indexes e.g.: Match on common word not as useful as match on rare words... • Buckets of Postings List UB CSE 562 51 UB CSE 562 52 13 Conventional Indexes Example Advantage: - Simple algorithms - Index is sequential file good for scans continuous Disadvantage: - Inserts expensive, and/or - Lose sequentiality, reorganizations are needed (sequential) 10 20 30 33 40 50 60 39 31 35 36 32 38 34 free space 70 80 90 53 UB CSE 562 Index overflow area (not sequential) 54 UB CSE 562 Topics • NEXT: Another type of index • Conventional indexes • B-Trees • Hashing schemes UB CSE 562 – Give up on sequentiality of index – Try to get “balance” ⇒ NEXT 55 UB CSE 562 56 14 B+Tree Example n=3 Sample non-leaf 180 200 150 156 179 120 130 100 101 110 30 35 3 5 11 30 95 81 120 150 180 57 100 Root 57 UB CSE 562 to keys to keys < 57 57≤ k<81 81≤k<95 From non-leaf node 95 ≥95 58 In textbook’s notation 81 to keys UB CSE 562 Sample leaf node: 57 to keys n=3 Leaf: 30 35 to next leaf in sequence 30 35 UB CSE 562 To record with key 85 30 30 To record with key 81 To record with key 57 Non-leaf: 59 UB CSE 562 60 15 Don’t want nodes to be too empty Size of nodes: n+1 pointers (fixed) n keys • Non-root nodes have to be at least half-full • Use at least 61 UB CSE 562 ⎡(n+1)/2⎤ pointers Leaf: ⎣(n+1)/2⎦ pointers to data 62 UB CSE 562 n=3 B+Tree rules tree of order n 30 Leaf 30 35 (1) All leaves at same lowest level (balanced tree) (2) Pointers in leaves point to records except for “sequence pointer” counts even if null Non-leaf 120 150 180 min. node 3 5 11 Full node UB CSE 562 Non-leaf: 63 UB CSE 562 64 16 Insert into B+Tree (3) Number of pointers/keys for B+Tree Max Max Min ptrs keys ptrs→data Non-leaf (non-root) Leaf (non-root) Root (a) simple case Min keys – space available in leaf n+1 n ⎡(n+1)/2⎤ ⎡(n+1)/2⎤- 1 n+1 n ⎣(n+1)/2⎦ ⎣(n+1)/2⎦ n+1 n 1 1 (b) leaf overflow (c) non-leaf overflow (d) new root Counting sequence pointer also 65 (a) Insert key = 7 n=3 100 n=3 UB CSE 562 UB CSE 562 30 7 67 3 57 11 3 5 30 31 32 3 5 11 30 100 (a) Insert key = 32 66 UB CSE 562 30 31 UB CSE 562 68 17 (d) New root, insert 45 70 UB CSE 562 (b) Coalesce with sibling Deletion from B+Tree n=4 – Delete 50 71 UB CSE 562 40 50 10 20 30 40 10 40 100 (a) Simple case - no example (b) Coalesce with neighbor (sibling) (c) Re-distribute keys (d) Cases (b) or (c) at non-leaf UB CSE 562 40 45 30 32 40 20 25 10 20 30 10 12 1 2 3 180 200 160 179 180 120 150 180 150 156 179 69 UB CSE 562 30 new root n=3 40 n=3 160 100 (c) Insert key = 160 72 18 (d) Non-leaf coalese n=4 25 40 45 30 37 25 26 30 20 22 10 14 1 3 35 40 50 10 20 30 35 10 20 25 40 new root 73 UB CSE 562 74 UB CSE 562 Comparison: B-Trees vs. static indexed sequential file B+Tree deletions in practice – Often, coalescing is not implemented Ref #1: Held & Stonebraker “B-Trees Re-examined” CACM, Feb. 1978 – Too hard and not worth it! UB CSE 562 n=4 – Delete 37 10 40 35 100 – Delete 50 30 40 (c) Redistribute keys 75 UB CSE 562 76 19 Example: 1 block static index Ref # 1 claims: - Concurrency control harder in B-Trees - B-Tree consumes more space k1 k1 For their comparison: block = 512 bytes key = pointer = 4 bytes 4 data records per block 127 keys k2 k2 k3 k3 (127+1)4 = 512 Bytes -> pointers in index implicit! 77 UB CSE 562 1 data block up to 127 blocks 78 UB CSE 562 Example: 1 block B-Tree k1 k1 k2 63 keys Static Index # data blocks height 1 data block k2 ... k63 k3 - 2 -> 127 128 -> 16,129 16,130 -> 2,048,383 next 63x(4+4)+8 = 512 Bytes -> pointers needed in B-Tree blocks because index is not contiguous UB CSE 562 2 3 4 B-Tree # data blocks height 2 -> 63 64 -> 3,968 3,969 -> 250,047 250,048 -> 15,752,961 2 3 4 5 up to 63 blocks 79 UB CSE 562 80 20 Ref. #1 analysis claims Ref #2: M. Stonebraker, “Retrospective on a database system,” TODS, June 1980 • For an 8,000 block file, after 32,000 inserts after 16,000 lookups ⇒ Static index saves enough accesses to allow for reorganization Ref. #1 conclusion 81 82 UB CSE 562 B-Trees better!! Ref. #2 conclusion • DBA does not know when to reorganize • DBA does not know how full to load pages of new index UB CSE 562 B-Trees better!! Static index better!! UB CSE 562 Ref. #2 conclusion Ref. #2 conclusion B-Trees better!! • Buffering – B-Tree: has fixed buffer requirements – Static index: must read several overflow blocks to be efficient (large & variable size buffers needed for this) 83 UB CSE 562 84 21 • Speaking of buffering… Is LRU a good policy for B+Tree buffers? Interesting problem: For B+Tree, how large should n be? → Of course not! → Should try to keep root in memory at all times … (and perhaps some nodes from second level) n is number of keys / node 85 UB CSE 562 86 UB CSE 562 ➸Can get: Sample assumptions: f(n) = time to find a record (1) Time to read node from disk is (S+Tn) msec. (2) Once block in memory, use binary search to locate key: (a + b LOG2 n) msec. f(n) For some constants a,b; Assume a << S (3) Assume B+Tree is full, i.e., # nodes to examine is LOGn N where N = # records UB CSE 562 nopt 87 UB CSE 562 n 88 22 ➸ FIND nopt by f’(n) = 0 Variation on B+Tree: B-Tree (no +) Answer should be nopt = “few hundred” • Idea: – Avoid duplicate keys – Have record pointers in non-leaf nodes ➸ What happens to nopt as • Disk gets faster? • CPU get faster? 89 UB CSE 562 90 UB CSE 562 (but keep space for simplicity) 25 45 UB CSE 562 91 UB CSE 562 170 180 150 160 130 140 110 120 90 100 70 80 to keys >k3 50 60 to record to record to record with K1 with K2 with K3 to keys to keys K1<x<K2 K2<x<k3 85 105 K3 P3 10 20 to keys < K1 K2 P2 30 40 K1 P1 145 165 • sequence pointers not useful now! n=2 65 125 B-Tree example 92 23 Note on inserts So, for B-Trees: • Say we insert record with key = 25 Non-leaf non-root Leaf non-root Root non-leaf Root Leaf 25 30 10 – 20 – • Afterwards: 10 20 30 leaf MAX n=3 93 UB CSE 562 Tradeoffs MIN Tree Ptrs Rec Keys Ptrs Tree Ptrs Rec Ptrs Keys n+1 n n 1 n n 1 n+1 n n 2 1 1 1 n n 1 1 1 ⎡(n+1)/2⎤ ⎡(n+1)/2⎤-1 ⎡(n+1)/2⎤-1 ⎣(n+1)/2⎦ ⎣(n+1)/2⎦ 94 UB CSE 562 But note: • If blocks are fixed size B-Trees have faster lookup than B+Trees (due to disk and buffering restrictions) Then lookup for B+Tree is actually better!! in B-Tree, non-leaf & leaf different sizes in B-Tree, smaller fan-out in B-Tree, deletion more complicated ➨ B+Trees preferred! UB CSE 562 95 UB CSE 562 96 24 Example: - Pointers - Keys - Blocks - Look at full B-Tree: Root has 8 keys + 8 record pointers + 9 son pointers = 8x4 + 8x4 + 9x4 = 100 bytes 4 bytes 4 bytes 100 bytes (just example) 2 level tree Each of 9 sons: 12 rec. pointers (+12 keys) = 12x(4+4) + 4 = 100 bytes 2-level B-Tree, Max # records = 12x9 + 8 = 116 UB CSE 562 97 B+Tree: So... Root has 12 keys + 13 son pointers = 12x4 + 13x4 = 100 bytes 8 records B+ Each of 13 sons: 12 rec. ptrs (+12 keys) = 12x(4 +4) + 4 = 100 bytes 2-level B+Tree, Max # records = 13x12 = 156 UB CSE 562 98 UB CSE 562 B ooooooooooooo ooooooooo 156 records 108 records Total = 116 • Conclusion: – For fixed block size, – B+Tree is better because it is bushier 99 UB CSE 562 100 25 An Interesting Problem... One Solution: Multiple Indexes • What is a good index structure when: • Example: I1, I2 – records tend to be inserted with keys that are larger than existing values? day days indexed I1 1,2,3,4,5 11,2,3,4,5 11,12,3,4,5 11,12,13,4,5 (e.g., banking records with growing data/time) 10 11 12 13 – we want to remove older data days indexed I2 6,7,8,9,10 6,7,8,9,10 6,7,8,9,10 6,7,8,9,10 • advantage: deletions/insertions from smaller index • disadvantage: query multiple indexes 101 UB CSE 562 Another Solution (Wave Indexes) day 10 11 12 13 14 15 16 I1 1,2,3 1,2,3 1,2,3 13 13,14 13,14,15 13,14,15 I2 4,5,6 4,5,6 4,5,6 4,5,6 4,5,6 4,5,6 16 I3 7,8,9 7,8,9 7,8,9 7,8,9 7,8,9 7,8,9 7,8,9 This Time I4 10 10,11 10,11, 10,11, 10,11, 10,11, 10,11, • Conventional Indexes – Chapter 14: 14.1 12 12 12 12 12 • Sparse vs. dense • Primary vs. secondary • B-Trees – Chapter 14: 14.2 • B+Trees vs. B-Trees vs. indexed sequential • advantage: no deletions • disadvantage: approximate windows UB CSE 562 102 UB CSE 562 103 UB CSE 562 104 26