CS 440 Database Management Systems
Lecture 6: Data storage & access methods

Database System Implementation
• User requirements
• Conceptual design: Entity-Relationship (ER) model
• Schema: relational model
• Physical storage: files and indexes

The advantage of RDBMS
• It separates the logical level (schema) from the physical level (implementation).
• Physical data independence
  – Users do not worry about how their data is stored and processed on the physical devices.
  – It is all SQL!
  – Their queries work over (almost) all RDBMS deployments.

DBMS Architecture
[Figure: DBMS architecture. Users, web forms, applications, and the DBA submit queries and transactions. Components: process manager; query parser, query rewriter, query optimizer, query executor; transaction manager, lock manager, logging & recovery; files & access methods, buffer manager, storage manager. Main memory holds the buffers and lock tables; the storage manager sits on top of storage.]

Challenges in physical level
• Processor: 10000 – 100000 MIPS
• Main memory: around 10 Gb/sec
• Secondary storage: higher capacity and durability
• Disk random access
  – Seek time + rotational latency + transfer time
  – Seek time: 4 ms – 15 ms!
  – Rotational latency: 2 ms – 7 ms!
  – Transfer rate: at most 1000 Mb/sec
  – Read, write in blocks.

Gloomy future: Moore's law
• Processor speed and the cost-efficiency and maximum capacity of storage improve exponentially over time.
• But storage (main and secondary) access time improves much more slowly.

Random access versus sequential access
• Disk random access: seek time + rotational latency + transfer time.
• Disk sequential access: reading blocks next to each other
  – No seek time or rotational latency
  – Much faster than random access

Units of data on physical device
• Fields: data items
• Records
• Blocks
• Files

Fields
• Fixed size
  – Integer, Boolean, …
• Variable length
  – Varchar, …
  – Null terminated
  – Size at the beginning of the string

Records: sets of fields
• Schema
  – Number of fields, types of fields, order, …
• Fixed format and length
  – Record holds only the data items
• Variable format and length
  – Record holds the fields along with their size, type, … information
• Range of formats in between

Record header
• Pointer to the record schema (record type)
• Record size
• Timestamp
• Other info …

Blocks
• Collection of records
• Reduces the number of I/O accesses
• Different from OS blocks
  – Why should the RDBMS manage its own blocks?
    • It knows the access pattern better than the OS.
• Separating records in a block
  – Fixed size records: no worry!
  – Markers between records
  – Keep record size information in the records or in the block header.

Spanned versus un-spanned
• Un-spanned
  – Each record belongs to only one block
• Spanned
  – Records may be stored across multiple blocks
  – Saves space
  – The only way to deal with large records and fields: blob, image, …

Block header
• Data about the block
• File, relation, DB IDs
• Block ID and type
• Record directory
• Pointer to free space
• Timestamp
• Other info …

Heap versus sorted files
• Heap files
  – There is no order in the file.
  – New blocks are inserted at the end of the file.
• Sorted files
  – Order blocks (and records) based on some key.
  – Physically contiguous or using links to the next blocks.

Average cost of data operations
• Insertion
  – Heap files are more efficient.
  – Overflow areas for sorted files.
• Search for a record or a range of records
  – Sorted files are more efficient.
• Deletion
  – Heap files are more efficient,
  – although we find the record faster in the sorted file.
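The cost comparison above can be made concrete with a small sketch. It is only an illustration under loud assumptions: in-memory Python lists stand in for files on disk, "cost" is counted as records examined rather than block I/Os, and all function names are made up for this example.

```python
import bisect

def heap_insert(records, key):
    """Heap file: a new record always goes to the end of the file."""
    records.append(key)

def heap_search(records, key):
    """Heap file: scan from the start; return how many records were examined."""
    for examined, k in enumerate(records, start=1):
        if k == key:
            return examined
    return len(records)

def sorted_insert(records, key):
    """Sorted file: find the position, then shift later records to make room."""
    bisect.insort(records, key)

def sorted_search(records, key):
    """Sorted file: binary search; return how many records were probed."""
    lo, hi, probes = 0, len(records), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if records[mid] < key:
            lo = mid + 1
        else:
            hi = mid
    return probes

def sorted_range(records, low, high):
    """Sorted file: a range query is one contiguous slice."""
    start = bisect.bisect_left(records, low)
    end = bisect.bisect_right(records, high)
    return records[start:end]

# Load the same 1000 keys (in descending arrival order) into both files.
heap, ordered = [], []
for key in range(999, -1, -1):
    heap_insert(heap, key)
    sorted_insert(ordered, key)

print(heap_search(heap, 0))        # records examined by the full scan
print(sorted_search(ordered, 0))   # probes made by the binary search
print(sorted_range(ordered, 10, 12))
```

The heap insert is a single append, while the sorted insert has to shift records (the job an overflow area avoids on disk); the search costs go the other way.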
Row versus column stores
• We have talked about row stores
  – All fields of a record are stored together.

  SSN1  Name1  Age1  Salary1
  SSN2  Name2  Age2  Salary2
  SSN3  Name3  Age3  Salary3

Row versus column stores
• We can store the fields in columns.
  – We can store SSNs implicitly.

  SSN1  Name1      SSN1  Age1      SSN1  Salary1
  SSN2  Name2      SSN2  Age2      SSN2  Salary2
  SSN3  Name3      SSN3  Age3      SSN3  Salary3

Row versus column store
• Column store
  – Compact storage
  – Faster reads on data analysis and mining operations
• Row store
  – Faster writes
  – Faster reads for record access (OLTP)
• Further reading
  – Mike Stonebraker et al., "C-Store: A Column-oriented DBMS", VLDB'05.

Access paths
• The methods that the RDBMS uses to retrieve the data.
• Attribute value(s) → Tuple(s)

Types of search queries
• Point query over Beers(name, manf)
    SELECT * FROM Beers WHERE name = 'Bud';
• Range query over Sells(bar, beer, price)
    SELECT * FROM Sells WHERE price > 2 AND price < 10;

Types of access paths
• Full table scan
  – Heap files
  – Inefficient for both point and range queries.
• Sequential access
  – Sorted files
  – Efficient for both point and range queries.
  – Inefficient to maintain
• Middle ground?

Indexing
• An old idea

Index
• A data structure that speeds up selecting tuples in a relation based on some search key.
• Search key
  – A subset of the attributes in a relation
  – May not be the same as the (primary) key
• Entries in an index
  – (k, r)
  – k is the search key.
  – r is the pointer to a record (record id).

Index
• The data file stores the table data.
• The index file stores the index data structure.
[Figure: an index file whose entries point to the records of a data file.]
• The index file is smaller than the data file.
• Ideally, the index should fit in main memory.

Index categorizations
• Clustered vs. unclustered
  – Clustered: records are stored according to the index order.
  – Unclustered: records are stored in another order, or in no order at all.
• Dense vs. sparse
  – Dense: each record is pointed to by an entry in the index.
  – Sparse: each block has an entry in the index.
  – Size versus time tradeoff.
• Primary vs. secondary
  – Primary: the primary key is the search key.
  – Secondary: the search key consists of other attributes.

Index categorizations
• Clustered and dense
[Figure: INDEX and DATA columns; every record in the sorted data file has its own index entry.]

Index categorizations
• Clustered and sparse
[Figure: INDEX and DATA columns; each index entry points to the first record of a data block.]

Duplicate search keys
• Clustered and dense
[Figure: INDEX and DATA columns; the data file contains duplicate search keys.]

Duplicate search keys
• Clustered and sparse:
[Figure: INDEX and DATA columns; a sparse index over a data file with duplicate search keys.]
  – Any problem?

Duplicate search keys
• Clustered and sparse:
  – Point to the lowest new search key in every block
[Figure: INDEX and DATA columns; each index entry holds the lowest search key that first appears in its block.]

Unclustered Index
• Dense / sparse?
[Figure: INDEX and DATA columns; the index is sorted while the data file is not.]

Well known index structures
• B+ trees:
  – Very popular
• Hash tables:
  – Not frequently used

B+ trees
• The index of a very large data file gets too large.
• How about building an index for the index file?
• A multi-level index, or a tree

B+ trees
• Degree (order) of the tree: d
• Each node (except the root) stores between d and 2d keys.
[Figure: non-leaf nodes and leaf nodes above the records; the pointers of a non-leaf node partition the keys into the ranges [A, 10), [10, 32), [32, 94), and [94, B).]

Example
d = 2
[Figure: example B+ tree with d = 2.]

B+ tree tuning
• How to choose the value of d?
  – Each node should fit in a block.
• Example
  – Key value: 8 bytes
  – Record pointer: 16 bytes
  – Block size: 4096 bytes
  – A full node holds 2d keys and 2d + 1 pointers: 2d * 8 + (2d + 1) * 16 <= 4096
  – 48d + 16 <= 4096, so d <= 85

Retrieving tuples using B+ tree
• Point queries
  – Start from the root and follow the links to the leaf.
• Range queries
  – Find the lowest point in the range.
  – Then, follow the links between the leaf nodes.
• The top levels are kept in the buffer pool.

B+ tree and index categories
• A B+ tree index could be
  – Dense / sparse?
  – Clustered / unclustered?

Inserting a new key
• Pick the proper leaf node and insert the key.
• If the node then contains more than 2d keys, split the node and insert the extra node in the parent.
[Figure: a node holding keys K1 to K5 and pointers R0 to R5 is split into two nodes; K3, together with a pointer to the new node, is inserted in the parent.]
  – If at the leaf level, add K3 to the right node.

Insertion
• Insert K = 18
[Figure: the example B+ tree before and after inserting key 18 into a leaf.]
• Insert K = 20
[Figure: the tree after adding key 20; the leaf now holds more than 2d keys.]
• Need to split the node
• Split and update the parent node.
[Figure: the tree after the leaf is split and the new separator key is added to the parent.]
• What if we need to split the root?

Deletion
• Delete K = 21
[Figure: the B+ tree before and after removing key 21 from its leaf.]
• Note: K = 21 may still remain in the internal levels.
• Delete K = 20
[Figure: the tree after removing key 20; the leaf now holds too few keys.]
• We need to update the number of keys on the node: borrow from siblings (rotate).
[Figure: a key moves over from the sibling leaf and the separator key in the parent is updated.]
• What if we cannot borrow from siblings?
  – Example: delete K = 30
[Figure: the B+ tree after removing key 30; the leaf holds too few keys and its sibling has none to spare.]
  – Merge with a sibling.
[Figure: the two sibling leaves merged into one node.]
• What to do with the dangling key and pointer in the parent? Simply remove them.
• Final tree
[Figure: the B+ tree after the merge and the removal of the dangling key and pointer.]

Index creation
  CREATE TABLE Person(ID int, Name varchar(50), Pos int, Age int);
  CREATE INDEX Person_ID ON Person(ID);       -- default is normally B-tree
  CLUSTER Person USING Person_ID;             -- cluster the table on the Person_ID index
  CREATE INDEX Pos_Age ON Person(Pos, Age);   -- multi-attribute index

Index selection
• Let's index every attribute on every table to speed up all queries!
• But indexes generally slow down data manipulation
  – INSERT, DELETE, UPDATE.

Index selection
• Given a query workload and a schema, find the set of indexes that optimizes the execution.
• The query workload:
  – Queries and their frequencies.
  – Queries are both data retrieval (SELECT) and data manipulation (INSERT, UPDATE, DELETE).

Index selection
• Part of physical database design
  – File structure, indexing, tuning queries, …
• Physical database design may affect logical design!
  – Change the schema to run the queries faster

Index selection
• Generally a hard problem.
• RDBMS vendors provide wizards:
  – Started with the AutoAdmin project for SQL Server
  – SQL Server / Oracle Index Tuning Wizard
  – DB2 Index Advisor
• They try many configurations and pick the one that minimizes the time and overheads.
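The idea of trying many configurations and keeping the cheapest can be sketched in a few lines. This is a toy illustration, not how any vendor tool works: the workload, the candidate attributes, and every cost constant below are invented for the example, and real advisors rely on the optimizer's cost estimates and avoid enumerating all subsets.

```python
from itertools import combinations

# Hypothetical workload: (kind, attribute used in the WHERE clause, frequency).
WORKLOAD = [
    ("select", "name", 100),
    ("select", "price", 40),
    ("insert", None, 30),
]

# Attributes we are willing to index (hypothetical).
CANDIDATES = ["name", "price", "bar"]

# Made-up unit costs for the toy cost model.
SCAN_COST = 100              # SELECT without a usable index
INDEX_LOOKUP_COST = 3        # SELECT with an index on its attribute
BASE_WRITE_COST = 1          # writing the record itself
INDEX_MAINTENANCE_COST = 2   # extra work per index on every write

def workload_cost(indexes):
    """Total cost of the workload under a given set of indexed attributes."""
    total = 0
    for kind, attr, freq in WORKLOAD:
        if kind == "select":
            per_query = INDEX_LOOKUP_COST if attr in indexes else SCAN_COST
        else:
            # Every index has to be updated on a data manipulation statement.
            per_query = BASE_WRITE_COST + INDEX_MAINTENANCE_COST * len(indexes)
        total += freq * per_query
    return total

def best_configuration(candidates):
    """Enumerate every subset of candidate indexes and keep the cheapest one."""
    configs = (
        frozenset(subset)
        for size in range(len(candidates) + 1)
        for subset in combinations(candidates, size)
    )
    return min(configs, key=workload_cost)

best = best_configuration(CANDIDATES)
print(sorted(best), workload_cost(best))
```

Under these made-up numbers, indexing every attribute is not the winner: the index on bar helps no query but still adds maintenance cost to every insert. The exhaustive enumeration grows as 2^n in the number of candidates, which is one reason the problem is hard in general.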