Database Management
6th course

OS and DBMS
[Figure: relationships between USER, DBA, DBMS, OS and DB; labels: DDL, DML, wishes, rules]

Steps of a query
1. SQL query
2. Permission in the schema?
3. Permission in the subschema?
4. I/O operation
5. Search
6. Import
7. Notification
8. User workspace
9. User notification

Data storage: disks and files
• Mass storage device (disk, drive)
• I/O
  – READ: disk → memory (RAM)
  – WRITE: memory → disk
  – Time consuming

Why not store everything in RAM?
• Cost of 1 GB: RAM 10 € ↔ HDD 0.5 €
• RAM is volatile
• Typical way of storage:
  – Data currently in use is in memory
  – Secondary storage is on HDD (local server, cloud)
  – Tertiary storage

Storage on disks
• Unit: disk block
• Speed depends on location!

Components
[Figure: components of a disk]

Reading a block
• Access time of a block:
  – Seek time
  – Rotational delay
  – Transfer time: 1 ms / 4 KB
• I/O optimization: reducing seek time and rotational delay

Order of data
• Frequently used blocks should be close to each other
  – Same block
  – Same track, same cylinder
  – Adjacent cylinder
• Reading is sequential
• Reading multiple blocks at once saves time

Way of storage - RAID
• Redundant Array of Inexpensive/Independent Disks
• Connecting disks logically, storing data redundantly
• Aims:
  – Minimize data loss, increase reliability
  – Increase capacity by using more smaller/cheaper disks
  – Increase data access performance
  – Increase flexibility (disks can be replaced during usage)

Two main techniques
• Data striping
  – Data is partitioned (striping unit)
  – Partitions are distributed over several disks
• Redundancy
  – Reconstruction of data

Level 0
• Non-redundant
• If one of the disks fails, data is lost
• Parallel reading/writing
• Performance depends on the worst disk

Level 1
• Mirrored
• Data can be reconstructed
• Parallel reading, increased speed
• Parallel writing, normal speed
• Performance depends on the worst disk
• Does not use data striping

Level 2
• Data striping (unit = 1 bit), error-correcting codes
• ECC: redundant bits calculated from data bits
(compress)
• Not used any more

Level 3
• Bit-Interleaved Parity
• Cannot identify the failed disk
• One check disk with parity information
• The failed disk's data can be recovered
• Can process only one I/O at a time
• Striping unit = 1 bit

Level 4
• Block-Interleaved Parity
• Like RAID 3, but striping unit = disk block
• Supports multiple users
• Updating the parity disk can be a bottleneck
• In case of disk failure, reading speed drops

Level 5
• Block-Interleaved Distributed Parity
• Rotating parity
• Parallel read and write
• Similar to RAID 3 and 4, depending on the size of the striping unit
• If a disk fails, it has to be replaced immediately

RAID 5
• Capacity = min_capacity * (no. of disks - 1)
• Reading speed = min_speed * (no. of disks - 1)

Level 6
• High chance of a further failure during recovery
• 2 check disks
• Can recover from up to two disk failures
• Read and write speed is equal to RAID 5

RAID 0+1 and RAID 10
• RAID 0+1
• RAID 10

Disk space and buffering

Disk space management
• The lowest level of the DBMS manages the space
• Unit of data: page
• Size of page = size of disk block
• Higher levels can
  – Allocate and delete pages
  – Write / read pages
• Allows higher levels of the DBMS to think of the data as a collection of pages

Keeping track of free blocks
• Maintain a list of free blocks with a pointer to the first free block
OR
• Maintain a bitmap with one bit for each block: block is used or not

Using OS to manage disk space
• Possible, not common
• Not portable: file systems differ
• On 32-bit systems the largest file size is 4 GB, and OS files cannot span disk devices

Buffer manager
[Figure: page requests arrive at the BUFFER POOL in memory, which consists of frames (holding a page or free); the DB resides on disk]
If a requested page is not in the pool and the pool is full, the buffer manager's replacement policy controls which existing page is replaced.
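A minimal sketch of the replacement decision described above, in Python. It assumes the pin count, dirty bit and clock policy that the buffer manager slides introduce; the class `BufferPool`, its method names, and the dictionary standing in for the disk are all hypothetical, chosen only for illustration.

```python
class BufferPool:
    """Toy buffer pool with a fixed number of frames and clock replacement."""

    def __init__(self, num_frames, disk):
        self.disk = disk                        # dict page_id -> bytes (stand-in for the disk)
        self.frames = [None] * num_frames       # page contents held by each frame
        self.page_of = [None] * num_frames      # page id held by each frame (None = free)
        self.pin_count = [0] * num_frames
        self.dirty = [False] * num_frames
        self.referenced = [False] * num_frames
        self.frame_of = {}                      # the <frame#, pageid> table, keyed by page id
        self.hand = 0                           # current position of the clock hand

    def _choose_victim(self):
        # Two sweeps are enough: the first may only clear referenced bits.
        for _ in range(2 * len(self.frames)):
            f = self.hand
            self.hand = (self.hand + 1) % len(self.frames)
            if self.pin_count[f] > 0:
                continue                        # pinned frames are never replaced
            if self.referenced[f]:
                self.referenced[f] = False      # give the page a second chance
                continue
            return f
        raise RuntimeError("all frames are pinned")

    def request(self, page_id):
        """Pin the page (loading it if necessary) and return its frame number."""
        f = self.frame_of.get(page_id)
        if f is None:
            f = self._choose_victim()
            old = self.page_of[f]
            if old is not None:
                if self.dirty[f]:
                    self.disk[old] = self.frames[f]   # write the modified page back
                del self.frame_of[old]
            self.frames[f] = self.disk[page_id]       # read the requested page
            self.page_of[f] = page_id
            self.dirty[f] = False
            self.frame_of[page_id] = f
        self.pin_count[f] += 1
        self.referenced[f] = True
        return f

    def release(self, page_id, modified=False):
        """Unpin the page; the requestor reports whether it changed the content."""
        f = self.frame_of[page_id]
        self.pin_count[f] -= 1
        self.dirty[f] = self.dirty[f] or modified
```

With a two-frame pool, requesting and releasing pages 1 and 2 and then requesting page 3 evicts page 1; if page 1 was released with `modified=True`, its content is written back to the disk dictionary first.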
• Data has to be brought into memory (RAM) before it can be used
• <frame#, pageid> pairs are stored in a table

When a request comes…
• If the page is not in the buffer:
  – Choose a frame to replace, increase its pin count
  – If the dirty bit of the replacement frame is on, write its content to the disk
  – Read the requested page into the replacement frame
• Return the address of the frame to the requestor
• If it can be predicted which page will be requested next, multiple pages can be read at once (pre-fetching)

Buffer management
• The requestor has to unpin the page when it is done with it
• Mark whether the content of the page has been modified
  – With the dirty bit
• A page in the buffer can be requested multiple times by processes/transactions
  – Pin_count: a page can be replaced if and only if pin_count = 0
• Concurrency handling and rollback handling can influence the replacement policy

Buffer replacement policies
• Least-recently-used (LRU): keeps track of what was used and when (costly)
• Clock replacement
  – The current frame is stored
  – Moves to the next frame until one is found with pin count = 0 and referenced bit off (not used)
  – After the last frame, jumps to the first (like a circle)

Files and indexes

Records in files
• The DBMS handles records and files
• Files: collections of pages containing records
• They must support
  – DML (insert, update, delete)
  – Reading a record (identified by record id, rid)
  – Reading all the records (that satisfy some conditions)

Unordered (heap) files
• Simplest file structure
• The DBMS must keep track of
  – pages in the file
  – free space in the page
  – records in the page

Heap file as a linked list
[Figure: a header page points to two linked lists of data pages: full pages and pages with free space]
• Every page contains two pointers
• Disadvantages
  – With variable-length records, nearly every page ends up in the list of pages with free space
  – To insert a record, we must examine several pages before finding enough space

Directory-based heap file
[Figure: directory-based heap file; header page, directory, data pages]
• Maintain a directory of pages
• The DBMS stores the address of the first page of
each heap file
• Directory = collection of pages
• One counter per entry: amount of free space on the corresponding page
[Figure labels: Data Page 1, Data Page 2, ..., Data Page N]

Index
• Read the records sequentially
• Search for a specific rid
• Records whose attributes satisfy specific conditions (e.g. all CLERKs)
• Value-based queries

Example, library
1. Locate the books by Asimov
2. Search for Foundation
• Indexed file: give a search key for the entries (records in files), calculate the index of this key, and look it up
• Goal: speed up search
• E.g. to look for employees of a given age, one can build an index containing <age, rid> pairs
• The pages of the index file are organized by the search key so that the result is found quickly (access methods)

Access methods
• B trees
• B+ trees
• Hash-based structures
• Discussed in detail later

Page formats
• Data as a collection of records
• Page ≈ collection of slots, each slot contains a record
• Record identification:
  – <page id, slot number> = rid
  – Number every record and store its location in a table

Fixed-length records
• All records have the same length
• Insertion: locate an empty slot and place the record there
• Main issues:
  – Keep track of empty slots
  – Locate all records on a page

Deletion alternatives – first option
• Store records in the first N slots without gaps
• If a record is deleted, the last record is moved into the gap
• Advantage: finding a location is easy (just an offset calculation)
• The empty slots remain together at the end of the page
• Disadvantage: a problem if the moved record is referenced externally (its rid changes)

Second option
• Use an array of bits, one bit per slot
• If a record is deleted, its bit is turned off
• Summary: every page contains additional file-level info

Variable-length records
• To insert a new record, a space that is big enough but not too big is needed (to avoid waste)
• If a record is deleted, move the others to fill the hole
• Most flexible organization: a directory of slots for each page

Directory of slots
• The offset (pointer) and length of each record are stored
• Deletion: set offset to -1
• Records can be moved, since rid = (page number, slot number [position in the directory]) does not change
• Only the record offset changes
• The offset of the free space is stored
• When a new record is inserted and there is not enough space, records are moved
• If a record is deleted, the slot numbers of the remaining records cannot change because of external references
• If a record is inserted, a free (previously released) slot number should be given to it

Record formats
• The number of fields and the field types are stored in the system catalog

Fixed-length records
• Each field has a fixed length (uniform for every record)
• From the offset of the record, the offset of each field can be calculated easily:
[Figure: fields F1, F2, F3, F4 with lengths L1, L2, L3, L4, starting at base address B]
Address of F3 = B + L1 + L2

Variable-length records
• Variable-length fields (e.g. varchar2)
• Two formats:
  – Separators are used: the record has to be scanned to read the fields
  – An array of integer offsets at the beginning of the record stores the relative position of the fields and the end of the record:
    • The offset of the end of the record is stored
    • Disadvantage
      – Storage overhead
    • Advantages
      – Direct access to the fields
      – NULL: start of the field = end of the field

Issues
• On insert, the other fields must be moved
• On modification, the other fields must be moved
  – The modified record may no longer fit on its page
  – A forwarding address is left on the page
• When a record is too big for one page
  – Break the record into smaller records
  – Chain them

Thank you for your attention!
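Supplementary sketch for the "Directory of slots" slide, in Python. It illustrates why a rid of the form (page number, slot number) survives record movement: only the offset stored in the slot changes. The class `SlottedPage` and its methods are hypothetical names; as simplifying assumptions, the page compacts eagerly on deletion (rather than only when an insert runs out of space) and the space taken by the directory itself is ignored.

```python
class SlottedPage:
    """Toy page for variable-length records with a directory of slots."""

    def __init__(self, size=64):
        self.size = size
        self.data = bytearray()   # record bytes, packed from the start of the page
        self.slots = []           # slot number -> [offset, length]; offset -1 = deleted

    def free_space(self):
        return self.size - len(self.data)

    def insert(self, record):
        """Store the record and return its slot number (second half of the rid)."""
        if len(record) > self.free_space():
            raise ValueError("record does not fit on this page")
        entry = [len(self.data), len(record)]
        self.data += record
        # Reuse a released slot number if there is one, so other rids stay valid.
        for slot, (offset, _) in enumerate(self.slots):
            if offset == -1:
                self.slots[slot] = entry
                return slot
        self.slots.append(entry)
        return len(self.slots) - 1

    def read(self, slot):
        offset, length = self.slots[slot]
        if offset == -1:
            raise KeyError(f"slot {slot} is empty")
        return bytes(self.data[offset:offset + length])

    def delete(self, slot):
        offset, length = self.slots[slot]
        if offset == -1:
            raise KeyError(f"slot {slot} is empty")
        del self.data[offset:offset + length]      # close the hole
        for entry in self.slots:
            if entry[0] > offset:
                entry[0] -= length                 # moved records keep their slot number
        self.slots[slot] = [-1, 0]                 # deletion: offset set to -1
```

For example, after inserting three records and deleting the second, the third record is still read through its original slot number even though its bytes have moved, and the next insert receives the released slot number.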