CS 405G: Introduction to Database Systems 20. Concurrent control, Storage

advertisement
CS 405G: Introduction to
Database Systems
20.
Concurrent control, Storage
Presentation Schedule








Apr 20 (1 day extension for final report)
Wilcox, Chung, Zakaria
Claytor, Allison
Akridge, Jenkins, Dunbar
Hill, Weisbrod, Caudill
Vinogradov, Brothers
M. Turner, Ruffner, Dunbar (?)
Dale, Bentley
7/1/2016
Chen Qian @ University of Kentucky
2
Presentation Schedule








Apr 22 (1 day extension for final report)
R. Tuner
Resch
Williams
Roberts
Wise, Zachary, Almon
Lewis
Joiner
7/1/2016
Chen Qian @ University of Kentucky
3
Presentation Schedule








Apr 25
Amir
Johnson
Zheng
Morganett, Teague
Fullington, O’Kane
Ferry
Gant
7/1/2016
Chen Qian @ University of Kentucky
4
Presentation Schedule








Apr 27
Becker
Moustafa
Houghton
James
Campbell
Underwood
Bishop
7/1/2016
Chen Qian @ University of Kentucky
5
Storage

Most on disks!


That’s why we always draw databases as
And why the single most important metric in database
processing is the number of disk I/Os performed
7/1/2016
Chen Qian @ University of Kentucky
6
The Storage Hierarchy
Main
memory (RAM) for
currently used data
Disk for the main database
(secondary storage).
Tapes for archiving older
versions of the data (tertiary
storage).
Smaller, Faster
Bigger, Slower
Source: Operating Systems Concepts 5th Edition
7/1/2016
Chen Qian @ University of Kentucky
7
A typical disk
Tracks
Disk arm
Platter
Arm movement
7/1/2016
Spindle
Disk head
Spindle rotation
Chen Qian @ University of Kentucky
Cylinders
“Moving parts” are slow
8
Disk access time
Sum of:
 Seek time: time for disk heads to move to the correct
track
 Rotational delay: time for the desired block to rotate
under the disk head
 Transfer time: time to read/write data in the block (=
time for disk to rotate over the block)
7/1/2016
Chen Qian @ University of Kentucky
9
Random disk access
Seek time + rotational delay + transfer time
 Average seek time




Average rotational delay



Time to skip one half of the tracks?
Not quite; should be time to skip a third of them
“Typical” value: 5 ms
Time for a half rotation (a function of RPM)
“Typical” value: 4.2 ms (7200 RPM)
Typical transfer time

.08msec per 8K block
7/1/2016
Chen Qian @ University of Kentucky
10
Sequential Disk Access Improves Performance
Seek time + rotational delay + transfer time
 Seek time


Rotational delay


0 (assuming data is on the same track)
0 (assuming data is in the next block on the track)
Easily an order of magnitude faster than random disk
access!
7/1/2016
Chen Qian @ University of Kentucky
11
Performance tricks





Disk layout strategy
 Keep related things (what are they?) close together: same
sector/block ! same track ! same cylinder ! adjacent
cylinder
Double buffering
 While processing the current block in memory, prefetch
the next block from disk (overlap I/O with processing)
Disk scheduling algorithm
Track buffer
 Read/write one entire track at a time
Parallel I/O
 More disk heads working at the same time
7/1/2016
Chen Qian @ University of Kentucky
12
Records and Files

A row in a table, when located on disks, is called


A record
Two types of record:


Fixed-length
Variable-length
7/1/2016
Chen Qian @ University of Kentucky
13

In an abstract sense, a file is


In reality, a file is


A set of disk pages
Each record lives on


A set of “records” on a disk
A page
Physical Record ID (RID)

A tuple of <page#, slot#>
7/1/2016
Chen Qian @ University of Kentucky
14
Files


Higher levels of DBMS operate on records, and files of
records.
FILE: A collection of pages, each containing a collection of
records.


Record = row in a table
Must support:



insert/delete/modify record
fetch a particular record (specified using record id)
scan all records (possibly with some conditions on the records
to be retrieved)
7/1/2016
Chen Qian @ University of Kentucky
15
Unordered (Heap) Files



Simplest file structure contains records in no particular
order.
As file grows and shrinks, disk pages are allocated and
de-allocated.
To support record level operations, we must:




keep track of the pages in a file
keep track of free space on pages
keep track of the records on a page
There are many alternatives for keeping track of this.
 We’ll consider 2
7/1/2016
Chen Qian @ University of Kentucky
16
Heap File Implemented as a List
Data
Page
Data
Page
Data
Page
Full Pages
Header
Page
Data
Page


Data
Page
Data
Page
Pages with
Free Space
The header page id and Heap file name must be stored
someplace.
 Database “catalog”
Each page contains 2 `pointers’ plus data.
7/1/2016
Chen Qian @ University of Kentucky
17
Heap File Using a Page Directory
Data
Page 1
Header
Page
Data
Page 2
DIRECTORY



7/1/2016
Data
Page N
DBMS must remember where the first directory page is located.
The entry for a page can include the number of free bytes on the
page.
The directory itself is a collection of pages and shown as a
linked list.
 Much smaller than linked list of all HF pages!
Chen Qian @ University of Kentucky
18
Trade-off

Compared to linked list, the directory based approaches
have:


Pros: Fast access to a particular record
Cons: Extra space
7/1/2016
Chen Qian @ University of Kentucky
19
Summary

Disks provide cheap, non-volatile storage.
 Random access, but cost depends on the location of page
on disk; important to arrange data sequentially to
minimize seek and rotation delays.
7/1/2016
Chen Qian @ University of Kentucky
20
Record layout
Record = row in a table
 Variable-format records



Rare in DBMS—table schema dictates the format
Relevant for semi-structured data such as XML
Focus on fixed-format records


With fixed-length fields only, or
With possible variable-length fields
7/1/2016
Chen Qian @ University of Kentucky
21
Record Formats: Fixed Length
F1
F2
F3
F4
L1
L2
L3
L4
Base address (B)

All field lengths and offsets are constant


Address = B+L1+L2
Computed from schema, stored in the system catalog
Finding i’th field done via arithmetic.
7/1/2016
Chen Qian @ University of Kentucky
22
Fixed-length fields

Example: CREATE
TABLE Student(SID INT, name
CHAR(20), age INT, GPA FLOAT);
0

Watch out for alignment


4
24 28
36
142 Bart (padded with space) 10
2.3
May need to pad; reorder columns if that helps
What about NULL?

Add a bitmap at the beginning of the record
7/1/2016
Chen Qian @ University of Kentucky
23
Record Formats: Variable Length

Two alternative formats (# fields is fixed):
F1
F2
F3
$
F4
$
$
$
Fields Delimited by Special Symbols
F1
F2
F3
F4
Array of Field Offsets
 Second offers direct access to i’th field, efficient storage
of nulls; extra but small directory overhead.
7/1/2016
Chen Qian @ University of Kentucky
24
Trade-off

Compared to the 1st formats, the offset based format has:


Pros: Fast access to a particular field; efficient storage of
nulls
Cons: Extra space
7/1/2016
Chen Qian @ University of Kentucky
25
Large Object (LOB) fields

Example: CREATE

Student records get “de-clustered”
TABLE Student(SID INT, name
CHAR(20), age INT, GPA FLOAT, picture BLOB(32000));


Bad because most queries do not involve picture
Decomposition (automatically done by DBMS and
transparent to the user)


Student(SID, name, age, GPA)
StudentPicture(SID, picture)
7/1/2016
Chen Qian @ University of Kentucky
26
Page layout
How do you organize records in a page?
 Fixed length records
 Variable length records

NSM (N-ary Storage Model) is used in most commercial
DBMS
7/1/2016
Chen Qian @ University of Kentucky
27
Page Formats: Fixed Length Records
Slot 1
Slot 2
Slot 1
Slot 2
Free
Space
...
...
Slot N
Slot N
Slot M
1 . . . 0 1 1M
N
PACKED

number
of records
M ... 3 2 1
UNPACKED, BITMAP
number
of slots
Record id = <page id, slot #>. In first alternative, moving
records for free space management changes rid; may not be
acceptable.
7/1/2016
Chen Qian @ University of Kentucky
28
Trade-off

Compared to the packed formats, the unpacked format
has:


Pros: No need to move records into vacated slots after
delete. No need to change the slot number.
Cons: To insert a record, one needs to find an empty slot,
by scanning the bitmap.
7/1/2016
Chen Qian @ University of Kentucky
29
Variable-Length Records


Store records from the beginning of each page
Use a directory at the end of each page


To locate records and manage free space
Necessary for variable-length records
142 Bart
857 Lisa
10 2.3
123 Milhouse 10 3.1
8 4.3
456 Ralph
8 2.3
Why store data and directory
at two different ends?
Both can grow easily
7/1/2016
Chen Qian @ University of Kentucky
30
Options

Reorganize after every update/delete to avoid
fragmentation (gaps between records)


Need to rewrite half of the block on average
What if records are fixed-length?

Reorganize after delete



Only need to move one record
Need a pointer to the beginning of free space
Do not reorganize after update

7/1/2016
Need a bitmap indicating which slots are in use
Chen Qian @ University of Kentucky
31
System Catalogs

For each relation:





For each index:


structure (e.g., B+ tree) and search key fields
For each view:


name, file location, file structure (e.g., Heap file)
attribute name and type, for each attribute
index name, for each index
integrity constraints
view name and definition
Plus statistics, authorization, buffer pool size, etc.
Catalogs are themselves stored as relations!
7/1/2016
Chen Qian @ University of Kentucky
32
Indexes

A Heap file allows us to retrieve records:



Sometimes, we want to retrieve records by
specifying the values in one or more fields, e.g.,



by specifying the rid, or
by scanning all records sequentially
Find all students in the “CS” department
Find all students with a gpa > 3
Indexes are file structures that enable us to answer
such value-based queries efficiently.
7/1/2016
Chen Qian @ University of Kentucky
33
Basics

Given a value, locate the record(s) with this value
SELECT * FROM R WHERE A = value;
SELECT * FROM R, S WHERE R.A = S.B;

Other search criteria, e.g.
Range search
SELECT * FROM R WHERE A > value;
 Keyword search

database indexing
7/1/2016
Chen Qian @ University of Kentucky
Search
34
Dense and sparse indexes

Dense: one index entry for each search key value

Sparse: one index entry for each block

Records must be clustered according to the search key
7/1/2016
Chen Qian @ University of Kentucky
35
Dense versus sparse indexes

Index size


Requirement on records


Records must be clustered for sparse index
Lookup



Sparse index is smaller
Sparse index is smaller and may fit in memory
Dense index can directly tell if a record exists
Update

Easier for sparse index
7/1/2016
Chen Qian @ University of Kentucky
36
Primary and secondary indexes

Primary index




Secondary index


Created for the primary key of a table
Records are usually clustered according to the primary key
Can be sparse
Usually dense
SQL


PRIMARY KEY declaration automatically creates a primary
index, UNIQUE key automatically creates a secondary index
Additional secondary index can be created on non-key
attribute(s)
CREATE INDEX StudentGPAIndex ON
Student(GPA);
7/1/2016
Chen Qian @ University of Kentucky
37
Tree-Structured Indexes: Introduction


Tree-structured indexing techniques support both range
selections and equality selections.
ISAM =Indexed Sequential Access Method


static structure; early index technology.
B+ tree: dynamic, adjusts gracefully under inserts and
deletes.
7/1/2016
Chen Qian @ University of Kentucky
38
Motivation for Index


``Find all students with gpa > 3.0’’
 If data file is sorted, do binary search
 Cost of binary search in a database can be quite high,
Why?
Simple idea: Create an `index’ file.
index entry
P0
Page 1
K 1
P1
Index File
kN
k1 k2
K 2
Page 2
P 2
Page 3
K m Pm
Page N
Data File
Can do binary search on (smaller) index file!
7/1/2016
Chen Qian @ University of Kentucky
39
ISAM

What if an index is still too big?


Put a another (sparse) index on top of that!
ISAM (Index Sequential Access Method), more or less
Example: look up 197
100, 200, …, 901
Index blocks 100, 123, …, 192 200, …
…
901, …, 996
100, 108,123, 129,
192, 197,200, 202,
901, 907, 996, 997,
…
…
……
119, 121 …
…
…
…
Data blocks
7/1/2016
Chen Qian @ University of Kentucky
40
How many steps?

For a balanced tree of N pages, the num of search steps
is O(logN)

Less than a constant times logN
7/1/2016
Chen Qian @ University of Kentucky
41
Updates with ISAM
Example: insert 107
Example: delete 129
100, 200, …, 901
Index blocks 100, 123, …, 192 200, …
…
901, …, 996
100, 108,123, 129,
192, 197,200, 202,
901, 907, 996, 997,
…
…
……
119, 121 …
…
…
…
Data blocks
107
Overflow block

Overflow chains and empty data blocks degrade
performance

Worst case: most records go into one long chain
7/1/2016
Chen Qian @ University of Kentucky
42
How many steps?

For a chain of N pages, the worst-case num of search
steps is N

Much larger than O(logN)!
7/1/2016
Chen Qian @ University of Kentucky
43
A Note of Caution

ISAM is an old-fashioned idea



B+-trees are usually better, as we’ll see
But, ISAM is a good place to start to understand the
idea of indexing
Upshot


7/1/2016
Don’t brag about being an ISAM expert on your
resume
Do understand how they work, and tradeoffs with B+trees
Chen Qian @ University of Kentucky
44
Summary (Contd.)

DBMS vs. OS File Support


7/1/2016
DBMS needs features not found in many OSes, e.g.,
forcing a page to disk, controlling the order of page
writes to disk, files spanning disks, ability to control
pre-fetching and page replacement policy based on
predictable access patterns, etc.
Variable length record format with field offset
directory offers support for direct access to ith field
and null values.
Chen Qian @ University of Kentucky
45
Download