Chapter 4 Structured Query Language

advertisement
Chapter 10
Storage and File Structure
Yonsei University
1st Semester, 2013
Sanghyun Park
Outline





Overview of Physical Storage Media
Magnetic Disks and Flash Storage
File Organization
Organization of Records in Files
Data Dictionary Storage
Classification of Physical Storage Media

Speed with which data can be accessed

Cost per unit of data

Reliability



Data loss on power failure or system crash
Physical failure of the storage device
Can differentiate storage into:


Volatile storage: loses contents when power is switched off
Non-volatile storage:
 Contents persist even when power is switched off
 Includes secondary and tertiary storage, as well as battery-backed
up main-memory
Storage Hierarchy (1/2)
Storage Hierarchy (2/2)

Primary storage



Secondary storage




Fastest media but volatile
Cache, main memory
Next level in hierarchy, non-volatile, moderately fast access time
Also called on-line storage
Flash memory, magnetic disks
Tertiary storage



Lowest level in hierarchy, non-volatile, slow access time
Also called off-line storage
Optical disk, magnetic tape
Magnetic Hard Disk Mechanism
Optimization of Disk-Block Access (1/2)

Block




A contiguous sequence of sectors from a single track
Data is transferred between disk and main memory in blocks
Sizes range from 512 bytes to several kilobytes
Disk-arm-scheduling algorithms order pending accesses to tracks
so that disk arm movement is minimized (e.g., elevator algorithm)
R6
Inner track
R3
R1
R5
R2
R4
Outer track
Optimization of Disk-Block Access (2/2)

To reduce block-access time, we can organize blocks on disk in a
way that corresponds to how data will be accessed

For example, we can store related information on the same or
nearby cylinders

However, blocks may get scattered all over the disk by insertion or
deletion (fragmentation)

Some systems have utilities for de-fragmentation
Flash Storage (1/2)

There are two types of flash memory, called NAND and NOR flash

NAND flash has a much higher storage capacity for a given cost,
and is widely used for data storage in devices such as cameras,
music players, cell phones, and increasingly laptop computers

Has a lower cost per byte than main memory, in addition to being
non-volatile

Requires page-at-a-time read (page: 512 bytes to 4KB)

Transfer rate is around 20 MB/sec

Solid state disks: use multiple flash storage devices to provide
higher transfer rate of 100 to 200 MB/sec
Flash Storage (2/2)

Once written, a flash page cannot be directly overwritten; instead,
it has be erased and rewritten subsequently

Erase operation is performed on a number of pages, called an
erase block, at once, and takes about 1 to 2 milliseconds

There is a limit to how many times a flash page can be erased,
typically around 100,000 to 1,000,000 times (wear leveling)

Flash memory systems reduce the impact of these problems using
a software layer called FTL (Flash Translation Layer)
File Organization

A database is stored as a collection of files, each file is organized
logically as a sequence of records

Each file is also logically partitioned into fixed-length storage units
called blocks, which are the units of both storage allocation and
data transfer

One approach:




Assume record size is fixed
Each file has records of one particular type only
Different files are used for different relations
This approach is easiest to implement; we will consider variable length
records later
Fixed-Length Records

Simple approach



Store record i starting from byte n  i, where n is the size of each
record
Record access is simple but records may cross blocks
Deletion of record i:
alternatives:



Move records i + 1, . . ., m
to i, . . . , m – 1
Move record m to i
Do not move records, but
link all free records on a
free list
Free Lists




Store the address of the first deleted record in the file header
Use this first record to store the address of the second deleted
record, and so on
Can think of these stored addresses as pointers
More space efficient representation: reuse space for normal
attributes of free records to store pointers
Variable-Length Records

Variable-length records arise in database systems in several ways:




Storage of multiple record types in a file
Record types that allow variable lengths for one or more fields
Record types that allow repeating fields (used in some older data
models)
Byte-string representation





Store each record as a string of consecutive bytes
Attach an end-of-record () control character to the end of each record
Not easy to reuse space occupied formerly by a deleted record
There is no space, in general, for records to grow longer
Thus, the basic byte-string representation is not usually used for
implementing variable-length records
Variable-Length Records:
Slotted Page Structure

Slotted page header contains:





Number of record entries
End of free space in the block
Location and size of each record
Records can be moved around within a page to keep them
contiguous with no empty space between them; entry in the header
must be updated
Pointers should not point directly to record — instead they should
point to the entry for the record in header
Organization of Records in Files

So far, we have studied how records are represented in a file
structure; Given a set of records, the next question is how to
organize them in a file




Heap – a record can be placed anywhere in the file where there is
space
Sequential – store records in sequential order, based on the value of
the “search key” of each record
Hashing – a hash function is computed on some attribute of each
record; the result specifies in which block of the file the record should
be placed
Records of each relation may be stored in a separate file.
In a multitable clustering file organization records of several
different relations can be stored in the same file

Motivation: store related records on the same block to minimize I/O
Sequential File Organization (1/2)




Suitable for applications that require sequential processing of the
entire file
The records in the file are ordered by a search-key
A search key is any attribute or set of attributes; it need not be the
primary key, or even a superkey
To permit fast retrieval of records in search-key order, we chain
together records by pointers
Sequential File Organization (2/2)


Deletion – use pointer chains
Insertion – locate the position where the record is to be inserted




If there is free space in the block, insert there
If no free space, insert the record in an overflow block
In either case, pointer chain must be updated
Need to reorganize the file
from time to time to restore
sequential order
Multitable Clustering File Organization
(1/2)

Simple file structure stores each relation in a separate file

Can instead store several relations in one file using a multitable
clustering file organization

E.g., multitable clustering organization of department and instructor:



Good for queries joining department and instructor relations
Bad for queries involving only department
Results in variable size records
Multitable Clustering File Organization
(2/2)
department
instructor
multitable clustering
of department and
instructor
Data Dictionary Storage (1/2)
Data dictionary (also called system catalog) stores metadata; that is
data about data, such as

Information about relations






User and accounting information, including passwords
Statistical and descriptive data


Number of tuples in each relation
Physical file organization information



Names of relations
Names and types of attributes of each relation
Names and definitions of views
Integrity constraints
How relation is stored (sequential/hash/…)
Physical location of relation
Information about indices (Chapter 11)
Data Dictionary Storage (2/2)
Download