Advanced Database Systems
Lecture # 2
Department of Computer Science
The University of Lahore
April 13, 2015
ADBS: Storage
1
Disk Storage, Basic File Structures
Review of Previous Lecture
• Overview of Database Design Process
• Example Database Application (COMPANY)
• ER Model Concepts
• Entities and Attributes
• Entity Types, Value Sets, and Key Attributes
• Relationships and Relationship Types
• ER Diagrams - Notation
• ER Diagram for COMPANY Schema
Scope of Today's Lecture
• The Storage Hierarchy
• Disk Storage Devices
• Records
• Blocking
• Files of Records
• Unordered Files
• Ordered Files
Storage Medium
• The collection of data that makes up a computerized database must be stored physically on some computer storage medium.
• The DBMS software can then retrieve, update, and process this data as needed.
• Computer storage media form a storage hierarchy that includes two main categories:
  - Primary storage
  - Secondary storage
[Figure: The Storage Hierarchy]
Disk Storage Devices
• Preferred secondary storage device for high storage capacity and low cost.
• Data is stored as magnetized areas on magnetic disk surfaces.
• A disk pack contains several magnetic disks connected to a rotating spindle.
• Each disk surface is divided into concentric circular tracks. Track capacities typically vary from 4 to 50 Kbytes.
Disk Storage Devices
• Because a track usually contains a large amount of information, it is divided into smaller blocks or sectors.
• The division of a track into sectors is hard-coded on the disk surface and cannot be changed. In one type of sector organization, a sector is the portion of a track that subtends a fixed angle at the center.
• A track is also divided into blocks. The block size B is fixed for each system. Typical block sizes range from B = 512 bytes to B = 4096 bytes. Whole blocks are transferred between disk and main memory for processing.
[Figure: Disk Storage Devices]
Disk Storage Devices
• The division of a track into equal-sized disk blocks is set by the operating system during disk formatting (initialization of the disk).
• Disk blocks are separated by fixed-size interblock gaps, which include specially coded control information written during initialization.
• A disk controller, typically embedded in the disk drive, controls the disk drive and interfaces it to the computer system.
• The time required to position the read/write head on the desired track is called seek time.
• The time required for the desired block to rotate under the read/write head is called rotational delay.
[Figure: Disk Storage Devices]
[Figure: Hard Drive Internal]
[Figure: Seek/Rotation Time]
Example
• Block/sector size B = 512 bytes
• Interblock gap size G = 128 bytes
• Number of blocks per track = 20
• Number of tracks per surface = 400
• A disk pack consists of 15 double-sided disks.
What is the total capacity of a track?
Total track capacity = (block size + gap size) * number of blocks per track
Total track capacity = (512 + 128) * 20 = 12800 bytes
Example (continued)
How many cylinders are there?
Number of cylinders = number of tracks per surface
Number of cylinders = 400
Example (continued)
What is the total capacity of a cylinder?
Total capacity of a cylinder = total track capacity * number of surfaces in the disk pack
Total capacity of a cylinder = 12800 * 30 = 384000 bytes (15 double-sided disks give 30 surfaces)
Example (continued)
What is the total capacity of a disk pack?
Total capacity of a disk pack = number of surfaces * number of tracks per surface * total track capacity
Total capacity of a disk pack = 30 * 400 * 12800 = 153600000 bytes
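The capacity calculations in this example can be sketched in Python as follows. The constant and function names are illustrative and not part of the lecture; the only inputs are the values given on the slides, with 15 double-sided disks taken as 30 recording surfaces.

```python
# Disk parameters from the example (names are illustrative).
BLOCK_SIZE = 512           # B, in bytes
GAP_SIZE = 128             # G, interblock gap in bytes
BLOCKS_PER_TRACK = 20
TRACKS_PER_SURFACE = 400
DISKS = 15
SURFACES = DISKS * 2       # each disk is double-sided


def track_capacity():
    """Total track capacity in bytes, including interblock gaps."""
    return (BLOCK_SIZE + GAP_SIZE) * BLOCKS_PER_TRACK


def cylinder_count():
    """One cylinder per track position on a surface."""
    return TRACKS_PER_SURFACE


def cylinder_capacity():
    """A cylinder holds one track from every surface."""
    return track_capacity() * SURFACES


def pack_capacity():
    """The disk pack holds every cylinder."""
    return cylinder_capacity() * cylinder_count()


print(track_capacity())     # 12800
print(cylinder_count())     # 400
print(cylinder_capacity())  # 384000
print(pack_capacity())      # 153600000
```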
Example
• Average seek time = 30 msec
• Average rotational delay = 30 msec
• Block transfer time btt = 0.17 msec
How much time does it take, on average, to locate and transfer a single block, given its block address?
Average time to locate and transfer a single block = seek time + rotational delay + block transfer time
Average time to locate and transfer a single block = 30 + 30 + 0.17 = 60.17 msec
Formulas
• Rotation speed: rpm = 1000
• Rotational delay: rd = (60 * 1000) / (2 * rpm) msec
• Transfer rate: tr = track capacity in bytes / (60 * 1000 / rpm) bytes/msec
• Block transfer time: btt = B / tr msec
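As a rough illustration, the formulas can be written as small Python functions. The function names are hypothetical. The track capacity of 12800 bytes and block size of 512 bytes are borrowed from the earlier capacity example, which is an assumption about which disk these formulas describe; the 0.17 msec block transfer time in the access-time example is a given value and is not derived here.

```python
def rotational_delay_ms(rpm):
    """Average rotational delay: half of one revolution, in msec."""
    return (60 * 1000) / (2 * rpm)


def transfer_rate_bytes_per_ms(track_capacity_bytes, rpm):
    """Bytes passing under the head per msec: one track per revolution."""
    revolution_ms = (60 * 1000) / rpm
    return track_capacity_bytes / revolution_ms


def block_transfer_time_ms(block_size, track_capacity_bytes, rpm):
    """Time to transfer one block once the head is positioned."""
    return block_size / transfer_rate_bytes_per_ms(track_capacity_bytes, rpm)


def average_access_time_ms(seek_ms, rd_ms, btt_ms):
    """Seek time + rotational delay + block transfer time."""
    return seek_ms + rd_ms + btt_ms


rd = rotational_delay_ms(1000)
print(rd)  # 30.0

# Assumed inputs: B = 512 bytes and a 12800-byte track from the earlier example.
btt = block_transfer_time_ms(512, 12800, 1000)
print(round(btt, 2))

# The access-time example uses given values rather than derived ones.
print(round(average_access_time_ms(30, 30, 0.17), 2))  # 60.17
```

At 1000 rpm the formula gives a rotational delay of 30 msec, which matches the value used in the access-time example.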
Buffering of Blocks
Two schemes of concurrent execution:
• Interleaved
• Parallel
Two buffering schemes:
• Single buffering
• Double buffering
Records
• Records can be fixed length or variable length.
• Records contain fields, which have values of a particular type (e.g., amount, date, time, age).
• Fields themselves may be fixed length or variable length.
• Variable-length fields can be mixed into one record: separator characters or length fields are needed so that the record can be "parsed".
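The two parsing aids mentioned above, separator characters and length fields, can be sketched as follows. This is a minimal illustration rather than the storage format of any particular DBMS: the separator byte, the 2-byte length prefix, and all function names are assumptions made for the example.

```python
import struct

# Assumed separator: the ASCII "unit separator", taken not to occur in field values.
SEP = b"\x1f"


def encode_with_separators(fields):
    """Join variable-length fields with a separator character."""
    return SEP.join(f.encode("utf-8") for f in fields)


def decode_with_separators(data):
    """Split the record back into fields at each separator."""
    return [part.decode("utf-8") for part in data.split(SEP)]


def encode_with_lengths(fields):
    """Prefix each field with its length as a 2-byte big-endian integer."""
    out = bytearray()
    for f in fields:
        raw = f.encode("utf-8")
        out += struct.pack(">H", len(raw))
        out += raw
    return bytes(out)


def decode_with_lengths(data):
    """Read a length prefix, then exactly that many bytes, until the record ends."""
    fields, pos = [], 0
    while pos < len(data):
        (n,) = struct.unpack_from(">H", data, pos)
        pos += 2
        fields.append(data[pos:pos + n].decode("utf-8"))
        pos += n
    return fields


record = ["Smith", "Computer Science", "2015-04-13"]
print(decode_with_separators(encode_with_separators(record)))
print(decode_with_lengths(encode_with_lengths(record)))
```

Length prefixes allow any byte to appear inside a field value, whereas separators rely on a character that never occurs in the data.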
Blocking
• Blocking refers to storing a number of records in one block on the disk.
• The blocking factor (bfr) is the number of records per block:
  bfr = floor(B / R), where B is the block size and R is the record size in bytes
• There may be unused space in a block if an integral number of records does not fit exactly in one block:
  unused space per block = B - (bfr * R) bytes
• Spanned records are records that cross block boundaries: a record may be larger than a block, or may start in one block and continue in the next.
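The blocking formulas can be checked with a short Python sketch. The function names and the sample values (B = 512, R = 100, r = 1000) are hypothetical, and the spanned estimate ignores any per-block pointer overhead, which a real file organization would need.

```python
import math


def blocking_factor(B, R):
    """bfr = floor(B / R): whole records that fit in one block."""
    return B // R


def unused_bytes_per_block(B, R):
    """Space left over in each block under an unspanned organization."""
    return B - blocking_factor(B, R) * R


def blocks_needed_unspanned(r, B, R):
    """Blocks for r records when no record may cross a block boundary."""
    bfr = blocking_factor(B, R)
    if bfr == 0:
        raise ValueError("record larger than block: unspanned storage is impossible")
    return math.ceil(r / bfr)


def blocks_needed_spanned(r, B, R):
    """Blocks for r records packed back to back (pointer overhead ignored)."""
    return math.ceil(r * R / B)


print(blocking_factor(512, 100))                # 5
print(unused_bytes_per_block(512, 100))         # 12
print(blocks_needed_unspanned(1000, 512, 100))  # 200
print(blocks_needed_spanned(1000, 512, 100))    # 196
```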
Files of Records
• A file is a sequence of records, where each record is a collection of data values (or data items).
• A file descriptor (or file header) includes information that describes the file, such as the field names and their data types, and the addresses of the file blocks on disk.
• Records are stored on disk blocks. The blocking factor bfr for a file is the (average) number of file records stored in a disk block.
• A file can have fixed-length records or variable-length records.
Files of Records
• File records can be unspanned (no record can span two blocks) or spanned (a record can be stored in more than one block).
• The physical disk blocks that are allocated to hold the records of a file can be contiguous, linked, or indexed.
• In a file of fixed-length records, all records have the same format.
• Files of variable-length records require additional information to be stored in each record, such as separator characters and field types.
Unordered Files
• Also called a heap or a pile file.
• New records are inserted at the end of the file.
• To search for a record, a linear search through the file records is necessary. This requires reading and searching half the file blocks on average, and is hence quite expensive.
• Record insertion is quite efficient.
• Reading the records in order of a particular field requires sorting the file records.
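The behaviour of a heap file can be illustrated with a small in-memory model. This is only a sketch: the HeapFile class, its methods, and the (key, payload) record shape are hypothetical, and each list stands in for one disk block so that block reads can be counted.

```python
class HeapFile:
    """Toy unordered file: a list of blocks, each holding up to bfr records."""

    def __init__(self, bfr):
        self.bfr = bfr
        self.blocks = []

    def insert(self, record):
        """Append to the last block, starting a new block when it is full."""
        if not self.blocks or len(self.blocks[-1]) >= self.bfr:
            self.blocks.append([])
        self.blocks[-1].append(record)

    def search(self, key):
        """Linear search; returns (record or None, number of blocks read)."""
        for reads, block in enumerate(self.blocks, start=1):
            for record in block:
                if record[0] == key:
                    return record, reads
        return None, len(self.blocks)


heap = HeapFile(bfr=4)
for k in range(20):
    heap.insert((k, f"record-{k}"))

print(len(heap.blocks))   # 5
print(heap.search(0))     # ((0, 'record-0'), 1)
print(heap.search(19))    # ((19, 'record-19'), 5)
print(heap.search(99))    # (None, 5)
```

Insertion touches only the last block, while an unsuccessful search reads every block and a successful one reads about half of them on average.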
Ordered Files
• Also called a sequential file.
• File records are kept sorted by the values of an ordering field.
• Insertion is expensive: records must be inserted in the correct order.
• A binary search can be used to search for a record on its ordering field value. This requires reading and searching about log2(b) blocks on average, where b is the number of file blocks, an improvement over linear search.
• Reading the records in order of the ordering field is quite efficient.
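A binary search over the blocks of an ordered file can be sketched as below. The function names are hypothetical, each inner list stands in for one disk block of sorted ordering-field values, and the block count of 5000 is chosen only for illustration.

```python
import math


def build_blocks(keys, bfr):
    """Sort the keys and pack them into blocks of up to bfr keys each."""
    keys = sorted(keys)
    return [keys[i:i + bfr] for i in range(0, len(keys), bfr)]


def binary_search_blocks(blocks, key):
    """Binary search on block boundaries; returns (found, blocks_read)."""
    lo, hi = 0, len(blocks) - 1
    reads = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        block = blocks[mid]   # models reading one block from disk
        reads += 1
        if key < block[0]:
            hi = mid - 1      # key can only be in an earlier block
        elif key > block[-1]:
            lo = mid + 1      # key can only be in a later block
        else:
            return key in block, reads
    return False, reads


blocks = build_blocks(range(20000), bfr=4)   # 5000 blocks
found, reads = binary_search_blocks(blocks, 12345)
print(found, reads)
print(math.ceil(math.log2(len(blocks))))     # 13
```

With 5000 blocks, no search reads more than 13 blocks, compared with about 2500 on average for a linear search of the same file.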
[Figure: Ordered Files]
Average Access Times
• The following table shows the average time to access a specific record for a given type of file.
[Table: average access times by type of file]
Example
A file has r = 20,000 STUDENT records of fixed length. Each record has the following fields:
• NAME (30 bytes)
• SSN (9 bytes)
• ADDRESS (40 bytes)
• PHONE (9 bytes)
• BIRTHDATE (8 bytes)
• SEX (1 byte)
• MAJORDEPTCODE (4 bytes)
• MINORDEPTCODE (4 bytes)
• CLASSCODE (4 bytes, integer)
• DEGREEPROGRAM (3 bytes)
An additional byte is used as a deletion marker.
Block size B = 512 bytes
Calculate the record size R in bytes.
R = 30 + 9 + 40 + 9 + 8 + 1 + 4 + 4 + 4 + 3 + 1 = 113 bytes
Calculate the blocking factor bfr and the number of file blocks b, assuming an unspanned organization.
bfr = floor(B / R) = floor(512 / 113) = 4 records per block
b = ceiling(r / bfr) = ceiling(20000 / 4) = 5000 blocks
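The fixed-length STUDENT record can also be described with Python's struct module, which gives a way to double-check the arithmetic. This is only a sketch: the format string is one possible layout assumed for illustration (byte strings for the character fields, a 4-byte integer for CLASSCODE, one boolean byte for the deletion marker), and the sample field values are made up.

```python
import math
import struct

# "<" selects standard sizes with no alignment padding between fields.
# NAME, SSN, ADDRESS, PHONE, BIRTHDATE, SEX, MAJORDEPTCODE, MINORDEPTCODE,
# CLASSCODE (int), DEGREEPROGRAM, deletion marker
STUDENT_FORMAT = "<30s9s40s9s8s1s4s4si3s?"

B = 512      # block size in bytes
r = 20000    # number of records

R = struct.calcsize(STUDENT_FORMAT)   # record size in bytes
bfr = B // R                          # blocking factor, floor(B / R)
b = math.ceil(r / bfr)                # file blocks, unspanned organization

print(R, bfr, b)  # 113 4 5000

# Packing a made-up record: short strings are padded with zero bytes.
record = struct.pack(
    STUDENT_FORMAT,
    b"Example Student", b"123456789", b"1 Example Road", b"042000000",
    b"19950101", b"F", b"CS", b"MATH", 2, b"BSC", False,
)
print(len(record))  # 113
```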
Term Paper / Project Directions
• Implementing Hashing
• Other Hashing Techniques in Dynamic Hashing
• File Organization in OODBMS
• File Organization in Spatial DBMS
• File Organization in Multi-Media DBMS
Review of Lecture
• The Storage Hierarchy
• Disk Storage Devices
• Records
• Blocking
• Files of Records
• Unordered Files
• Ordered Files