Comp 335 – File Structures
Why File Structures?
Goal of the Class


To develop an understanding of the file I/O
process. Software must be able to interact
with files on secondary storage devices.
To learn the primary methods for structuring
and organizing files so that information can
be stored and retrieved QUICKLY and
EFFICIENTLY!
History of File Structures


Early in computing history, secondary storage was
in the form of magnetic tape and punched cards.
Storage was cheap, but access was strictly
sequential.
In 1956, IBM introduced the RAMAC magnetic disk
device. It could be leased for $620 a month and
could store approximately 5 MB of data. Data
could be accessed directly instead of
sequentially. Conserving the space on the disk
and getting to the data quickly became an area of
research. This was the dawn of the study of file
structures.
History of File Structures

Advances in operating systems gave rise to
more research in file structures. The
ability to multi-task processes was a new
concept. Disk drives were slow (and still
are) and became a big bottleneck in
software. It was important to come up with
ways to speed up the file I/O process.

How SLOW is a disk drive?

To a human, disk drives are extremely
fast. The outer edge of a platter in a drive
today can move at about 170 mph. An example
is the Cheetah 15K.7, a 15,000 RPM drive.
Converting RPM to MPH


Code to compute RPM to MPH
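The conversion can be sketched as follows. This is a minimal sketch, not the original course code: the function name `rpm_to_mph` and the 3.5-inch platter diameter are assumptions made for illustration. The result depends strongly on the assumed diameter, so it will not necessarily match the figure quoted above.

```python
import math

INCHES_PER_MILE = 63_360  # 5,280 feet per mile * 12 inches per foot


def rpm_to_mph(rpm: float, platter_diameter_in: float) -> float:
    """Linear speed of the platter's outer edge, in miles per hour."""
    circumference_in = math.pi * platter_diameter_in  # inches travelled per revolution
    inches_per_hour = circumference_in * rpm * 60     # revolutions/minute -> inches/hour
    return inches_per_hour / INCHES_PER_MILE


# Assumed values: a 15,000 RPM spindle and a 3.5-inch platter (hypothetical diameter).
edge_speed = rpm_to_mph(15_000, 3.5)
print(f"Outer edge speed: {edge_speed:.1f} mph")
```

A smaller platter at the same spindle speed gives a proportionally lower edge speed, since the distance covered per revolution shrinks with the diameter.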
However, compared to the rest of the
computer, disk drives are very slow,
especially next to the speed of the CPU and
main memory.
CPU operations: nanoseconds (one-billionth of a second)
Disk accesses: milliseconds (one-thousandth of a second)
Comparing CPU Speed to Disk Speed
Assume a RAM access is 120 ns
Assume a Disk access is 30 ms.
How many RAM accesses can be made in the time it
takes to do one disk access?
Answer: 250,000
Folk and Zoellick (File Structures: An Object-Oriented
Approach with C++) put this in human terms:
if a RAM access took 20 SECONDS, the DISK
access would be 250,000 times longer and take
about 58 DAYS!
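The arithmetic behind both numbers can be checked with a short script. This is only a sketch; the constant names are chosen here for illustration, and the timings are the assumed values from the slide, not measurements.

```python
RAM_ACCESS_NS = 120            # assumed RAM access time: 120 nanoseconds
DISK_ACCESS_NS = 30_000_000    # assumed disk access time: 30 ms expressed in nanoseconds
SECONDS_PER_DAY = 86_400

# How many RAM accesses fit in the time of one disk access?
ratio = DISK_ACCESS_NS // RAM_ACCESS_NS

# Human-scale analogy: if one RAM access took 20 seconds,
# one disk access would take 20 * ratio seconds.
scaled_disk_days = 20 * ratio / SECONDS_PER_DAY

print(f"RAM accesses per disk access: {ratio:,}")
print(f"Scaled disk access: {scaled_disk_days:.1f} days")
```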
History of File Structures


The development of high level
languages moved file processing from
a physical level (hardware dependent)
to a logical level (software dependent
with the OS handling the specifics of
the hardware).
Code became more portable but lost
some efficiency.
History of File Structures

Today, with shared data everywhere
(networks, the Internet, etc.), efficient file
structures are even more crucial.
People want data fast, and crowded
networks cannot afford to transmit
extra bits.
File Structure Technique Chronology




Sequential techniques - sequential search through a file
Indexing - consulting a separate index file to locate data in the data file
Tree structures

AVL tree - self-balancing, which gives faster access

B-tree - large branching factor at each node
Hashing - ideally a single access to the desired data, achieved by
converting a record key to the storage address of the data
Each technique improved on the access time needed to get to the
data. Access time is the most crucial element in file structure design.
A file structure is a combination of the representation of data in a file and
the operations for accessing that data.
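To make the contrast between the first and last techniques concrete, the sketch below compares a sequential search with a hashed lookup over fixed-length records. Everything in it is an assumption for illustration: the record layout, the slot count, the modulo hash function, and the use of an in-memory `io.BytesIO` in place of a disk file. It has no collision handling, and key 0 is reserved to mark an empty slot.

```python
import io
import struct

RECORD = struct.Struct("<I12s")  # hypothetical layout: 4-byte key + 12-byte name
NUM_SLOTS = 8                    # hypothetical number of record slots in the file


def slot_for(key: int) -> int:
    """Hash function: convert a record key to a slot number."""
    return key % NUM_SLOTS


def write_hashed(f, key: int, name: str) -> None:
    """Store the record at the address computed from its key."""
    f.seek(slot_for(key) * RECORD.size)
    f.write(RECORD.pack(key, name.encode()))


def find_sequential(f, key: int):
    """Scan slots from the start of the file; return (name, reads)."""
    f.seek(0)
    reads = 0
    while raw := f.read(RECORD.size):
        reads += 1
        stored_key, name = RECORD.unpack(raw)
        if stored_key == key:
            return name.rstrip(b"\0").decode(), reads
    return None, reads


def find_hashed(f, key: int):
    """Jump straight to the computed slot; return (name, reads)."""
    f.seek(slot_for(key) * RECORD.size)
    stored_key, name = RECORD.unpack(f.read(RECORD.size))
    if stored_key == key:
        return name.rstrip(b"\0").decode(), 1
    return None, 1


# A zero-filled "file" of empty slots, then three records with non-colliding keys.
data_file = io.BytesIO(bytes(NUM_SLOTS * RECORD.size))
for k, n in [(21, "Adams"), (14, "Baker"), (7, "Cruz")]:
    write_hashed(data_file, k, n)

print(find_sequential(data_file, 7))
print(find_hashed(data_file, 7))
```

Both lookups return the same record, but the sequential search reads every slot up to the match while the hashed lookup reads exactly one. A real hashed file would also need a strategy for two keys that map to the same slot.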
Understanding Disk Drives

Terms:









Seek
Rotational Latency
Read/Write Time
Platter
Read/Write head
Spindle speed
Cylinder
Track
Sector
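The first three terms combine into the time for one disk access: the seek moves the read/write head to the right track, the rotational latency is the wait for the right sector to rotate under the head, and the read/write time is the transfer itself. The sketch below adds them up. The function names and the sample timings are hypothetical; the only fixed rule used is that the average rotational latency is half of one revolution.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for the target sector: half of one revolution, in ms."""
    ms_per_revolution = 60_000 / rpm
    return 0.5 * ms_per_revolution


def access_time_ms(seek_ms: float, rpm: float, transfer_ms: float) -> float:
    """Total access time = seek + rotational latency + read/write time."""
    return seek_ms + avg_rotational_latency_ms(rpm) + transfer_ms


# Hypothetical drive: 4 ms average seek, 15,000 RPM spindle, 0.1 ms transfer.
total = access_time_ms(seek_ms=4.0, rpm=15_000, transfer_ms=0.1)
print(f"Average latency: {avg_rotational_latency_ms(15_000):.1f} ms")
print(f"Total access time: {total:.1f} ms")
```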
Review




RAMAC
Why are disk drives slow?
Purpose of file structures
File structure techniques