Comp 335 - File Structures
Why File Structures?

Goal of the Class
To develop an understanding of the file I/O process.
Software must be able to interact with files on secondary storage devices.
Learn the primary methods for structuring and organizing files so that information can be stored and retrieved QUICKLY and EFFICIENTLY!

History of File Structures
Early in computing history, secondary storage took the form of magnetic tape and punched cards.
Storage was cheap, but access was limited to sequential access.
In 1956, IBM introduced the RAMAC magnetic disk device.
It could be leased for $620 per month and could store approximately 5 MB of data.
Data could be accessed directly instead of sequentially.
Conserving space on the disk and getting to the data quickly became an area of research.
This was the dawn of the study of file structures.
Advances in operating systems gave rise to more research in operating systems.
The ability to multi-task processes was a new concept.
Disk drives were slow (and still are) and became a big bottleneck in software.
It was important to come up with ways to speed up the file I/O process.

How SLOW is a disk drive?
To a human, disk drives are extremely fast: the outer edge of a platter in today's drives can move at about 170 mph.
An example is the Cheetah 15K.7 (15,000 RPM).
Compared to the rest of the computer, however, disk drives are very slow, especially next to the speed of the CPU and main memory.
CPU operations are measured in nanoseconds (billionths of a second); disk accesses are measured in milliseconds (thousandths of a second).

Comparing CPU speed to Disk speed
Assume a RAM access is 120 ns.
Assume a disk access is 30 ms.
How many RAM accesses can be made in the time it takes to do one disk access?
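The two speed comparisons above come down to simple arithmetic, which can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the 3.75 inch platter diameter is an assumption (it does not come from the slides or from a drive data sheet). The 120 ns and 30 ms figures are the ones assumed above.

```python
import math

INCHES_PER_MILE = 63_360  # 5,280 feet * 12 inches


def rpm_to_mph(rpm: float, diameter_in: float) -> float:
    """Linear speed at the platter's outer edge, in miles per hour."""
    circumference_in = math.pi * diameter_in        # inches travelled per revolution
    inches_per_hour = circumference_in * rpm * 60   # revolutions/minute -> per hour
    return inches_per_hour / INCHES_PER_MILE


def ram_accesses_per_disk_access(ram_ns: int, disk_ms: int) -> int:
    """How many RAM accesses fit into the time of one disk access."""
    disk_ns = disk_ms * 1_000_000  # 1 ms = 1,000,000 ns
    return disk_ns // ram_ns


# ASSUMPTION: 3.75 in platter diameter, chosen only for illustration.
edge_speed = rpm_to_mph(15_000, 3.75)
ratio = ram_accesses_per_disk_access(ram_ns=120, disk_ms=30)

print(f"{edge_speed:.0f} mph at the platter edge")
print(f"{ratio:,} RAM accesses per disk access")
```

With a smaller assumed platter the edge speed drops in proportion, so the mph figure is best read as an order-of-magnitude illustration. The access-time ratio does not depend on that assumption.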
Answer: 250,000
Folk and Zoellick (File Structures: An Object-Oriented Approach with C++) put this in human terms: if a RAM access took 20 SECONDS, a DISK access would take 250,000 times longer, or about 58 DAYS!

History of File Structures
The development of high-level languages moved file processing from a physical level (hardware dependent) to a logical level (software dependent, with the OS handling the specifics of the hardware).
Code became more portable but lost some efficiency.
Today, with shared data everywhere (networks, the Internet, etc.), efficient file structures are even more crucial.
People want data fast.
Crowded networks do not need extra bits to transmit.

File Structure Technique Chronology
Sequential techniques - sequential search through a file
Indexing - traversing more than one file to access data
Tree structures
  AVL - self-balancing concept, faster access
  B-tree - large branching factor from a node
Hashing - only one access to reach the desired data, achieved by converting a record key to the storage address of the data
Each technique improved on the access time needed to get to the data. This is the most crucial element in file structure design.
A file structure is a combination of data representation on file and the operations for accessing this data.

Understanding Disk Drives
Terms:
  Seek
  Rotational latency
  Read/write time
  Platter
  Read/write head
  Spindle speed
  Cylinder
  Track
  Sector

Review
RAMAC
Why are disk drives slow?
Purpose of file structures
File structure techniques
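As a rough illustration of the hashing entry in the technique chronology, the sketch below converts a record key directly into a storage address and fetches the record with a single seek and read. Everything in it is an assumption made for this example: the slot size, the number of slots, the toy hash function, and the `key|data` record layout. An in-memory `io.BytesIO` buffer stands in for a file on disk, and collisions are not handled.

```python
import io

RECORD_SIZE = 32   # bytes per fixed-length slot (hypothetical)
NUM_SLOTS = 101    # number of slots in the file (hypothetical)


def slot_address(key: str) -> int:
    """Convert a record key to the byte offset of its slot."""
    h = 0
    for b in key.encode("utf-8"):
        h = (h * 31 + b) % NUM_SLOTS   # toy hash: fold the key's bytes together
    return h * RECORD_SIZE


def store(f, key: str, data: str) -> None:
    """Write the record into the slot its key hashes to (no collision handling)."""
    record = f"{key}|{data}".encode("utf-8")
    if len(record) > RECORD_SIZE:
        raise ValueError("record too long for one slot")
    f.seek(slot_address(key))
    f.write(record.ljust(RECORD_SIZE, b"\x00"))   # pad to the fixed slot size


def fetch(f, key: str):
    """Return the data stored for key, or None, using one seek and one read."""
    f.seek(slot_address(key))
    raw = f.read(RECORD_SIZE).rstrip(b"\x00")
    if not raw:
        return None                                # empty slot
    stored_key, _, data = raw.decode("utf-8").partition("|")
    return data if stored_key == key else None     # a different key owns this slot


# A zero-filled buffer stands in for a preallocated file on disk.
f = io.BytesIO(bytes(RECORD_SIZE * NUM_SLOTS))
store(f, "smith", "Comp 335")
print(fetch(f, "smith"))
```

A sequential search would read records one after another until the key is found; here the key alone determines where to look. A real hashed file also needs a strategy for keys that hash to the same slot, which this sketch leaves out.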