Lecture 28

Reminder: Homework 6 is due today; case study outlines with references are due today.
Homework 7 is posted, due on Monday at the Exam 2 review.
Exam 2 is next Wednesday.
Questions?

Wednesday, March 21 CS 470 Operating Systems - Lecture 28

Outline
  File system implementation
  Allocation methods
  Free space management
  Other issues
  Recovery

Allocation Methods
As with any resource, there are various ways to allocate disk space to files. Logically, the disk is a linear array of blocks.
Want a method that has
  effective disk utilization
  efficient access for a variety of operations: random vs. sequential; read vs. write vs. append
Common methods are contiguous, linked, and indexed.

Contiguous Allocation
Each file occupies contiguous blocks on the disk.
The directory entry is <fileID, startBlockAddr, length>.
What are the advantages of this organization?
What are the disadvantages of this organization?

Linked Allocation
A file is a linked list of disk blocks.
The directory entry is <fileID, startBlockAddr, endBlockAddr>.
Must read the start block to find the next block number.
What are the advantages of this organization?
What are the disadvantages of this organization?

Linked Allocation
To mitigate some of the disadvantages, can use a file allocation table (FAT, used in DOS/Windows).
The directory entry is <fileID, startBlock#>.
The FAT is indexed by block number and links the table entries rather than the blocks themselves.
Often the FAT is small enough to cache, so can find out which block a random access falls in without reading the disk.

Indexed Allocation
Each file has an index block that is an array of disk addresses of the blocks that make up the file.
What are the advantages of this organization?
What are the disadvantages of this organization?

Indexed Allocation
Several ways to handle large files:
  Link index blocks: lose a lot of the advantages.
  Multiple levels of indexing: most files are small, but can always have a very, very large file.
  UNIX UFS: most of the index block holds direct addresses, but the third-to-last entry is two-level (i.e., it points to another index block), the second-to-last entry is three-level, and the last entry is four-level.
  Good compromise: small files do not pay for multilevel indexing, but very, very large files are supported.

Indexed Allocation - UNIX UFS
(figure not reproduced)

Free Space Management
Allocating space starting with an empty disk is straightforward. Keeping track of disk space as files are deleted is more problematic.
The disk data structure is the free-space list, which initially consists of most of the disk.
When files are created, blocks are allocated off this list.
When a file is deleted, its disk blocks are put back on the free-space list.
The implementation of the free-space list must support whatever allocation method is being used.

Bit Map
When disks are small, the free-space list often is a bit map (or bit vector).
Each block is represented by a bit in the map. The bit is set to 1 if the block is free and to 0 if the block is allocated.
E.g., if blocks 2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, 27 are free and the rest are allocated, the bit map is:
  001111001111110001100000011100000...

Bit Map
Advantages:
  Relatively simple to implement. Most architectures have bit manipulation instructions.
  Efficient to find the first free block or n consecutive free blocks (for contiguous allocation).
  Look for the first word of the bit map that is not 0, then find the first set bit in that word.
  block # = (bits per word) * (# of 0-value words) + offset of first set bit

Bit Map
Disadvantages:
  To be efficient, need the bit map in memory.
    1.3GB disk with 512B blocks => >332KB bit map
    Can cluster blocks into groups, e.g., 4 blocks/bit => ~83KB bit map
    40GB disk with 1K blocks => >5MB bit map
  Not feasible for large disks.

Linked List
Can use a regular linked-list approach.
The head pointer is put in a special place on the disk and cached in memory.
Each free block contains a pointer to the next free block.
Disadvantages: traversing the list is not efficient, but this is not done often; allocating a large file is cumbersome.
Advantages: can be incorporated into the FAT data structure for no extra overhead.

Other Approaches
Grouping: can store the addresses of n (not necessarily contiguous) free blocks in the first free block. The nth entry points to the next block of addresses. Allows the system to find large numbers of free blocks much faster than the linked approach.
Counting: for contiguous or clustered allocation, freeing happens simultaneously to a contiguous set of blocks. Keep track of this with a free-space entry <diskaddr, count>.

Performance Enhancements
Page caching: files are cached as pages in the VM to streamline access.
Asynchronous writes: most writes are buffered and control is returned to the caller. The buffer is written out to disk at a later time. However, certain metadata must be written synchronously to maintain system integrity or to support atomic transactions.
Free-behind or read-ahead during sequential access.

Other Issues
Where are FCBs and index blocks allocated?
  Unix inodes are preallocated across a partition. The allocator tries to keep data blocks near their inode block.
What is the size of a file pointer (disk address)?
  16 bits => 64KB maximum
  32 bits => 4GB maximum
  64 bits => very, very large...
But as the file pointer gets larger, so do the data structures that hold it.
FAT16 -> FAT32...

Recovery
Sometimes computers crash, and we would like to make sure the file system is intact. Two issues.
Consistency checking: the in-memory cache generally is more up-to-date than the on-disk data, since updates are not necessarily written to disk immediately. What happens when the computer crashes?
Most systems have a consistency checker program (fsck on Unix, chkdsk on Windows) that is/can be run after a crash. These programs compare the directory structure with the data blocks and free-space information.

Recovery
Backup and restore: keep backup "copies" of data so that in the event of a failure the data can be restored.
Typical schedule, where n is 7 to 30:
  Day 0: full backup, copy all files
  Day 1: incremental backup, copy only those files that have changed since Day 0
  Day 2: incremental backup, copy only those files that have changed since Day 1
  :
  Day n-1: incremental backup...
  Repeat cycle
Sometimes weekly incrementals or "mirror" backups that can be "hot-swapped".
Main issue is storage requirements.
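
Sketch: choosing files for an incremental backup
The backup schedule above can be illustrated with a minimal sketch. It assumes each file carries a modification time and that the time of the previous backup is recorded; the function name files_to_back_up and the dictionary representation are hypothetical, chosen only for this example, and real backup tools track more than this.

```python
def files_to_back_up(mtimes, last_backup_time=None):
    """Return the sorted list of paths to copy in this backup.

    mtimes: dict mapping path -> modification time (any comparable number).
    last_backup_time: time of the previous backup, or None on Day 0.
    """
    if last_backup_time is None:
        # Day 0: full backup, copy all files.
        return sorted(mtimes)
    # Later days: copy only files that changed since the previous backup.
    return sorted(p for p, t in mtimes.items() if t > last_backup_time)


# Hypothetical files with made-up modification times.
example = {"notes.txt": 5, "hw7.tex": 12, "exam2.pdf": 20}
print(files_to_back_up(example))                       # full backup
print(files_to_back_up(example, last_backup_time=10))  # incremental backup
```

Restoring from such a schedule means replaying the full backup and then each incremental in order, which is one reason the cycle is restarted every n days.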
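
Sketch: random access through a cached FAT
Returning to the FAT slide under Linked Allocation: because the table links entries rather than disk blocks, the block holding a given byte offset can be found by following links in memory. The sketch below is only an illustration under assumptions: the FAT is a Python list indexed by block number, END_OF_FILE is a made-up end-of-chain marker, and the block numbers are invented.

```python
END_OF_FILE = -1  # assumed end-of-chain marker, not a real FAT value


def block_for_offset(fat, start_block, byte_offset, block_size=512):
    """Follow FAT links in memory to find the block holding byte_offset."""
    hops = byte_offset // block_size  # number of links to follow
    block = start_block
    for _ in range(hops):
        block = fat[block]  # table lookup only, no disk read
        if block == END_OF_FILE:
            raise ValueError("offset is past the end of the file")
    return block


# Hypothetical 16-block disk holding one file stored in blocks 4 -> 9 -> 2.
fat = [END_OF_FILE] * 16
fat[4] = 9
fat[9] = 2
print(block_for_offset(fat, start_block=4, byte_offset=1500))
```

The lookup still takes time proportional to the file position, but every step is a memory access rather than a disk read.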
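
Sketch: finding the first free block in a bit map
Returning to the Bit Map slides: the rule "block # = bits/word * (# 0-value words) + offset of first set bit" can be tried on the lecture's own example. This is a sketch under assumptions, not a real file system's layout: words are 8 bits (WORD_BITS), block 0 is the leftmost bit of word 0, and the function names are hypothetical.

```python
WORD_BITS = 8  # assumed word size; kept small so the example is easy to trace


def build_bitmap(free_blocks, total_blocks):
    """Build a bit map as a list of words: bit 1 = free, bit 0 = allocated.

    Assumed layout: block 0 is the most significant bit of word 0, so the
    words print in the same left-to-right order as the slide's bit string.
    """
    words = [0] * ((total_blocks + WORD_BITS - 1) // WORD_BITS)
    for block in free_blocks:
        word, offset = divmod(block, WORD_BITS)
        words[word] |= 1 << (WORD_BITS - 1 - offset)
    return words


def first_free_block(words):
    """Return the number of the first free block, or None if none is free."""
    for zero_words, word in enumerate(words):
        if word != 0:
            # Offset of the first set bit, counted from the left of the word.
            offset = WORD_BITS - word.bit_length()
            return WORD_BITS * zero_words + offset
    return None


# Free blocks listed on the Bit Map slide.
FREE = [2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, 27]
bitmap = build_bitmap(FREE, total_blocks=33)
bit_string = "".join(f"{w:0{WORD_BITS}b}" for w in bitmap)[:33]
print(bit_string)
print(first_free_block(bitmap))  # first free block in the slide's example is 2
```

Skipping whole zero-valued words is what makes the search cheap: only one word has to be examined bit by bit.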