Chapter 11: File System Implementation Operating System Concepts Essentials – 2nd Edition Silberschatz, Galvin and Gagne ©2013 Chapter 11: File System Implementation File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Efficiency and Performance Recovery Operating System Concepts Essentials – 2nd Edition 11.2 Silberschatz, Galvin and Gagne ©2013 Objectives Describe details of implementing local file systems and directory structures Describe implementation of remote file systems Discuss block allocation and free-block algorithms Operating System Concepts Essentials – 2nd Edition 11.3 Silberschatz, Galvin and Gagne ©2013 File-System Structure File structure Logical storage unit Collection of related information File system resides on secondary storage (disks) User interface to storage, maps logical to physical Efficient and convenient access to disk: allows data to be stored, located, and retrieved easily Disk provides in-place rewrite and random access I/O transfers performed in blocks of sectors usually 512 bytes Operating System Concepts Essentials – 2nd Edition 11.4 Silberschatz, Galvin and Gagne ©2013 File-System Structure File control block (FCB) – storage structure consisting of info about a file Device driver controls physical device File system organized into layers Operating System Concepts Essentials – 2nd Edition 11.5 Silberschatz, Galvin and Gagne ©2013 Layered File System Operating System Concepts Essentials – 2nd Edition 11.6 Silberschatz, Galvin and Gagne ©2013 File System Layers Device drivers manage I/O devices at I/O control layer Outputs low-level hardware specific commands to hardware controller E.g., “read drive1, cylinder 72, track 2, sector 10, into memory location 1060” Operating System Concepts Essentials – 2nd Edition 11.7 Silberschatz, Galvin and Gagne ©2013 File System Layers Basic file system translates command “retrieve block 123” to device driver Also manages memory buffers and caches (allocation, freeing, replacement) Buffers Caches hold data in transit hold frequently used data Operating System Concepts Essentials – 2nd Edition 11.8 Silberschatz, Galvin and Gagne ©2013 File System Layers File organization module understands files, logical address, and physical blocks Translates logical block # => physical block # Manages free space, disk allocation Operating System Concepts Essentials – 2nd Edition 11.9 Silberschatz, Galvin and Gagne ©2013 File System Layers Logical file system manages metadata info Translates file name into file number, file handle, location (by maintaining file control blocks) E.g., inodes in UNIX Directory management Protection Operating System Concepts Essentials – 2nd Edition 11.10 Silberschatz, Galvin and Gagne ©2013 File System Layers Many file systems, sometimes many within OS Each with its own format, e.g., CD-ROM Unix has UFS, FFS Windows Linux – is ISO 9660 has FAT, FAT32, NTFS has more than 40 types E.g., extended file system ext2 and ext3 New ones still arriving, e.g., ZFS, GoogleFS, Oracle ASM, FUSE Operating System Concepts Essentials – 2nd Edition 11.11 Silberschatz, Galvin and Gagne ©2013 Chapter 11: File System Implementation File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Efficiency and Performance Recovery Operating System Concepts Essentials – 2nd Edition 11.12 Silberschatz, Galvin and Gagne ©2013 File-System Implementation We have system calls at API level, but how do we implement their functions? On-disk and in-memory structures Directory structure organizes the files Names and inode numbers, master file table Operating System Concepts Essentials – 2nd Edition 11.13 Silberschatz, Galvin and Gagne ©2013 File-System Implementation (Cont.) File Control Block (FCB) contains info about file inode number, permissions, size, dates Operating System Concepts Essentials – 2nd Edition 11.14 Silberschatz, Galvin and Gagne ©2013 In-Memory File System Structures open a file: Operating System Concepts Essentials – 2nd Edition 11.15 Silberschatz, Galvin and Gagne ©2013 In-Memory File System Structures read a file: Operating System Concepts Essentials – 2nd Edition 11.16 Silberschatz, Galvin and Gagne ©2013 Partitions and Mounting Partition can be: “Cooked” - volume contains a file system Raw – sequence of blocks with no file system Boot block can point to: Boot volume – contains OS Boot loader – set of blocks with code that loads kernel from file system Boot management program for multi-OS booting E.g., GRUB Operating System Concepts Essentials – 2nd Edition 11.17 Silberschatz, Galvin and Gagne ©2013 Partitions and Mounting Root partition contains OS, other partitions can hold other OSs, other file systems, or be raw Mounted at boot time Other partitions can mount automatically or manually At mount time, file system consistency checked Is all metadata correct? If not, fix it, try again If yes, add to mount table, allow access Operating System Concepts Essentials – 2nd Edition 11.18 Silberschatz, Galvin and Gagne ©2013 Virtual File Systems Virtual File Systems (VFS) - object-oriented way of implementing file systems VFS allows same system call interface (the API) to be used on different types of file systems Separates generic file-system operations from implementation details Supports multiple file system types (including networked) Dispatches operation to appropriate file system implementation routines Operating System Concepts Essentials – 2nd Edition 11.19 Silberschatz, Galvin and Gagne ©2013 Virtual File Systems (Cont.) API is to VFS interface, rather than any specific type of file system Operating System Concepts Essentials – 2nd Edition 11.20 Silberschatz, Galvin and Gagne ©2013 Chapter 11: File System Implementation File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Efficiency and Performance Recovery Operating System Concepts Essentials – 2nd Edition 11.21 Silberschatz, Galvin and Gagne ©2013 Directory Implementation Linear list of file names with pointer to data blocks Simple to program Time-consuming to execute Linear search time Could order alphabetically via linked list or use B+ tree Operating System Concepts Essentials – 2nd Edition 11.22 Silberschatz, Galvin and Gagne ©2013 Directory Implementation Hash Table – linear list with hash data structure Decreases directory search time Collisions – situations where two file names hash to the same location Best if entries are fixed size, or use chainedoverflow method Operating System Concepts Essentials – 2nd Edition 11.23 Silberschatz, Galvin and Gagne ©2013 Chapter 11: File System Implementation File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Efficiency and Performance Recovery Operating System Concepts Essentials – 2nd Edition 11.24 Silberschatz, Galvin and Gagne ©2013 Allocation Methods - Contiguous Allocation method: how disk blocks are allocated for files Contiguous allocation – each file occupies set of contiguous blocks Best performance in most cases Simple – only starting location (block #) and length (number of blocks) are required Problems include: finding space for file, knowing file size, external fragmentation, need for compaction off-line (downtime) or on-line Operating System Concepts Essentials – 2nd Edition 11.25 Silberschatz, Galvin and Gagne ©2013 Contiguous Allocation Mapping from logical to physical Operating System Concepts Essentials – 2nd Edition 11.26 Silberschatz, Galvin and Gagne ©2013 Extent-Based Systems Some newer file systems (e.g., Veritas) use a modified contiguous allocation scheme Allocate disk blocks in extents Extent: contiguous blocks on disks Extents are allocated for file allocation File consists of one or more extents Operating System Concepts Essentials – 2nd Edition 11.27 Silberschatz, Galvin and Gagne ©2013 Allocation Methods - Linked Linked allocation – file is linked list of blocks File ends at NULL pointer No external fragmentation Each block contains pointer to next block No compaction, external fragmentation Free space management system called when new block needed Clustering blocks into groups can improve efficiency but increases internal fragmentation Reliability can be problem Locating block can take many I/Os and disk seeks Operating System Concepts Essentials – 2nd Edition 11.28 Silberschatz, Galvin and Gagne ©2013 Allocation Methods – Linked (Cont.) FAT (File Allocation Table) variation Beginning of volume has table, indexed by block number Much like linked list, but faster on disk and cacheable New block allocation simple Operating System Concepts Essentials – 2nd Edition 11.29 Silberschatz, Galvin and Gagne ©2013 Linked Allocation Each file is linked list of disk blocks: Blocks may be scattered anywhere on disk Operating System Concepts Essentials – 2nd Edition 11.30 Silberschatz, Galvin and Gagne ©2013 File-Allocation Table Operating System Concepts Essentials – 2nd Edition 11.31 Silberschatz, Galvin and Gagne ©2013 Allocation Methods - Indexed Indexed allocation- each file has index block(s) of pointers to data blocks Logical view index table Operating System Concepts Essentials – 2nd Edition 11.32 Silberschatz, Galvin and Gagne ©2013 Example of Indexed Allocation Operating System Concepts Essentials – 2nd Edition 11.33 Silberschatz, Galvin and Gagne ©2013 Indexed Allocation (Cont.) Need index table Random access Dynamic access without external fragmentation Index bock adds overhead Operating System Concepts Essentials – 2nd Edition 11.34 Silberschatz, Galvin and Gagne ©2013 Indexed Allocation – Mapping (Cont.) Linked scheme – link blocks of index table (no limit on size) Two-level index 4K blocks could store 1,024 four-byte pointers in outer index 1,048,567 data blocks File size up to 4GB Operating System Concepts Essentials – 2nd Edition 11.35 Silberschatz, Galvin and Gagne ©2013 Indexed Allocation – Mapping (Cont.) Operating System Concepts Essentials – 2nd Edition 11.36 Silberschatz, Galvin and Gagne ©2013 Combined Scheme: UNIX UFS 4K bytes per block, 32-bit addresses Operating System Concepts Essentials – 2nd Edition 11.37 Silberschatz, Galvin and Gagne ©2013 Performance Best method depends on file access type Contiguous great for sequential and random Linked good for sequential, not random Declare access type at creation Either contiguous or linked Indexed more complex Single block access could require 2 index block reads then data block read Clustering can help improve throughput, reduce CPU overhead Operating System Concepts Essentials – 2nd Edition 11.38 Silberschatz, Galvin and Gagne ©2013 Chapter 11: File System Implementation File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Efficiency and Performance Recovery Operating System Concepts Essentials – 2nd Edition 11.39 Silberschatz, Galvin and Gagne ©2013 Free-Space Management File system maintains free-space list to track available blocks/clusters (Use term “block” for simplicity) Bit vector or bit map (n blocks) 0 1 2 n-1 … bit[i] = Operating System Concepts Essentials – 2nd Edition 1 block[i] free 0 block[i] occupied 11.40 Silberschatz, Galvin and Gagne ©2013 Free-Space Management (Cont.) Bit map requires extra space, e.g., Block size = 4KB = 212 bytes Disk size = 240 bytes (1 terabyte) n = 240/212 = 228 bits (or 32MB) If use clusters of 4 blocks -> 8MB of memory Easy to get contiguous files Operating System Concepts Essentials – 2nd Edition 11.41 Silberschatz, Galvin and Gagne ©2013 Linked Free Space List on Disk • Linked list (free list) Cannot get contiguous space easily No wasted space No need to traverse entire list Operating System Concepts Essentials – 2nd Edition 11.42 Silberschatz, Galvin and Gagne ©2013 Free-Space Management (Cont.) Grouping Modify linked list to store address of next n-1 free blocks in first free block plus pointer to next block that contains free-blockpointers Operating System Concepts Essentials – 2nd Edition 11.43 Silberschatz, Galvin and Gagne ©2013 Free-Space Management (Cont.) Counting Keep address of first free block and count of following free blocks Free space list: entries containing addresses and counts Operating System Concepts Essentials – 2nd Edition 11.44 Silberschatz, Galvin and Gagne ©2013 Free-Space Management (Cont.) Space Maps (used in ZFS) Consider meta-data I/O on very large file systems Full data structures like bit maps couldn’t fit in memory Require thousands of I/Os Divides device space into metaslab units, manages metaslabs Given volume can contain hundreds of metaslabs Each metaslab has associated space map Uses counting algorithm Operating System Concepts Essentials – 2nd Edition 11.45 Silberschatz, Galvin and Gagne ©2013 Free-Space Management (Cont.) Space Maps (continued) Records to log file rather than file system Log of all block activity, in time order, in counting format Metaslab activity = load space map into memory in balanced-tree structure, indexed by offset Replay log into that structure Combine contiguous free blocks into single entry Write structure back to disk (eventually, at once) Operating System Concepts Essentials – 2nd Edition 11.46 Silberschatz, Galvin and Gagne ©2013 Chapter 11: File System Implementation File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Efficiency and Performance Recovery Operating System Concepts Essentials – 2nd Edition 11.47 Silberschatz, Galvin and Gagne ©2013 Efficiency and Performance Efficiency dependent on: Disk allocation and directory algorithms Types of data kept in file’s directory entry Pre-allocation or as-needed allocation of metadata structures Fixed-size or varying-size data structures Operating System Concepts Essentials – 2nd Edition 11.48 Silberschatz, Galvin and Gagne ©2013 Efficiency and Performance (Cont.) Performance Keep data and metadata close together (less seek) Buffer cache – separate section of main memory for frequently used blocks Synchronous writes sometimes requested by apps or needed by OS buffering / caching – writes must hit disk before acknowledgement No Asynchronous writes more common, buffer-able, faster Reads frequently slower than writes OS must block, wait for read from disk Operating System Concepts Essentials – 2nd Edition 11.49 Silberschatz, Galvin and Gagne ©2013 Page Cache Page cache caches pages rather than disk blocks using virtual memory techniques and addresses Memory-mapped I/O uses page cache Routine I/O through file system uses buffer (disk) cache Operating System Concepts Essentials – 2nd Edition 11.50 Silberschatz, Galvin and Gagne ©2013 I/O Without a Unified Buffer Cache Operating System Concepts Essentials – 2nd Edition 11.51 Silberschatz, Galvin and Gagne ©2013 Unified Buffer Cache Unified buffer cache uses same page cache to cache both memory-mapped pages and ordinary file system I/O Avoids double caching But which caches get priority, and what replacement algorithms to use? Operating System Concepts Essentials – 2nd Edition 11.52 Silberschatz, Galvin and Gagne ©2013 I/O Using a Unified Buffer Cache Operating System Concepts Essentials – 2nd Edition 11.53 Silberschatz, Galvin and Gagne ©2013 Chapter 11: File System Implementation File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Efficiency and Performance Recovery Operating System Concepts Essentials – 2nd Edition 11.54 Silberschatz, Galvin and Gagne ©2013 Recovery Consistency checking – compares data in directory structure with data blocks on disk, tries to fix inconsistencies Can be slow and sometimes fails Use system programs to back up data from disk to another storage device (magnetic tape, other magnetic disk, optical) Recover lost file or disk by restoring data from backup Operating System Concepts Essentials – 2nd Edition 11.55 Silberschatz, Galvin and Gagne ©2013 Log Structured File Systems Log structured (or journaling) file systems record each metadata update to file system as a transaction All transactions written to log Transaction considered committed once it is written to log (sequentially) Sometimes on a separate device or section of disk File System may not yet be updated Operating System Concepts Essentials – 2nd Edition 11.56 Silberschatz, Galvin and Gagne ©2013 Log Structured File Systems Transactions in log are asynchronously written to file system structures When file system structures modified, transaction removed from log If file system crashes, all remaining transactions in log must still be performed Faster recovery from crash, removes chance of inconsistency of metadata Operating System Concepts Essentials – 2nd Edition 11.57 Silberschatz, Galvin and Gagne ©2013