CS 134: Operating Systems
File System Implementation
2013-05-17

Overview
- Implementation Issues: Caching; Failures; Disk Scheduling
- API: What Goes in an API?; Two Strange APIs; Consistency; Other File Operations

Disk Caching—Class Exercise
If the OS caches blocks of a file in memory,
- How should it track what it is caching?
- How should it decide what to cache?
(Note: there is a close relationship to paging algorithms.)

Disk Buffering
Allow writes to return immediately:
- Copy the data into a buffer
- Write it out to the disk
Buffers vs. Caches: which do we actually need?
(Note: we need some kind of buffer, since the write may not be the same size as a block. We don't need the cache; it just improves performance.)

Buffer Cache
Store disk blocks waiting to be written in a buffer cache:
- "Free" buffers are used as a cache for blocks recently read
- Dirty buffers will eventually be written to disk
- Allow the buffer cache to use any free memory in the machine
- Ensure that a certain number of buffers are always available
Class Exercises & Reminders: When should we write the dirty buffers? Per-process or system-wide? Remind you of anything?

Buffer Cache (cont.)
One buffer cache for all processes and all block devices:
- Local disks
- Remote disks
- Tapes, CD-ROMs, etc.
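To make the buffer-cache idea concrete, here is a minimal sketch in C. It is illustrative only and not the interface of any particular kernel; the names (struct buf, bcache_get, bcache_mark_dirty), the fixed pool size, and the timestamp-based LRU are all assumptions made for the example. Real buffer caches typically hash on (device, block number) and write dirty buffers back lazily.

```c
/* A minimal, illustrative buffer cache: a fixed pool of buffers indexed by
 * (device, block number), LRU replacement, lazy write-back of dirty buffers.
 * Names and sizes are invented for this sketch, not any real kernel's API. */
#include <stdint.h>

#define BLOCK_SIZE 4096
#define NBUF       64          /* buffers always available to the cache */

struct buf {
    int      dev, blockno;     /* which block this buffer holds  */
    int      valid, dirty;     /* contents loaded? modified?     */
    uint64_t last_used;        /* timestamp for LRU replacement  */
    char     data[BLOCK_SIZE];
};

static struct buf cache[NBUF];
static uint64_t   ticks;

/* Stand-ins for the disk driver; a real system would do I/O here. */
static void disk_read(int dev, int blockno, char *data)
{ (void)dev; (void)blockno; (void)data; }
static void disk_write(int dev, int blockno, const char *data)
{ (void)dev; (void)blockno; (void)data; }

/* Return a buffer holding (dev, blockno), reading it in on a miss. */
struct buf *bcache_get(int dev, int blockno)
{
    struct buf *victim = &cache[0];
    for (int i = 0; i < NBUF; i++) {
        if (cache[i].valid && cache[i].dev == dev && cache[i].blockno == blockno) {
            cache[i].last_used = ++ticks;      /* cache hit */
            return &cache[i];
        }
        if (cache[i].last_used < victim->last_used)
            victim = &cache[i];                /* least recently used so far */
    }
    if (victim->dirty)                         /* write back before reuse */
        disk_write(victim->dev, victim->blockno, victim->data);
    victim->dev = dev;
    victim->blockno = blockno;
    victim->valid = 1;
    victim->dirty = 0;
    victim->last_used = ++ticks;
    disk_read(dev, blockno, victim->data);     /* cache miss: fill the buffer */
    return victim;
}

/* Callers mark the buffer dirty after modifying it; write-back is lazy. */
void bcache_mark_dirty(struct buf *b) { b->dirty = 1; }
```

The lazy write-back is exactly what makes the class-exercise questions interesting: when to flush dirty buffers, and what happens to them if the system crashes first.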
Dealing with System Failure—Class Exercise
Extending a file by one block... In what order should we update the structures on disk to cause minimum damage if the system crashes? (Assume the disk will never leave a block half-written.)
What about:
- File creation?
- File deletion?

Recovering from System Failure
Before system failure:
- Backups!
After system failure:
- Run a consistency checker, which compares the data in the directory structure with the data blocks on disk and tries to fix inconsistencies.
- Recover lost files (or the whole disk) by restoring data from backup.

Disk Scheduling
The OS needs to use all I/O devices efficiently:
- Minimize access time, which is composed of seek time and rotational latency
- Maximize disk bandwidth: bandwidth = total bytes transferred / total time taken
- Usually, disk requests can be re-ordered
Several algorithms exist to schedule disk I/O requests.
- Consider, e.g., cylinders 98, 183, 37, 122, 14, 124, 65, 67 and an initial head position of 53.

First-Come First-Served (FCFS)
Handle the request queue in order... Total head movement of 640 cylinders—Yuck!

Shortest Seek Time First (SSTF)
Service the request with minimum seek time from the current head position. Total head movement of 236 cylinders.
(Note: SSTF may starve requests. Why?)

Scan (aka The Elevator Algorithm)
Move the head from one end of the disk to the other, servicing requests as we go. Total head movement of 208 cylinders.

Look
Like Scan, but only go as far as the least/greatest request... Total head movement of 180 cylinders.

Circular-SCAN (C-Scan)
Like Scan, but only service requests while moving in one direction... Total head movement of 382 cylinders.
(Note: Why do this? It wastes head movement.)

C-Look
The circular variant of Look (again, requests are only serviced in one direction). Total head movement of 322 cylinders.
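A small sketch of the FCFS and SSTF calculations on the example queue; the function names are invented for illustration. Running it reproduces the totals quoted above (640 and 236 cylinders). The Scan/Look family is not shown because those totals depend on the initial direction and on whether the head runs to the end of the disk, which the lecture's diagrams fix.

```c
/* Illustrative head-movement calculation for FCFS and SSTF on the example
 * request queue, starting from head position 53. Names are for this sketch. */
#include <stdio.h>
#include <stdlib.h>

#define NREQ 8   /* the example queue from the slides */

static int fcfs_movement(int head, const int req[NREQ])
{
    int total = 0;
    for (int i = 0; i < NREQ; i++) {       /* serve requests in arrival order */
        total += abs(req[i] - head);
        head = req[i];
    }
    return total;
}

static int sstf_movement(int head, const int req[NREQ])
{
    int done[NREQ] = {0}, total = 0;
    for (int served = 0; served < NREQ; served++) {
        int best = -1;
        for (int i = 0; i < NREQ; i++)     /* closest pending request wins */
            if (!done[i] && (best < 0 || abs(req[i] - head) < abs(req[best] - head)))
                best = i;
        total += abs(req[best] - head);
        head = req[best];
        done[best] = 1;
    }
    return total;
}

int main(void)
{
    const int req[NREQ] = {98, 183, 37, 122, 14, 124, 65, 67};
    printf("FCFS: %d cylinders\n", fcfs_movement(53, req));   /* prints 640 */
    printf("SSTF: %d cylinders\n", sstf_movement(53, req));   /* prints 236 */
    return 0;
}
```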
Fairness
What if we had two processes, each producing the following requests?
- 0, 5, 10, ..., 75
- 125, 130, 135, ..., 200
How fair would these techniques be?

Fairness—Look
[figure]

Fairness—FCFS
[figure]

Fairness—FScan / N-step-Scan
[figure]

Selecting a Disk-Scheduling Algorithm
Comparing the algorithms, we find that:
- FCFS can be okay if the disk controller manages the scheduling (many do these days)
- SSTF is common and has natural appeal
- LOOK works well (lowest amount of head movement in our test)
- LOOK and C-LOOK perform better for systems under heavy load

Modern Disk Geometry
Modern disks maximize utilization by varying the number of sectors per cylinder.
- Logical block → (sector, track/cylinder)?
- Complex or impossible for the operating system to calculate!
Sometimes tracks are even laid out in a zig-zag pattern.
Consequences?
(Notes: Disk scheduling becomes impossible to perform perfectly, but approximations work reasonably well. Or we can just hand scheduling off to the disk, since modern disks are smart. Careful placement of vital data across platters, however, may not work out.)
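To see why the logical-block-to-geometry question was once easy, here is the classic fixed-geometry conversion as a sketch. It assumes every track holds the same number of sectors (the head and sector counts below are traditional BIOS-style values chosen only for illustration), which is exactly the assumption that modern zoned recording breaks.

```c
/* Classic logical-block -> (cylinder, head, sector) conversion, valid only
 * for an idealized disk with a fixed number of sectors per track. On modern
 * zoned disks the OS cannot compute this, which is why scheduling is either
 * approximated or handed off to the drive itself. */
#include <stdio.h>

struct chs { int cylinder, head, sector; };

/* Idealized fixed geometry; illustrative values only. */
enum { HEADS = 16, SECTORS_PER_TRACK = 63 };

static struct chs lba_to_chs(long lba)
{
    struct chs g;
    g.cylinder = lba / (HEADS * SECTORS_PER_TRACK);
    g.head     = (lba / SECTORS_PER_TRACK) % HEADS;
    g.sector   = (lba % SECTORS_PER_TRACK) + 1;   /* sectors traditionally count from 1 */
    return g;
}

int main(void)
{
    struct chs g = lba_to_chs(123456);
    printf("LBA 123456 -> cylinder %d, head %d, sector %d\n",
           g.cylinder, g.head, g.sector);
    return 0;
}
```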
File-Access API—Class Exercise
What operations should we provide for accessing files?
Describe and develop:
- Basic requirements for a file-access API
- A stateful interface that satisfies those requirements
- A stateless interface that satisfies them
(Note: this is a lengthy exercise; students should divide into groups and work it out on paper.)

Stateful File Access
The OS maintains some "context" for each open file...
File access:
- handle = open(filename, mode)
- read(handle, length, buffer)
- write(handle, length, buffer)
- truncate(handle, length)
- seek(handle, position)
- close(handle)
File management:
- info(name, info)
- delete(name)
- change_directory(dirname)
- create_dir(dirname)
- move(name, name)

Stateless File Access
No (apparent) internal state: each system call fully describes the desired operation.
File access:
- create(filename)
- read(filename, pos, length, buffer)
- write(filename, pos, length, buffer)
- truncate(filename, length)
File management:
- info(name, info)
- delete(name)
- create_dir(dirname)
- move(name, name)
Class Exercise: contrast these two approaches...
(Notes: Completeness: each approach can simulate the other.
Stateless operation: simple; works well in a multi-threaded program; the basis for NFS.
Stateful operation: assumes locality and sequential access are common; gives the operating system more information about which files are in use; maps well to other kinds of device besides files (e.g., read/write to a terminal); may add arbitrary limitations (e.g., a maximum number of open files).)
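For a concrete contrast in a real API: POSIX is stateful (the kernel keeps a file offset for each open file), but pread carries the position in every call, much in the spirit of the stateless interface above. A brief sketch; the file name is only an example.

```c
/* Stateful vs. stateless-style access in POSIX. read() uses the offset the
 * kernel keeps for the open file; pread() passes the position explicitly on
 * every call, as a stateless interface would. File name is an example only. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char buf[16];
    int fd = open("example.txt", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    /* Stateful: each read advances the per-open-file offset. */
    ssize_t a = read(fd, buf, sizeof buf);          /* bytes 0..15  */
    ssize_t b = read(fd, buf, sizeof buf);          /* bytes 16..31 */

    /* Stateless in spirit: the caller supplies the position each time. */
    ssize_t c = pread(fd, buf, sizeof buf, 1024);   /* bytes 1024..1039 */

    printf("read: %zd then %zd bytes; pread at 1024: %zd bytes\n", a, b, c);
    close(fd);
    return 0;
}
```

This also illustrates the trade-off noted above: the stateful calls are simpler for sequential access, while the position-carrying call behaves well when many threads share one open file.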
Two Strange APIs: Michigan Terminal System (1960s)
- Files divided into variable-length "lines"
  - Even binary files are made up of lines
  - Each line is numbered (fixed-point, 6 fractional decimal digits)
  - Could read by line number or by "next line"
  - Could write by line number (inserting in the middle if appropriate) or (?) just append at the end
- No user access to devices
  - E.g., print by creating a file, then handing it to the OS
  - Slightly problematic when terminals were introduced...

Two Strange APIs: RSX-11M/VMS (1970s)
- "inodes" exposed to the application
- Directories identified by the OS but accessed with the file API
- OS responsible for allocating blocks upon request
- Complex read/write interface (asynchronous, fixed/variable records, devices treated separately)
- Much of the file system implemented in a library

Consistency Model
Additional complications:
- Multiple processes can access files—two (or more) processes could read and write the same file
- Asynchronous writes + errors = ?
- What if a file is moved/renamed/deleted while a process is using it?
What should the rules be?

Class Exercise
Develop and justify a consistency model for file operations. Then develop another one.

Unix Consistency Model
Unix uses the following rules:
- All file operations are globally atomic
- A file is only deleted when its link (name) count is zero—an open counts as a link
- A write in one process is globally visible immediately afterwards
- Writes are asynchronous—I/O errors may not be discovered until the file is closed
(Note: these rules do not map well onto a stateless I/O interface.)

NFSv2 Consistency Model
Classic NFS uses the following rules:
- All file operations are globally atomic and stateless
- move/rename/delete can disrupt accesses performed by other processes
- A write in one process is globally visible immediately afterwards
- Writes are synchronous—I/O errors are discovered immediately
(Note: these rules map well onto a stateless I/O interface, but are not entirely consistent with the POSIX file model.)

File Locking
What if we want to lock sections of a file?
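One answer on Unix-like systems is POSIX advisory record locking via fcntl, which locks a byte range rather than the whole file. A brief sketch; the file name and byte range are examples only, and "advisory" means the lock constrains only processes that also use the locking protocol.

```c
/* POSIX advisory record locking with fcntl: take an exclusive lock on bytes
 * 100..199 of a file, then release it. File name and range are examples. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("shared.dat", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    struct flock lk = {
        .l_type   = F_WRLCK,     /* exclusive (write) lock              */
        .l_whence = SEEK_SET,    /* offsets relative to start of file   */
        .l_start  = 100,         /* first byte of the locked region     */
        .l_len    = 100,         /* lock 100 bytes: 100..199            */
    };

    if (fcntl(fd, F_SETLKW, &lk) < 0) {   /* F_SETLKW blocks until granted */
        perror("fcntl(F_SETLKW)");
        return 1;
    }

    /* ... read and update bytes 100..199 here ... */

    lk.l_type = F_UNLCK;                  /* release the region */
    fcntl(fd, F_SETLK, &lk);
    close(fd);
    return 0;
}
```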
File Access—Leveraging the VM System
Use the virtual memory system to provide file access:
- "Map" the file into memory
- Page faults retrieve the file's data from disk
- Provide copy-on-write or write-back semantics
Add the following system calls:
- map_file(handle, size, mode, address)
- unmap_file(address)
(This mechanism doesn't allow extending or truncating the file. A POSIX mmap/munmap sketch appears after the last slide.)

More File Access
Besides calls for protection, there are other system calls we might want:
- Control owner, permissions, etc.
- Control file caching
- Find the (remaining) capacity of the disk
- Eject the disk
- Discover whether any reads/writes have failed
- ...

Generic Mechanisms
Two approaches:
- ioctl(handle, request, buffer)
- Pseudo-files and pseudo-filesystems
(Notes: ioctl is clumsy, hard to use, and prone to inconsistency. Pseudo-files aren't a good fit for adding one new feature, such as locking.)
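As a concrete example of the ioctl approach, here is a query for the dimensions of the controlling terminal on Linux/BSD. The TIOCGWINSZ request is specific to terminal devices, which illustrates the criticism above: every device family defines its own request codes.

```c
/* Example of the ioctl escape hatch: ask the terminal driver for the window
 * size (works on Linux and the BSDs when stdout is a terminal). */
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    struct winsize ws;
    if (ioctl(STDOUT_FILENO, TIOCGWINSZ, &ws) < 0) {
        perror("ioctl(TIOCGWINSZ)");    /* fails if stdout is not a terminal */
        return 1;
    }
    printf("terminal is %d rows x %d columns\n", ws.ws_row, ws.ws_col);
    return 0;
}
```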
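Finally, the map_file/unmap_file calls proposed under "File Access—Leveraging the VM System" are hypothetical; the POSIX counterparts are mmap and munmap. A sketch of read-only mapped access follows; the file name is only an example.

```c
/* Memory-mapped file access with POSIX mmap/munmap, the real-world analogue
 * of the hypothetical map_file/unmap_file calls. Page faults pull the file's
 * blocks in from disk on demand as the mapping is touched. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("example.txt", O_RDONLY);    /* file name is an example */
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    /* Map the whole file; reads through 'data' fault pages in from disk. */
    char *data = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }
    close(fd);                                 /* the mapping survives close */

    long newlines = 0;
    for (off_t i = 0; i < st.st_size; i++)     /* touches pages on demand */
        if (data[i] == '\n')
            newlines++;
    printf("%ld lines\n", newlines);

    munmap(data, st.st_size);                  /* the unmap_file analogue */
    return 0;
}
```

Note that, just as the slide says, this interface gives no way to extend or truncate the file; that still requires write/truncate-style calls (or ftruncate) on the underlying file.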