TDDB63: Concurrent programming and operating systems

Andrzej Bednarski, IDA, Linköpings universitet, 2005

Copyright Notice: The lecture notes are mainly based on Silberschatz’s, Galvin’s and Gagne’s book (“Operating System Concepts”, 7th ed., Wiley, 2005). No part of the lecture notes may be reproduced in any form, due to the copyrights reserved by Addison-Wesley. These lecture notes should only be used for internal teaching purposes at Linköping University.

TDDB63, A. Bednarski, IDA, Linköpings universitet. Silberschatz, Galvin and Gagne ©2005

Agenda
[SGG7] Chapter 10 and 11
• File-System Interface and Implementation
  + Explain the function of file systems
  + Describe interfaces to file systems
  + Discuss file system design and protection
  + Describe implementation of local and remote file systems
  + Discuss block allocation algorithms
• File-System Interface
  + File Concept
  + Access Methods
  + Directory Structure
  + File-System Mounting
  + File Sharing
  + Protection
• File-System Implementation
  + File-System Structure
  + File-System Implementation
  + Directory Implementation
  + Allocation Methods
  + Free-Space Management
  + Efficiency and Performance
  + Recovery
  + Log-Structured File Systems

File Concept
• Contiguous logical address space
• Types:
  + Data
    - numeric
    - character
    - binary
  + Program

File Structure
• None – sequence of words, bytes
• Simple record structure
  + Lines
  + Fixed length
  + Variable length
• Complex structures
  + Formatted document
  + Relocatable load file
• Can simulate last two with first method by inserting appropriate control characters.
• Who decides:
  + Operating system
  + Program

File Attributes
• Name – only information kept in human-readable form.
• Type – needed for systems that support different types.
• Location – pointer to file location on device.
• Size – current file size.
• Protection – controls who can do reading, writing, executing.
• Time, date, and user identification – data for protection, security, and usage monitoring.
• Information about files is kept in the directory structure, which is maintained on the disk.

File Operations
• create
• write
• read
• reposition within file – file seek
• delete
• truncate
• open(Fi) – search the directory structure on disk for entry Fi, and move the content of the entry to memory.
• close(Fi) – move the content of entry Fi in memory to the directory structure on disk.

Open Files
• Several pieces of data are needed to manage open files:
  + File pointer: pointer to last read/write location, per process that has the file open
  + File-open count: counter of the number of times a file is open, to allow removal of data from the open-file table when the last process closes it
  + Disk location of the file: cache of data access information
  + Access rights: per-process access mode information

File Types – Name, Extension
(Table of file types by name and extension; contents not recoverable from the extracted text.)

Access Methods
• Sequential Access
  + read next
  + write next
  + reset
  + no read after last write (rewrite)
• Direct Access
  + read n
  + write n
  + position to n
    - read next
    - write next
  + rewrite n
  n = relative block number

Directory Structure
• A collection of nodes containing information about all files.
  (Figure: a directory whose nodes point to files F1, F2, F3, F4, …, Fn.)
• Both the directory structure and the files reside on disk.
• Backups of these two structures are kept on tapes.
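The sequential and direct access operations listed under Access Methods can be sketched on a toy in-memory file. This is only an illustration: the class name BlockFile and its method names are hypothetical, blocks are plain Python values, and n is the relative block number as on the slide.

```python
class BlockFile:
    """Toy in-memory file made of logical blocks (hypothetical illustration)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)  # contents, indexed by relative block number
        self.cp = 0                 # current position (the file pointer)

    # Sequential access: operations are relative to the current position.
    def reset(self):
        self.cp = 0

    def read_next(self):
        data = self.blocks[self.cp]  # raises IndexError past the end of the file
        self.cp += 1
        return data

    def write_next(self, data):
        if self.cp == len(self.blocks):
            self.blocks.append(data)     # writing at the end extends the file
        else:
            self.blocks[self.cp] = data  # otherwise overwrite in place
        self.cp += 1

    # Direct access: "read n" is "position to n" followed by "read next".
    def position_to(self, n):
        if not 0 <= n <= len(self.blocks):
            raise IndexError("relative block number out of range")
        self.cp = n

    def read(self, n):
        self.position_to(n)
        return self.read_next()

    def write(self, n, data):
        self.position_to(n)
        self.write_next(data)
```

Implementing read n as position to n plus read next mirrors the slide, which lists exactly that decomposition for direct access.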

Information in a Device Directory
• Name
• Type
• Address
• Current length
• Maximum length
• Date last accessed (for archival)
• Date last updated (for dump)
• Owner ID (who pays)
• Protection information (discuss later)

Operations Performed on Directory
• Search for a file
• Create a file
• Delete a file
• List a directory
• Rename a file
• Traverse the file system

Organize the Directory (Logically)
• Efficiency – locating a file quickly
• Naming – convenient to users
  + Two users can have same name for different files
  + The same file can have several different names
• Grouping – logical grouping of files by properties (e.g., all Java programs, all games, …)

Single-Level Directory
• A single directory for all users
• Naming problem
• Grouping problem

Two-Level Directory
• Separate directory for each user
• Path name
• Can have the same file name for different users
• Efficient searching
• No grouping capability

Tree-Structured Directories
(Figure only.)

Tree-Structured Directories (Cont.)
• Efficient searching
• Grouping capability
• Current directory (working directory)
  + cd /spell/mail/prog
  + type list

Tree-Structured Directories (Cont.)
• Absolute or relative path name
• Creating a new file is done in current directory.
• Delete a file
  rm <file-name>
• Creating a new subdirectory is done in current directory.
  mkdir <dir-name>
  Example: if in current directory /spell/mail
  mkdir count
  (Figure: mail now contains prog, copy, prt, exp and count.)
• Deleting “mail” ⇒ deleting the entire subtree rooted by “mail”.

Acyclic-Graph Directories
• Have shared subdirectories and files

Acyclic-Graph Directories (Cont.)
• Two different names (aliasing)
• If dict deletes list ⇒ dangling pointer
  Solutions:
  + Backpointers, so we can delete all pointers
  + Backpointers using a daisy chain organization
  + Entry-hold-count solution
• New directory entry type
  + Link – another name (pointer) to an existing file
  + Resolve the link – follow pointer to locate the file

General Graph Directory
(Figure only.)

General Graph Directory (Cont.)
• How do we guarantee no cycles?
  + Allow only links to files, not subdirectories
  + Garbage collection
  + Every time a new link is added, use a cycle detection algorithm to determine whether it is OK
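The last option above, checking for a cycle each time a link is added, can be sketched as a reachability test. This is a minimal sketch under stated assumptions: the directory graph is a plain dict mapping each directory name to the set of entries it links to, and the function names are hypothetical. Adding a link parent → target closes a cycle exactly when parent is already reachable from target.

```python
def creates_cycle(children, parent, target):
    """Return True if adding the link parent -> target would close a cycle."""
    stack, seen = [target], set()
    while stack:
        node = stack.pop()
        if node == parent:          # parent reachable from target: cycle
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, ()))
    return False


def add_link(children, parent, target):
    """Add the link only if the directory graph stays acyclic."""
    if creates_cycle(children, parent, target):
        raise ValueError(f"link {parent} -> {target} would create a cycle")
    children.setdefault(parent, set()).add(target)
```

Sharing a subdirectory between two parents is accepted, since that keeps the graph acyclic; only a link that leads back to one of its own ancestors is rejected.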

File System Mounting
• A file system must be mounted before it can be used
• Mounting
  + Provide device name to OS
  + Provide mount point
  + OS verifies validity of file system (device directory)
  + OS updates its directory structure

File Sharing
• Sharing of files on multi-user systems is desirable
  + (Version-control software: CVS, SVN, CADESE, …)
• Sharing may be done through a protection scheme
  + User IDs identify users, allowing permissions and protections to be per-user
  + Group IDs allow users to be in groups, permitting group access rights
• On distributed systems, files may be shared across a network

File Sharing – Remote File Systems
• Uses networking to allow file system access between systems
  + Manually via programs like FTP
  + Automatically, seamlessly using distributed file systems
  + Semi-automatically via the World Wide Web
• Client-server model allows clients to mount remote file systems from servers
  + Server can serve multiple clients
  + Client and user-on-client identification is insecure or complicated
  + NFS is standard UNIX client-server file sharing protocol
  + CIFS is standard Windows protocol
  + Standard operating system file calls are translated into remote calls
• Distributed Information Systems (distributed naming services) such as LDAP, DNS, NIS, Active Directory implement unified access to information needed for remote computing
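Returning to the mounting steps listed under File System Mounting, the bookkeeping can be sketched as a small mount table. Everything here is an assumption made for illustration: the class MountTable, the device names, the example paths, and the validity check, which is reduced to membership in a set of known devices. Path resolution picks the longest mount point that is a prefix of the path.

```python
class MountTable:
    """Hypothetical mount table: maps mount points to device names."""

    def __init__(self):
        self.mounts = {}

    def mount(self, device, mount_point, valid_devices):
        # Stand-in for "OS verifies validity of file system".
        if device not in valid_devices:
            raise ValueError(f"{device} does not hold a valid file system")
        # Stand-in for "OS updates its directory structure".
        self.mounts[mount_point.rstrip("/") or "/"] = device

    def resolve(self, path):
        """Return (device, path relative to that device's root)."""
        best = None
        for point in self.mounts:
            prefix = point if point == "/" else point + "/"
            if path == point or path.startswith(prefix):
                if best is None or len(point) > len(best):
                    best = point
        if best is None:
            raise LookupError(f"no file system mounted for {path}")
        return self.mounts[best], path[len(best):].lstrip("/")
```

For example, after mounting a hypothetical disk0 at / and disk1 at /users, the path /users/bill/notes resolves to disk1, while /etc/passwd still resolves to disk0.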

File Sharing – Consistency Semantics
• Consistency semantics specify how multiple users are to access a shared file simultaneously
  + Similar to process synchronization algorithms
  + Unix file system (UFS) implements:
    - Writes to an open file visible immediately to other users of the same open file
    - Sharing file pointer to allow multiple users to read and write concurrently
  + AFS has session semantics
    - Writes only visible to sessions starting after the file is closed

Protection
• File owner/creator should be able to control:
  + what can be done
  + by whom
• Types of access
  + Read
  + Write
  + Execute
  + Append
  + Delete
  + List

Access Lists and Groups
• Mode of access: read, write, execute
• Three classes of users
                         RWX
  a) owner access    7 ⇒ 111
  b) group access    6 ⇒ 110
  c) public access   1 ⇒ 001
• Ask manager to create a group (unique name), say G, and add some users to the group.
• For a particular file (say game) or subdirectory, define an appropriate access:
  chmod 761 game    (7 = owner, 6 = group, 1 = public)
• Attach a group to a file
  $ chgrp G game

File-System Implementation
• File-System Structure
• File-System Implementation
• Directory Implementation
• Allocation Methods
• Free-Space Management
• Efficiency and Performance
• Recovery
• Log-Structured File Systems

File-System Structure
• File structure
  + Logical storage unit
  + Collection of related information
• File system resides on secondary storage (disks)
• File system organized into layers
• File control block – storage structure consisting of information about a file
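Returning to the access-mode example from Access Lists and Groups (chmod 761 game), the three octal digits can be decoded mechanically. The helper names below are hypothetical and the sketch only interprets the nine mode bits; it does not touch a real file.

```python
def triplet(bits):
    """Render one octal digit (0-7) as an rwx string."""
    return "".join(ch if bits & mask else "-"
                   for ch, mask in (("r", 4), ("w", 2), ("x", 1)))


def describe(mode):
    """Split a mode such as 0o761 into owner, group and public access."""
    return {
        "owner": triplet((mode >> 6) & 7),
        "group": triplet((mode >> 3) & 7),
        "public": triplet(mode & 7),
    }


def may_access(mode, user_class, access):
    """Check one access type ('r', 'w' or 'x') for one class of user."""
    shift = {"owner": 6, "group": 3, "public": 0}[user_class]
    bit = {"r": 4, "w": 2, "x": 1}[access]
    return bool((mode >> shift) & bit)
```

With mode 0o761 this gives rwx for the owner, rw- for the group and --x for the public, matching the 111, 110 and 001 bit patterns on the slide.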

In-Memory File System Structures
(Figure only.)

Directory Implementation
• Linear list of file names with pointer to the data blocks.
  + simple to program
  + time-consuming to execute
• Hash Table – linear list with hash data structure.
  + decreases directory search time
  + collisions – situations where two file names hash to the same location
  + fixed size

Allocation Methods
• An allocation method refers to how disk blocks are allocated for files:
  + Contiguous allocation
  + Linked allocation
  + Indexed allocation

Contiguous Allocation
• Each file occupies a set of contiguous blocks on the disk.
• Simple – only starting location (block #) and length (number of blocks) are required.
• Random access.
• Wasteful of space (dynamic storage-allocation problem).
• Files cannot grow.
• Mapping from logical to physical: dividing the logical address LA by 512 gives quotient Q and remainder R,
  $Q = \lfloor LA / 512 \rfloor$, $R = LA \bmod 512$
  + Block to be accessed = Q + starting address
  + Displacement into block = R

Contiguous Allocation of Disk Space
(Figure only.)

Linked Allocation
• Each file is a linked list of disk blocks: blocks may be scattered anywhere on the disk.
  Block = [ Pointer | Data ]

Linked Allocation (Cont.)
• Simple – need only starting address
• Free-space management system – no waste of space
• No random access
• File-allocation table (FAT) – disk-space allocation used by MS-DOS and OS/2.
• Mapping:
  Dividing the logical address LA by 511 gives quotient Q and remainder R (one word of each 512-word block holds the pointer),
  $Q = \lfloor LA / 511 \rfloor$, $R = LA \bmod 511$
  + Block to be accessed is the Qth block in the linked chain of blocks representing the file.
  + Displacement into block = R + 1

Indexed Allocation
• Brings all pointers together into the index block.
• Logical view.
  (Figure: index table.)

Example of Indexed Allocation
(Figure only.)

Indexed Allocation (Cont.)
• Need index table
• Random access
• Dynamic access without external fragmentation, but have overhead of index block.
• Mapping from logical to physical in a file of maximum size of 256K words and block size of 512 words. We need only 1 block for index table.
  $Q = \lfloor LA / 512 \rfloor$, $R = LA \bmod 512$
  + Q = displacement into index table
  + R = displacement into block

Indexed Allocation – Mapping (Cont.)
• Mapping from logical to physical in a file of unbounded length (block size of 512 words).
• Linked scheme – link blocks of index table (no limit on size).
  $Q_1 = \lfloor LA / (512 \times 511) \rfloor$, $R_1 = LA \bmod (512 \times 511)$
  + Q1 = block of index table
  + R1 is used as follows:
    $Q_2 = \lfloor R_1 / 512 \rfloor$, $R_2 = R_1 \bmod 512$
    - Q2 = displacement into block of index table
    - R2 = displacement into block of file

Indexed Allocation – Mapping (Cont.)
• Two-level index (maximum file size is $512^3$)
  $Q_1 = \lfloor LA / (512 \times 512) \rfloor$, $R_1 = LA \bmod (512 \times 512)$
  + Q1 = displacement into outer-index
  + R1 is used as follows:
    $Q_2 = \lfloor R_1 / 512 \rfloor$, $R_2 = R_1 \bmod 512$
    - Q2 = displacement into block of index table
    - R2 = displacement into block of file
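The logical-to-physical mappings for contiguous, linked and indexed allocation all reduce to integer division with remainder. The sketch below restates them with Python's divmod; the function names are hypothetical, and the block size of 512 words is the one assumed on the slides.

```python
BLOCK = 512  # words per block, as on the slides


def contiguous(la, start):
    """Contiguous allocation: (physical block, displacement)."""
    q, r = divmod(la, BLOCK)
    return start + q, r


def linked(la):
    """Linked allocation: (position in chain, displacement).

    One word per block is taken by the pointer, so only 511 words hold
    data and the displacement skips the pointer word (R + 1).
    """
    q, r = divmod(la, BLOCK - 1)
    return q, r + 1


def indexed(la):
    """Single index block: (entry in index table, displacement)."""
    return divmod(la, BLOCK)


def linked_index(la):
    """Linked index blocks: (index block, entry in it, displacement).

    Each index block holds 511 entries plus a link to the next index block.
    """
    q1, r1 = divmod(la, BLOCK * (BLOCK - 1))
    q2, r2 = divmod(r1, BLOCK)
    return q1, q2, r2


def two_level(la):
    """Two-level index: (outer-index entry, inner-index entry, displacement)."""
    q1, r1 = divmod(la, BLOCK * BLOCK)
    q2, r2 = divmod(r1, BLOCK)
    return q1, q2, r2
```

For instance, logical address 1030 in a contiguous file starting at block 100 maps to block 102 with displacement 6, while in a linked file it falls in block 2 of the chain at displacement 9.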

Combined Scheme: UNIX (4K bytes per block)
(Figure only.)

Free-Space Management
• Bit vector (n blocks), bits numbered 0, 1, 2, …, n-1
  bit[i] = 0 ⇒ block[i] free
  bit[i] = 1 ⇒ block[i] occupied
• Block number calculation (first free block):
  (number of bits per word) × (number of leading words with all bits 1) + offset of first 0 bit

Free-Space Management (Cont.)
• Bit map requires extra space
  + Example:
    block size = $2^{12}$ bytes
    disk size = $2^{30}$ bytes (1 gigabyte)
    n = $2^{30} / 2^{12} = 2^{18}$ bits (or 32K bytes)
• Easy to get contiguous files
• Linked list (free list)
  + Cannot get contiguous space easily
  + No waste of space
• Grouping
• Counting

Free-Space Management (Cont.)
• Need to protect:
  + Pointer to free list
  + Bit map
    - Must be kept on disk
    - Copy in memory and disk may differ
    - Cannot allow for block[i] to have a situation where bit[i] = 1 in memory and bit[i] = 0 on disk
  + Solution:
    - Set bit[i] = 1 on disk
    - Allocate block[i]
    - Set bit[i] = 1 in memory

Efficiency and Performance
• Efficiency dependent on:
  + Disk allocation and directory algorithms
  + Types of data kept in file’s directory entry
• Performance
  + Disk cache – separate section of main memory for frequently used blocks
  + Free-behind and read-ahead techniques to optimize sequential access
  + Improve PC performance by dedicating section of memory as virtual disk, or RAM disk.
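Returning to the bit vector from Free-Space Management, the search for the first free block can be sketched as follows. Assumptions made for illustration: the slide's convention that 0 means free and 1 means occupied, words stored as Python integers, bit 0 of each word (the least significant bit) standing for the lowest-numbered block in that word, and hypothetical function names. The second helper reproduces the bit-map size example.

```python
def first_free_block(words, bits_per_word=32):
    """Return the number of the first free block, or None if the disk is full."""
    full = (1 << bits_per_word) - 1      # a word whose blocks are all occupied
    for index, word in enumerate(words):
        if word != full:                 # this word contains at least one 0 bit
            offset = 0
            while (word >> offset) & 1:  # skip occupied blocks inside the word
                offset += 1
            return index * bits_per_word + offset
    return None


def bitmap_bytes(disk_bytes, block_bytes):
    """Size of the bit map in bytes: one bit per disk block."""
    return (disk_bytes // block_bytes) // 8
```

Skipping whole words that are fully occupied is what makes the bit vector cheap to scan: only the first word with a free bit is inspected bit by bit.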

Page Cache
• A page cache caches pages rather than disk blocks using virtual memory techniques
• Memory-mapped I/O uses a page cache
• Routine I/O through the file system uses the buffer (disk) cache
• A unified buffer cache uses the same page cache to cache both memory-mapped pages and ordinary file system I/O
(Figures: I/O without a unified buffer cache; I/O using a unified buffer cache.)

Redundant Arrays of Independent Disks (RAID)
• RAID – multiple disk drives provide reliability via redundancy.
• RAID is arranged into six different levels.
• Several improvements in disk-use techniques involve the use of multiple disks working cooperatively.
• Disk striping uses a group of disks as one storage unit.
• RAID schemes improve performance and improve reliability of the storage system by storing redundant data.
  + Mirroring or shadowing keeps duplicate of each disk.
  + Block interleaved parity uses much less redundancy.
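Why block-interleaved parity needs much less redundancy than mirroring can be seen in a small sketch: one parity block, computed as the bytewise XOR of the data blocks, is enough to rebuild any single lost block. The function name and the sample data are hypothetical, and all blocks are assumed to have the same length.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks (assumed to have equal length)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks on three disks, plus one parity block on a fourth disk.
data = [b"abcd", b"efgh", b"ijkl"]
parity = xor_blocks(data)

# If the disk holding data[1] fails, XOR-ing the surviving blocks with the
# parity block reproduces the lost contents.
rebuilt = xor_blocks([data[0], data[2], parity])
```

Here three data disks are protected by one extra disk, whereas mirroring the same data would need three extra disks.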

Recovery
• Consistency checking – compares data in directory structure with data blocks on disk, and tries to fix inconsistencies
• Use system programs to back up data from disk to another storage device (floppy disk, magnetic tape, magnetic disk, optical)
• Recover lost file or disk by restoring data from backup

Log Structured File Systems
• Log structured (or journaling) file systems record each update to the file system as a transaction
• All transactions are written to a log
  + A transaction is committed once it is written to the log
  + However, the file system may not yet be updated
• Transactions in the log are asynchronously written to the file system
  + When the file system is modified, the transaction is removed from the log
• If the file system crashes, all remaining transactions in the log must still be performed

Recommended Reading and Exercises
• Reading:
  + [SGG7] Chapter 10 and 11
  + Chapter 11 and 12 (sixth edition)
• Exercises:
  + All
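As a closing illustration of the journaling idea from Log Structured File Systems, the commit, apply and recover steps can be sketched with an in-memory model. This is a toy under stated assumptions: the class JournaledStore is hypothetical, a transaction is a dict of updates, the "file system" is a dict, and the log is assumed to survive a crash.

```python
class JournaledStore:
    """Toy model of a journaling file system (hypothetical illustration)."""

    def __init__(self):
        self.log = []    # committed transactions not yet applied
        self.data = {}   # state of the "file system" proper

    def commit(self, updates):
        # A transaction counts as committed once it is in the log,
        # even though the file system has not been updated yet.
        self.log.append(dict(updates))

    def checkpoint(self, limit=None):
        """Apply logged transactions in order, removing each once applied."""
        applied = 0
        while self.log and (limit is None or applied < limit):
            self.data.update(self.log[0])
            self.log.pop(0)
            applied += 1

    def recover(self):
        # After a crash, every transaction still in the log is performed.
        self.checkpoint()
```

Applying transactions strictly in log order is what lets recovery simply replay whatever remains in the log.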