TDDI12: Operating Systems, Real-time and Concurrent programming Agenda [SGG7] Chapter 10 and 11 • File-System Interface and Implementation Explain the function of file systems Describe interfaces to file systems Discuss file system design and protection Describe implementation of local and remote file systems Discuss block allocation algorithms File-System Interface + File Concept + Access Methods + Directory Structure + File-System Mounting + File Sharing + Protection File-System Implementation + File-System Structure + File-System Implementation + Directory Implementation + Allocation Methods + Free-Space Management + Efficiency and Performance + Recovery + Log-Structured File Systems • Copyright Notice: The lecture notes are mainly based on Silberschatz’s, Galvin’s and Gagne’s book (“Operating System Concepts”, 7th ed., Wiley, 2005). No part of the lecture notes may be reproduced in any form, due to the copyrights reserved by Addison-Wesley. These lecture notes should only be used for internal teaching purposes at the Linköping University. Andrzej Bednarski, IDA Linköpings universitet, 2006 File Concept • • TDDI12, A. Bednarski, IDA, Linköpings universitet • • • • • 9.3 Silberschatz, Galvin and Gagne ©2005 File Attributes • • • • • • • None - sequence of words, bytes Simple record structure + Lines + Fixed length + Variable length Complex Structures + Formatted document + Relocatable load file Can simulate last two with first method by inserting appropriate control characters. Who decides: + Operating system + Program TDDI12, A. Bednarski, IDA, Linköpings universitet 9.4 Silberschatz, Galvin and Gagne ©2005 File Operations Name – only information kept in human-readable form. Type – needed for systems that support different types. Location – pointer to file location on device. Size – current file size. Protection – controls who can do reading, writing, executing. Time, date, and user identification – data for protection, security, and usage monitoring. Information about files are kept in the directory structure, which is maintained on the disk. TDDI12, A. Bednarski, IDA, Linköpings universitet Silberschatz, Galvin and Gagne ©2005 File Structure Contiguous logical address space Types: + Data ̶ numeric ̶ character ̶ binary + Program TDDI12, A. Bednarski, IDA, Linköpings universitet 9.2 9.5 Silberschatz, Galvin and Gagne ©2005 • • • • • • • • create write read reposition within file – file seek delete truncate open(Fi) – search the directory structure on disk for entry Fi, and move the content of entry to memory. close (Fi) – move the content of entry Fi in memory to directory structure on disk. TDDI12, A. Bednarski, IDA, Linköpings universitet 9.6 Silberschatz, Galvin and Gagne ©2005 1 Open Files • File Types – Name, Extension Several pieces of data are needed to manage open files: + File pointer: pointer to last read/write location, per process that has the file open + File-open count: counter of number of times a file is open – to allow removal of data from open-file table when last processes closes it + Disk location of the file: cache of data access information + Access rights: per-process access mode information TDDI12, A. Bednarski, IDA, Linköpings universitet 9.7 Silberschatz, Galvin and Gagne ©2005 Access Methods TDDI12, A. Bednarski, IDA, Linköpings universitet Directory Direct Access read n write n position to n read next write next rewrite n Files TDDI12, A. Bednarski, IDA, Linköpings universitet • • 9.9 Silberschatz, Galvin and Gagne ©2005 Information in a Device Directory Name Type Address Current length Maximum length Date last accessed (for archival) Date last updated (for dump) Owner ID (who pays) Protection information (discuss later) TDDI12, A. Bednarski, IDA, Linköpings universitet F1 F2 F3 F4 Fn n = relative block number • • • • • • • • • A collection of nodes containing information about all files. Sequential Access read next write next reset no read after last write (rewrite) • Silberschatz, Galvin and Gagne ©2005 Directory Structure • • 9.8 9.11 TDDI12, A. Bednarski, IDA, Linköpings universitet 9.10 Silberschatz, Galvin and Gagne ©2005 Operations Performed on Directory • • • • • • Silberschatz, Galvin and Gagne ©2005 Both the directory structure and the files reside on disk. Backups of these two structures are kept on tapes. Search for a file Create a file Delete a file List a directory Rename a file Traverse the file system TDDI12, A. Bednarski, IDA, Linköpings universitet 9.12 Silberschatz, Galvin and Gagne ©2005 2 Organize the Directory (Logically) • • • Single-Level Directory • Efficiency – locating a file quickly Naming – convenient to users + Two users can have same name for different files + The same file can have several different names Grouping – logical grouping of files by properties, (e.g., all Java programs, all games, …) A single directory for all users Naming problem Grouping problem TDDI12, A. Bednarski, IDA, Linköpings universitet 9.13 Silberschatz, Galvin and Gagne ©2005 Two-Level Directory • 9.14 Silberschatz, Galvin and Gagne ©2005 Tree-Structured Directories Separate directory for each user • • • • Path name Can have the same file name for different user Efficient searching No grouping capability TDDI12, A. Bednarski, IDA, Linköpings universitet 9.15 Silberschatz, Galvin and Gagne ©2005 Tree-Structured Directories (Cont.) • • • TDDI12, A. Bednarski, IDA, Linköpings universitet Efficient searching Grouping Capability Current directory (working directory) + cd /spell/mail/prog + type list TDDI12, A. Bednarski, IDA, Linköpings universitet 9.16 Silberschatz, Galvin and Gagne ©2005 Tree-Structured Directories (Cont.) • • • • Absolute or relative path name Creating a new file is done in current directory. Delete a file rm <file-name> Creating a new subdirectory is done in current directory. mkdir <dir-name> Example: if in current directory /spell/mail mkdir count mail prog • TDDI12, A. Bednarski, IDA, Linköpings universitet 9.17 Silberschatz, Galvin and Gagne ©2005 copy prt exp count Deleting “mail” ⇒ deleting the entire subtree rooted by “mail”. TDDI12, A. Bednarski, IDA, Linköpings universitet 9.18 Silberschatz, Galvin and Gagne ©2005 3 Acyclic-Graph Directories • Acyclic-Graph Directories (Cont.) Have shared subdirectories and files • Two different names (aliasing) • If dict deletes list ⇒ dangling pointer Solutions: + Backpointers, so we can delete all pointers + Backpointers using a daisy chain organization + Entry-hold-count solution New directory entry type + Link – another name (pointer) to an existing file + Resolve the link – follow pointer to locate the file • TDDI12, A. Bednarski, IDA, Linköpings universitet 9.19 Silberschatz, Galvin and Gagne ©2005 General Graph Directory TDDI12, A. Bednarski, IDA, Linköpings universitet 9.21 Silberschatz, Galvin and Gagne ©2005 File System Mounting • • How do we guarantee no cycles? + Allow only links to file not subdirectories + Garbage collection + Every time a new link is added use a cycle detection algorithm to determine whether it is OK TDDI12, A. Bednarski, IDA, Linköpings universitet 9.22 Silberschatz, Galvin and Gagne ©2005 File Sharing A file system must be mounted before it can be used Mounting + Provide device name to OS + Provide mount point + OS verifies validity of file system (device directory) + OS updates its directory structure TDDI12, A. Bednarski, IDA, Linköpings universitet Silberschatz, Galvin and Gagne ©2005 General Graph Directory (Cont.) • TDDI12, A. Bednarski, IDA, Linköpings universitet 9.20 9.23 Silberschatz, Galvin and Gagne ©2005 • Sharing of files on multi-user systems is desirable + (Control version software: CVS, SVN, CADESE, …) • Sharing may be done through a protection scheme + User IDs identify users, allowing permissions and protections to be per-user + Group IDs allow users to be in groups, permitting group access rights • On distributed systems, files may be shared across a network TDDI12, A. Bednarski, IDA, Linköpings universitet 9.24 Silberschatz, Galvin and Gagne ©2005 4 File Sharing – Remote File Systems • • • Uses networking to allow file system access between systems + Manually via programs like FTP + Automatically, seamlessly using distributed file systems + Semi automatically via the world wide web Client-server model allows clients to mount remote file systems from servers + Server can serve multiple clients + Client and user-on-client identification is insecure or complicated + NFS is standard UNIX client-server file sharing protocol + CIFS is standard Windows protocol + Standard operating system file calls are translated into remote calls Distributed Information Systems (distributed naming services) such as LDAP, DNS, NIS, Active Directory implement unified access to information needed for remote computing TDDI12, A. Bednarski, IDA, Linköpings universitet 9.25 Silberschatz, Galvin and Gagne ©2005 Protection • • File Sharing – Consistency Semantics • Consistency semantics specify how multiple users are to access a shared file simultaneously + Similar to process synchronization algorithms + Unix file system (UFS) implements: ̶ Writes to an open file visible immediately to other users of the same open file ̶ Sharing file pointer to allow multiple users to read and write concurrently + AFS has session semantics ̶ Writes only visible to sessions starting after the file is closed TDDI12, A. Bednarski, IDA, Linköpings universitet 9.26 Silberschatz, Galvin and Gagne ©2005 Access Lists and Groups • • File owner/creator should be able to control: + what can be done + by whom Types of access + Read + Write + Execute + Append + Delete + List Mode of access: read, write, execute Three classes of users RWX 111 RWX b) groups access 6 ⇒ 110 RWX c) public access 1 ⇒ 001 Ask manager to create a group (unique name), say G, and add some users to the group. For a particular file (say game) or subdirectory, define an appropriate access: a) owner access • • 7 ⇒ owner • TDDI12, A. Bednarski, IDA, Linköpings universitet 9.27 Silberschatz, Galvin and Gagne ©2005 File-System Implementation • • • • • • • • TDDI12, A. Bednarski, IDA, Linköpings universitet 9.28 chmod 761 public game Silberschatz, Galvin and Gagne ©2005 File-System Structure File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Efficiency and Performance Recovery Log-Structured File Systems TDDI12, A. Bednarski, IDA, Linköpings universitet Attach a group to a file $ chgrp G game group • File structure + Logical storage unit + Collection of related information • File system resides on secondary storage (disks) • File system organized into layers • File control block storage structure consisting of information about a file 9.29 Silberschatz, Galvin and Gagne ©2005 TDDI12, A. Bednarski, IDA, Linköpings universitet 9.30 Silberschatz, Galvin and Gagne ©2005 5 In-Memory File System Structures TDDI12, A. Bednarski, IDA, Linköpings universitet 9.31 Silberschatz, Galvin and Gagne ©2005 Allocation Methods • Directory Implementation • Linear list of file names with pointer to the data blocks. + simple to program + time-consuming to execute • Hash Table – linear list with hash data structure. + decreases directory search time + collisions – situations where two file names hash to the same location + fixed size TDDI12, A. Bednarski, IDA, Linköpings universitet 9.32 Silberschatz, Galvin and Gagne ©2005 Contiguous Allocation An allocation method refers to how disk blocks are allocated for files: • Each file occupies a set of contiguous blocks on the disk. • Simple – only starting location (block #) and length (number of blocks) are required. • Random access. • Wasteful of space (dynamic storage-allocation problem). • Files cannot grow. • Mapping from logical to physical. + Contiguous allocation + Linked allocation + Indexed allocation Q LA/512 R + Block to be accessed = Q + starting address + Displacement into block = R TDDI12, A. Bednarski, IDA, Linköpings universitet 9.33 Silberschatz, Galvin and Gagne ©2005 Contiguous Allocation of Disk Space TDDI12, A. Bednarski, IDA, Linköpings universitet 9.34 Silberschatz, Galvin and Gagne ©2005 Linked Allocation • Each file is a linked list of disk blocks: blocks may be scattered anywhere on the disk. Block = Pointer Data TDDI12, A. Bednarski, IDA, Linköpings universitet 9.35 Silberschatz, Galvin and Gagne ©2005 TDDI12, A. Bednarski, IDA, Linköpings universitet 9.36 Silberschatz, Galvin and Gagne ©2005 6 Linked Allocation (Cont.) • • • • Indexed Allocation Simple – need only starting address Free-space management system – no waste of space No random access File-allocation table (FAT) Mapping disk-space allocation used by Q MS-DOS and OS/2. LA/511 R • • + Block to be accessed is the Qth block in the linked chain of blocks representing the file. + Displacement into block = R + 1 TDDI12, A. Bednarski, IDA, Linköpings universitet Brings all pointers together into the index block. Logical view. index table 9.37 Silberschatz, Galvin and Gagne ©2005 Example of Indexed Allocation TDDI12, A. Bednarski, IDA, Linköpings universitet 9.38 Silberschatz, Galvin and Gagne ©2005 Indexed Allocation (Cont.) • • • • Need index table Random access Dynamic access without external fragmentation, but have overhead of index block. Mapping from logical to physical in a file of maximum size of 256K words and block size of 512 words. We need only 1 block for index table. Q LA/512 R Q = displacement into index table R = displacement into block TDDI12, A. Bednarski, IDA, Linköpings universitet 9.39 Silberschatz, Galvin and Gagne ©2005 Indexed Allocation – Mapping (Cont.) • • Mapping from logical to physical in a file of unbounded length (block size of 512 words). Linked scheme – link blocks of index table (no limit on size). TDDI12, A. Bednarski, IDA, Linköpings universitet 9.40 Silberschatz, Galvin and Gagne ©2005 Indexed Allocation – Mapping (Cont.) • Two-level index (maximum file size is 5123) Q1 LA / (512 x 512) R1 Q1 LA / (512 x 511) R1 Q1 = block of index table R1 is used as follows: Q1 = displacement into outer-index R1 is used as follows: Q2 R1 / 512 R2 Q2 = displacement into block of index table R2 displacement into block of file: TDDI12, A. Bednarski, IDA, Linköpings universitet 9.41 Q2 R1 / 512 R2 Q2 = displacement into block of index table R2 displacement into block of file: Silberschatz, Galvin and Gagne ©2005 TDDI12, A. Bednarski, IDA, Linköpings universitet 9.42 Silberschatz, Galvin and Gagne ©2005 7 Combined Scheme: UNIX (4K bytes per block) Free-Space Management • Bit vector (n blocks) 0 1 2 n-1 bit[i] = 678 … 0 ⇒ block[i] free 1 ⇒ block[i] occupied Block number calculation (number of bits per word) * (number of 0-value words) + offset of first 1 bit TDDI12, A. Bednarski, IDA, Linköpings universitet 9.43 Silberschatz, Galvin and Gagne ©2005 Free-Space Management (Cont.) • • • • 9.45 9.44 Silberschatz, Galvin and Gagne ©2005 Free-Space Management (Cont.) • Bit map requires extra space + Example: block size = 212 bytes disk size = 230 bytes (1 gigabyte) n = 230/212 = 218 bits (or 32K bytes) + Easy to get contiguous files Linked list (free list) + Cannot get contiguous space easily + No waste of space Grouping Counting TDDI12, A. Bednarski, IDA, Linköpings universitet TDDI12, A. Bednarski, IDA, Linköpings universitet Silberschatz, Galvin and Gagne ©2005 Efficiency and Performance Need to protect: + Pointer to free list + Bit map ̶ Must be kept on disk ̶ Copy in memory and disk may differ ̶ Cannot allow for block[i] to have a situation where bit[i ] = 1 in memory and bit[i ] = 0 on disk + Solution: ̶ Set bit[i ] = 1 in disk ̶ Allocate block[i ] ̶ Set bit[i ] = 1 in memory TDDI12, A. Bednarski, IDA, Linköpings universitet 9.46 Silberschatz, Galvin and Gagne ©2005 Page Cache • • Efficiency dependent on: + Disk allocation and directory algorithms + Types of data kept in file’s directory entry • Performance + Disk cache separate section of main memory for frequently used blocks + Free-behind and read-ahead techniques to optimize sequential access + Improve PC performance by dedicating section of memory as virtual disk, or RAM disk. • • • A page cache caches pages rather than disk blocks using virtual memory techniques Memory-mapped I/O uses a page cache Routine I/O through the file system uses the buffer (disk) cache A unified buffer cache uses the same page cache to cache both memory-mapped pages and ordinary file system I/O I/O using a unified buffer cache I/O without a unified buffer cache TDDI12, A. Bednarski, IDA, Linköpings universitet 9.47 Silberschatz, Galvin and Gagne ©2005 TDDI12, A. Bednarski, IDA, Linköpings universitet 9.48 Silberschatz, Galvin and Gagne ©2005 8 Redundant Arrays of Independent Disks (RAID) • • • • • RAID – multiple disk drives provides reliability via redundancy. RAID is arranged into six different levels. Several improvements in disk-use techniques involve the use of multiple disks working cooperatively. Disk striping uses a group of disks as one storage unit. RAID schemes improve performance and improve reliability of the storage system by storing redundant data. + Mirroring or shadowing keeps duplicate of each disk. + Block interleaved parity uses much less redundancy. TDDI12, A. Bednarski, IDA, Linköpings universitet 9.49 Silberschatz, Galvin and Gagne ©2005 Log Structured File Systems Log structured (or journaling) file systems record each update to the file system as a transaction • All transactions are written to a log + A transaction is committed once it is written to the log + However, the file system may not yet be updated • Transactions in the log are asynchronously written to the file system + When the file system is modified, the transaction is removed from the log • If the file system crashes, all remaining transactions in the log must still be performed 9.51 • Consistency checking – compares data in directory structure with data blocks on disk, and tries to fix inconsistencies • Use system programs to back up data from disk to another storage device (floppy disk, magnetic tape, magnetic disk, optical) • Recover lost file or disk by restoring data from backup TDDI12, A. Bednarski, IDA, Linköpings universitet 9.50 Silberschatz, Galvin and Gagne ©2005 Recommended Reading and Exercises • TDDI12, A. Bednarski, IDA, Linköpings universitet Recovery Silberschatz, Galvin and Gagne ©2005 Exercise (Memory Management) • Reading: + [SGG7] Chapter 10 and 11 + Chapter 11 and 12 (sixth edition) • Exercises: + All TDDI12, A. Bednarski, IDA, Linköpings universitet 9.52 Silberschatz, Galvin and Gagne ©2005 Moving-head Disk Mechanism For each of the following decimal addresses, compute the virtual page number and the offset for a 4KB (K Byte) and 8KB page: 20000, 32768, and 60000. TDDI12, A. Bednarski, IDA, Linköpings universitet 9.53 Silberschatz, Galvin and Gagne ©2005 TDDI12, A. Bednarski, IDA, Linköpings universitet 9.54 Silberschatz, Galvin and Gagne ©2005 9