File Systems and Mass Storage
Alexandru Andrei
alean@ida.liu.se, phone: 013322828, room: B 3D:439
A. Andrei, Process programming and operating systems, File systems and mass storage

Last on TTIT61
- Binding: compile time, load time, execution time
- Swapping
- Contiguous memory allocation: external fragmentation
- Paging: internal fragmentation, sharing, protection
- Segmentation: external fragmentation, sharing, protection
- Virtual memory: page replacement, thrashing

Lecture Plan
1. What is an operating system? What are its functions? Basics of computer architectures. (Part I of the textbook)
2. Processes, threads, schedulers (Part II, chap. III-V)
3. Synchronization & Deadlock (Part II, chap. VI, VII)
4. Primary memory management (Part III, chap. VIII, IX)
5. File systems and secondary memory management (Part IV, chap. X, XI, XII)
6. Security (Part V, chap. XIV)

Outline
- The concept of file
- Operations on files
- Access methods
- Allocation methods
- Directories
- Operations on directories
- Directory hierarchies
- File sharing
- Protection
- Disk scheduling

Files
- Named collection of related information that is stored on secondary storage
- Smallest allotment of logical secondary storage (when we want to store something on the secondary storage, we store it in files)
- Format of files is typically defined by the creator (txt, doc, mp3, avi, elf, coff, exe, so, dll, …)

File Attributes
- Name (identifier for human use)
- Identifier (typically a numeric identifier for the internal use of the OS)
- Type (for OSs that support file types)
- Size
- Time, date, user identification (last access, last modification, creation)
- Location (on the device)
- Protection (permissions to read/write/execute/etc.)
Disks
[Figure: disk organisation, with labels Tracks, Sector, Gap]

Disk Organisation
- The geometry of disks is given in C/H/S (Cylinders/Heads/Sectors)
- Initially, this corresponded to the true physical geometry

Virtual File System
- The access granularity is the physical block (sector)
- The operating system maps logical records onto physical blocks
- Disk space is always allocated in blocks
- Files may not have a size equal to an integer multiple of the block size ⇒ the last block is not fully used
- Internal fragmentation

Operations on Files
- Creation: name, protection information
- Deletion: name
- Writing
- Reading
- Truncating
- Repositioning within a file

Accessing a File

Opening a File
- If the file to be read from (written to) were specified by its name in the read (write) system calls, the OS would have to look up the disk blocks corresponding to the named file for each system call invocation ⇒ performance penalty
- Most OSs require the user to perform an open system call that
  - maps the file name to an identifier
  - initialises a memory structure with the disk location of the opened file and other data (see slides)
- Implicit opening: automatically open at first access, close at process exit

Access Methods
- Sequential access
- Direct access
- Indexed access
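The point above that disk space is always allocated in whole blocks can be turned into a small calculation. The sketch below is only an illustration: the function names are made up for this example and do not come from the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Number of blocks needed to hold file_size bytes, rounded up. */
uint64_t blocks_needed(uint64_t file_size, uint64_t block_size)
{
    assert(block_size > 0);
    return (file_size + block_size - 1) / block_size;
}

/* Bytes left unused in the last allocated block (internal fragmentation). */
uint64_t internal_fragmentation(uint64_t file_size, uint64_t block_size)
{
    return blocks_needed(file_size, block_size) * block_size - file_size;
}
```

For instance, a 1000-byte file on 512-byte blocks occupies 2 blocks and leaves 24 bytes of the second block unused.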
Sequential Access
- The block from where to read (where to write) is not specified. The OS keeps a file pointer that it modifies accordingly.
- A read (write) operation reads (writes) data at the current file offset, stored in the file pointer
- After the read or write, the file pointer is incremented by the amount of transferred data
- E.g.: write(fd, buf, sizeof(buf)) writes sizeof(buf) bytes from the buffer buf to the file identified by fd at the current file offset, then increments the file pointer by sizeof(buf)

Direct Access
- The read (write) system call does specify the relative block number from where to read (where to write to)
- E.g.: write(fd, buf, sizeof(buf), 30) writes sizeof(buf) bytes from the buffer buf to the 30th block of the file identified by fd (an illustrative four-argument call; the Unix write takes no offset)

Indexed Access
Index file (first key in block, block number):
  143520, 1311
  607896, 1312
  853661, 1400
File:
  Block 1311: 143520, $10 | 245679, $30 | 509877, $5 | 510978, $15
  Block 1312: 607896, $20 | 610942, $10 | 790134, $15 | 829842, $8
  Block 1400: 853661, $10 | 898541, $20 | 934625, $6 | 973147, $7

Writing to Files in Unix
- Writing needs: which file to write to, what to write
- The write system call does not specify a file offset (a write pointer pointing at the position in the file where the writing should begin)
- ⇒ Sequential access

Reading from Files in Unix
- Reading needs: which file to read from, how much to read, where to put what we read
- The read system call does not specify a file offset (a read pointer pointing at the position in the file where the reading should begin)
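The implicit file offset used by read and write, as described above, can be observed with the POSIX lseek call. The following is a minimal sketch, assuming a POSIX system; the helper name write_rewind_read and the scratch path passed to it are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Writes msg to a fresh file at `path` (a caller-chosen scratch path),
 * records the file offset after the write, rewinds to offset 0 and reads
 * the data back through the same descriptor.
 * Returns the number of bytes read back, or -1 on error.
 */
ssize_t write_rewind_read(const char *path, const char *msg,
                          char *rbuf, size_t rbuf_size,
                          off_t *offset_after_write)
{
    size_t len = strlen(msg);
    ssize_t n;
    int fd;

    assert(offset_after_write != NULL);
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;

    /* write() starts at the current offset (0 here) and advances it. */
    if (write(fd, msg, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }

    /* lseek(fd, 0, SEEK_CUR) moves nothing; it just reports the offset. */
    *offset_after_write = lseek(fd, 0, SEEK_CUR);

    /* Without this rewind, read() would start at end of file and return 0. */
    if (lseek(fd, 0, SEEK_SET) < 0) {
        close(fd);
        return -1;
    }
    n = read(fd, rbuf, rbuf_size);
    close(fd);
    return n;
}
```

Because reading and writing share one offset, the read only sees the written bytes after the explicit rewind.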
Read/Write Pointers
- Typically, a file is used either for reading or for writing by a process
- The OS keeps just one single file pointer for both reading and writing

File Operations in Unix
    int creat(const char *name, int permissions);
    int open(const char *name, int flags);
    int read(int fd, char *buffer, int requested_size);
    int write(int fd, const char *buffer, int size);
    int close(int fd);
    long lseek(int fd, long offset, int whence);
    int stat(const char *name, struct stat *status);

Usage Example
    int fd, n;
    char wbuf[] = "Hello!\n", rbuf[100];

    fd = creat("file.txt", S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
    write(fd, wbuf, sizeof(wbuf));
    n = read(fd, rbuf, sizeof(rbuf));
    close(fd);
- From which file offset will read read? What's the value of n? What will rbuf contain?

Usage Example
    int fd, n;
    char wbuf[] = "Hello!\n", rbuf[100];

    fd = creat("file.txt", S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
    write(fd, wbuf, sizeof(wbuf));
    close(fd);
    fd = open("file.txt", O_RDONLY);
    n = read(fd, rbuf, sizeof(rbuf));
    close(fd);

Allocation Methods
An allocation method refers to how disk blocks are allocated for files:
- Contiguous allocation
- Linked allocation
- Indexed allocation

Contiguous Allocation
- Each file occupies a set of contiguous blocks on the disk
- Simple: only starting location (block #) and length (number of blocks) are required
- Random access
- Wasteful of space (dynamic storage-allocation problem)
- Files cannot grow

Contiguous Allocation of Disk Space
[Figure: contiguous allocation of disk space]
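Since contiguous allocation needs only a starting block and a length, mapping a byte offset in the file to a disk block is a single addition. A minimal sketch follows; the struct and function names are invented for this example, not taken from any real file system.

```c
#include <assert.h>
#include <stdint.h>

/* What the directory has to remember for a contiguously allocated file. */
struct contiguous_file {
    uint32_t start;   /* number of the first disk block */
    uint32_t length;  /* number of blocks occupied */
};

/*
 * Returns the disk block holding the byte at `offset`,
 * or -1 if the offset lies outside the allocated blocks.
 */
int64_t physical_block(const struct contiguous_file *f,
                       uint64_t offset, uint32_t block_size)
{
    uint64_t logical;

    assert(block_size > 0);
    logical = offset / block_size;      /* logical block inside the file */
    if (logical >= f->length)
        return -1;
    return (int64_t)f->start + (int64_t)logical;
}
```

This constant-time mapping is what makes random access cheap under contiguous allocation; the fixed length field is also why such a file cannot grow in place.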
Contiguous Allocation
[Figure: (a) Contiguous allocation of disk space for 7 files; (b) State of the disk after files D and E have been removed]

Extent-Based Systems
- Many newer file systems (e.g. Veritas File System, EXT4, NTFS) use a modified contiguous allocation scheme
- Extent-based file systems allocate disk blocks in extents
- An extent is a contiguous run of disk blocks
- Space for files is allocated in extents
- A file consists of one or more extents

Linked Allocation
- Each file is a linked list of disk blocks: blocks may be scattered anywhere on the disk
- [Figure: block = pointer + data]

Linked Allocation (Cont.)
- Simple: need only starting address
- Free-space management system: no waste of space
- No random access
- In case of an error (corrupted block), the rest of the file is lost

Indexed Allocation
- Brings all pointers together into the index block
- Logical view: [Figure: index table]

Example of Indexed Allocation
[Figure: example of indexed allocation]

Indexed Allocation (Cont.)
- Need index table
- Random access
- Dynamic access without external fragmentation, but with the overhead of the index block
- Mapping from logical to physical in a file of maximum size 256KB and block size 512B (1 sector on the disk): we need only 1 block for the index table (512B). Note that 256KB / 512B = 512 index entries, which fit into one 512B block only if an entry takes a single byte; the example works out exactly when file size and block size are counted in pointer-sized words.

File-Allocation Table
[Figure: file-allocation table]

FAT Problems
- Inefficient for large disks, because the FAT itself requires a large amount of memory
- Tradeoff: size of the disk cluster vs. size of the FAT
- FAT16 (Win95): the disk is divided into 2^16 clusters
  - 2-byte entry for each cluster ⇒ 128kB to store the FAT in memory
  - For a disk of 1GB, the size of the cluster is 16kB ⇒ if the files are small, huge waste
- FAT32 (Win98): 2^28 clusters, 4 bytes for each
  - Do the same calculation

Combined Scheme: UNIX (4K bytes per block)
[Figure: combined scheme]

Inode
Exercise: Consider the organization of a UNIX file as represented by the I-Node. Assume there are 12 direct block pointers, and one single, one double and one triple indirect pointer in each I-Node. Further, assume that the system block size and the disk sector size are both 8K. If the disk block pointer is 32 bits, with 8 bits to identify the physical disk and 24 bits to identify the physical block, then:
a. What is the maximum file size supported by this system?
b. What is the maximum file system partition supported by this system?
c. Assuming no information other than the file I-Node is already in system memory, how many disk accesses are required to access the byte in position 13,423,956?

Free-Space Management
- Bit vector (n blocks), bits 0, 1, 2, …, n-1
  - bit[i] = 0 ⇒ block[i] free
  - bit[i] = 1 ⇒ block[i] occupied
- Block number calculation: (number of bits per word) * (number of 0-value words) + offset of first 1 bit
  - As written, this calculation matches the opposite encoding (1 = free). With 0 = free as defined above, skip the words whose bits are all 1 and take the offset of the first 0 bit instead.

Free-Space Management (Cont.)
- Bit map requires extra space. Example:
  - block size = 2^12 bytes (4K bytes)
  - disk size = 2^30 bytes (1 gigabyte)
  - n = 2^30 / 2^12 = 2^18 blocks on the disk ⇒ 2^18 bits (or 32K bytes) for the bitmap
- Easy to get contiguous files

Free-Space Management (Cont.)
- An alternative to the bitmap is to use a linked list (free list)
  - Cannot get contiguous space easily
  - No waste of space
- Grouping
- Counting

Free-Space Management (Cont.)
- Need to protect:
  - Pointer to free list
  - Bit map
    - Must be kept on disk
    - Copy in memory and disk may differ
    - Cannot allow for block[i] to have a situation where bit[i] = 1 in memory and bit[i] = 0 on disk
- Solution:
  - Set bit[i] = 1 on disk
  - Allocate block[i]
  - Set bit[i] = 1 in memory

A Typical File-system Organization
[Figure: a typical file-system organization]

Directories
- Special files that contain directory entries
- A directory entry is a data structure containing the file attributes
- Directory entry example:
    /
      bin/
        ls
      lib/
        libc.so
        libgcc.so
      vmlinux-2.6.11
  root dir:
    bin, dir, root:root, rwxr-xr-x, 10050
    lib, dir, root:root, rwxr-xr-x, 52175
    vmlinux-2.6.11, reg, root:root, r-x------, 120311
  bin dir:
    ls, reg, root:root, r-xr-xr-x, 10052
  lib dir:
    libc.so, reg, root:root, r-xr-xr-x, 52177
    libgcc.so, reg, root:root, r-xr-xr-x, 60621

Operations on Directories
- List the contents of the directory
- Delete (unlink) a file
- Rename a file
- Open and close the directory
- Search for a file
- Traverse the file system
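Part (c) of the I-Node exercise above can be checked with a few lines of arithmetic. The sketch below is one possible way to do it, under the exercise's parameters (8K blocks, 32-bit pointers, 12 direct pointers) and under the assumption that nothing is cached, so every indirect block and the data block each cost one disk access. The function name is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

enum {
    BLOCK_SIZE   = 8192,  /* 8K blocks, as in the exercise */
    POINTER_SIZE = 4,     /* 32-bit disk block pointers */
    NUM_DIRECT   = 12     /* direct pointers in the I-node */
};

/*
 * Disk accesses needed to reach the byte at `offset` when only the
 * I-node is in memory: one per indirect block on the path, plus one
 * for the data block. Returns -1 beyond the triple-indirect range.
 */
int disk_accesses(uint64_t offset)
{
    const uint64_t per_block = BLOCK_SIZE / POINTER_SIZE; /* 2048 pointers */
    uint64_t block = offset / BLOCK_SIZE;   /* logical block in the file */
    uint64_t span = per_block;              /* blocks covered at this level */
    int level;

    if (block < NUM_DIRECT)
        return 1;
    block -= NUM_DIRECT;
    for (level = 1; level <= 3; level++) {
        if (block < span)
            return level + 1;
        block -= span;
        span *= per_block;
    }
    return -1;
}
```

Byte 13,423,956 lies in logical block 1638 (13,423,956 / 8192), which is past the 12 direct blocks but within the 2048 blocks reachable through the single indirect block, so under these assumptions two accesses are needed.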
Directory Hierarchies
- Tree-structured directories: leaf nodes are files, all other nodes are directories
- Acyclic-graph directories: the same, with the exception that the structure is an acyclic graph
- General graph directories: may contain cycles

Acyclic Graph Directories
    /
      bin/
        ls
      lib/
        libc.so
        libgcc.so
        kernel
      vmlinux-2.6.11
  root dir:
    bin, dir, root:root, rwxr-xr-x, 10050
    lib, dir, root:root, rwxr-xr-x, 52175
    vmlinux-2.6.11, reg, root:root, r-x------, 120311
  bin dir:
    ls, reg, root:root, r-xr-xr-x, 10052
  lib dir:
    libc.so, reg, root:root, r-xr-xr-x, 52177
    libgcc.so, reg, root:root, r-xr-xr-x, 60621
    kernel, reg, sys:root, r-xr-x---, 120311
- We need reference counters. A file is removed and its blocks marked as free when the reference counter reaches 0.
- Removing = unlinking

Symbolic Links
    /
      bin/
        ls
      lib/
        libc.so
        libgcc.so
        kernel
      vmlinux-2.6.11
  root dir:
    bin, dir, root:root, rwxr-xr-x, 10050
    lib, dir, root:root, rwxr-xr-x, 52175
    vmlinux-2.6.11, reg, root:root, r-x------, 120311
  bin dir:
    ls, reg, root:root, r-xr-xr-x, 10052
  lib dir:
    libc.so, reg, root:root, r-xr-xr-x, 52177
    libgcc.so, reg, root:root, r-xr-xr-x, 60621
    kernel, link, guest:guest, rwxrwxrwx, 71220
- A hard link is an additional directory entry that refers to an already existing file. It contains no blocks of its own.
- A soft link is a special, very short file that contains the name of the file it points to

File Sharing
When files can be shared:
- Should all writes be allowed to occur, or should the OS protect the user actions from each other?
- Should a write be immediately visible to all the other users who share the file?
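The reference counter described for acyclic-graph directories is visible on POSIX systems as the link count returned by stat. The sketch below assumes a POSIX file system that supports hard links; the helper names and the scratch paths given to them are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the number of directory entries referring to `path`, or -1. */
long link_count(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return -1;
    return (long)st.st_nlink;
}

/*
 * Creates an empty file at `path` and a second directory entry (hard
 * link) for it at `alias`. Returns 0 on success, -1 on error.
 */
int make_file_with_hard_link(const char *path, const char *alias)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);

    if (fd < 0)
        return -1;
    close(fd);
    return link(path, alias);   /* increments the file's link count */
}
```

After the call the link count is 2; each unlink decrements it, and the file's blocks can be freed only once it reaches 0 (and no process still has the file open).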
File Sharing
Two processes (Proc A and Proc B on the slide) open the same file and work on it concurrently.

File contents shown on the slide, in runs of 10 characters:
    0000000000 xxxxxxxxxx 8888888888 5555555555 yyyyyyyyyy

One process:
    fd = open("file.txt", O_RDWR);
    n = read(fd, buf, 10);
    printf("%.*s\n", n, buf);
    write(fd, "xxxxxxxxxx", 10);
    n = read(fd, buf, 10);
    printf("%.*s\n", n, buf);
    close(fd);

The other process:
    fd = open("file.txt", O_RDWR);
    n = read(fd, buf, 10);
    printf("%.*s\n", n, buf);
    n = read(fd, buf, 5);
    printf("%.*s\n", n, buf);
    n = read(fd, buf, 5);
    printf("%.*s\n", n, buf);
    write(fd, "yyyyyyyyyy", 10);
    close(fd);

[Figure: Proc A open file table, Proc B open file table and the OS-wide open file table with the entry "file.txt, ftp:users, rw-rw-rw-, 10228"; the file offsets stored in the tables change as the calls above execute]

Protection
- Keep safe from improper access
- We introduce the notion of file owner
  - A user ID kept on the disk in the directory entry
- Users have IDs
- Processes, besides their process IDs (pid), have user IDs, typically the user ID of the user that executes them (user ID (uid) or effective user ID (euid))
- Files typically are owned by the user who creates them
  - The effective user ID of the file creator is written in the directory entry
- Besides its owner, a file is characterised by its group

Protection
- Controlled access is introduced by specifying which users (or user groups) are allowed to perform operations on the file
- Examples of controlled operations:
  - Read
  - Write
  - Execute
  - Append
  - Delete
  - List

Unix File Protection
- Read: read for files, list for directories
- Write: write/modify for files, create/delete entries for directories
- Execute: execute for files, the right to change into the directory for directories
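The read/write/execute rights listed above are commonly stored as nine bits (owner, group, others) and printed in the rwxr-xr-x form used in the directory-entry examples. A minimal sketch of that conversion follows; the function name is an assumption made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Writes the nine permission bits of `mode` (e.g. octal 0755) into `out`
 * as a string such as "rwxr-xr-x". `out` must have room for 10 chars.
 */
void format_permissions(unsigned mode, char out[10])
{
    static const char symbols[] = "rwxrwxrwx";
    int i;

    assert(out != NULL);
    for (i = 0; i < 9; i++) {
        unsigned bit = 1u << (8 - i);   /* owner r is the highest bit */
        out[i] = (mode & bit) ? symbols[i] : '-';
    }
    out[9] = '\0';
}
```

With this encoding, octal 0755 corresponds to rwxr-xr-x and octal 0500 to r-x------.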
Access Control Lists (ACL)
- Each file (directory) has an access control list attached
- The access control list specifies, for each controlled operation, the users that are allowed to perform this operation
- Advantage: very general and flexible
- Disadvantages:
  - Difficult to construct if we do not know all the users beforehand
  - Directory entry of variable size, more difficult to manage

Condensed ACL
- Use of condensed ACLs instead
- Use per-owner, per-group, per-others permissions
- E.g.: Sara writes a book; Jim, Dawn, and Jill help her. Sara has all the rights; Jim, Dawn, and Jill may read or write but not delete; all the others may only read.
  - Sara is the owner, has rw- permissions
  - A group book is created, the file is owned by user Sara and group book, Jim, Dawn, and Jill are added to group book, the group has rw- permissions
  - Others have r-- permissions
  - The directory in which the book resides has rwxr-xr-x permissions
- If Sara wants Joe to have read/write access to chapter 1, she cannot add him to group book
  - Instead, user Joe is added to the ACL

Condensed ACL
- What if:
    kim:staff rw-r-xr-- script.sh
- User kim belongs to group staff
- Should kim be allowed to execute script.sh?
  - If we consider that the permissions of the owner apply, then no
  - If we consider that the permissions of the group apply, then yes
- Precedence is given to the most specific class, so the owner permissions apply and kim may not execute script.sh

Disk Access Scheduling
- The OS has to ensure that the resources are used efficiently
- Bandwidth = transferred bytes / length of interval between first request and completion of last request
- Seek time
- Rotational latency
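The precedence rule from the condensed-ACL example above (kim:staff rw-r-xr-- script.sh) can be written as a short check. This is a simplified sketch, not the kernel's actual algorithm: it assumes one group per user and ignores the superuser, and the struct, function and numeric IDs are made up for the illustration.

```c
#include <assert.h>

enum { PERM_R = 4, PERM_W = 2, PERM_X = 1 };

/* Hypothetical minimal view of the protection data in a directory entry. */
struct file_prot {
    unsigned owner_uid;
    unsigned group_gid;
    unsigned mode;      /* nine permission bits, e.g. 0654 for rw-r-xr-- */
};

/*
 * Returns 1 if the user (uid, gid) may perform all operations in `want`.
 * Only the most specific matching class is consulted:
 * owner first, then group, then others.
 */
int may_access(const struct file_prot *f, unsigned uid, unsigned gid,
               unsigned want)
{
    unsigned bits;

    assert(want >= 1 && want <= 7);
    if (uid == f->owner_uid)
        bits = (f->mode >> 6) & 7;      /* owner class */
    else if (gid == f->group_gid)
        bits = (f->mode >> 3) & 7;      /* group class */
    else
        bits = f->mode & 7;             /* others */
    return (bits & want) == want;
}
```

With mode 0654, the owner is denied execute even though the owner's group would be granted it, while another member of the group may execute the script.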
FCFS Disk Scheduling
- Following requests: cylinders 98, 183, 37, 122, 14, 124, 65, 67
- Head initially at cylinder 53
- [Figure: head movement across cylinders 0, 14, 37, 53, 65, 98, 122, 183]
- Total head movement of 640 cylinders

Shortest Seek Time First
- [Figure: head movement across cylinders 0, 14, 37, 53, 65, 98, 122, 183]
- Movement of only 236 cylinders
- May cause starvation
- Not optimal: if we moved from 53 to 37 and then 14, before 65, 67, etc. ⇒ 208 cylinders

SCAN Scheduling
- [Figure: head movement across cylinders 0, 14, 37, 53, 65, 98, 122, 183]
- When we reach one end, it is more likely that requests are closer to the other end than close to the read/write head
- Those also waited the longest ⇒ C-SCAN algorithm

Circular-SCAN Algorithm
- [Figure: head movement across cylinders 0, 14, 37, 53, 65, 98, 122, 183]

LOOK and C-LOOK Algorithms
- [Figure: head movement across cylinders 0, 14, 37, 53, 65, 98, 122, 183]

Which One?
- Depends on the load
- Depends on file allocation
- SSTF and LOOK seem reasonable alternatives
- The real disk geometry is hidden from the OS
- Disk manufacturers include a scheduling algorithm in the hard disk controller
- Then why not let the hard disk do all the scheduling?
  - Because some requests have different semantics and have to be treated differently (accesses of a higher priority process, or paging, for example)

Summary
- Files
- Operations
- Sharing
- Protection
- Access
- Allocation
- Directory hierarchies
- Disk scheduling algorithms
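The head-movement totals quoted for FCFS (640) and SSTF (236) above can be reproduced with a short simulation. The sketch below is an illustration only; the function names are chosen for this example, and SSTF ties are broken in favour of the request that appears first in the queue.

```c
#include <assert.h>
#include <stdlib.h>

/* Total head movement when requests are served in the given order (FCFS). */
int fcfs_movement(int head, const int *req, int n)
{
    int total = 0;
    int i;

    for (i = 0; i < n; i++) {
        total += abs(req[i] - head);
        head = req[i];
    }
    return total;
}

/* Total head movement when the closest pending request is served next (SSTF). */
int sstf_movement(int head, const int *req, int n)
{
    int total = 0;
    int served, i;
    char *done = calloc((size_t)n, 1);  /* done[i] != 0: request i served */

    assert(n == 0 || done != NULL);
    for (served = 0; served < n; served++) {
        int best = -1;
        for (i = 0; i < n; i++) {
            if (!done[i] &&
                (best < 0 || abs(req[i] - head) < abs(req[best] - head)))
                best = i;
        }
        total += abs(req[best] - head);
        head = req[best];
        done[best] = 1;
    }
    free(done);
    return total;
}
```

Feeding the order 37, 14, 65, 67, 98, 122, 124, 183 to fcfs_movement with the head at 53 gives the 208 cylinders mentioned as the better schedule, which shows that SSTF is not optimal.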
Reading
Silberschatz, Galvin, Gagne, 7th edition, Part IV
- Chapter 10: 10.1, 10.2, 10.3, 10.5.1, 10.5.3.1, 10.6
- Chapter 11: 11.1, 11.2, 11.4, 11.5
- Chapter 12: 12.1.1, 12.2, 12.4