Files and file allocation
Copyright ©: Nahrstedt, Angrave, Abdelzaher, Caccamo

What is an inode?
- An inode (index node) is a control structure that contains the key information the OS needs to access a particular file.
- Several file names may be associated with a single inode, but each file is controlled by exactly ONE inode.
- On disk, an inode table contains the inodes of all the files in the filesystem.
- When a file is opened, its inode is brought into main memory and stored in a memory-resident inode table.
[course textbook, p. 700]

Information in the inode
[figure omitted]

Directories
In Unix, a directory is simply a file that contains a list of file names plus pointers to the associated inodes.

    Directory (each entry points into the inode table)
    inode   name
    i1      Name1
    i2      Name2
    i3      Name3
    i4      Name4
    …       …

    DIR *opendir(const char *name);

The opendir() function opens a directory stream corresponding to the directory name and returns a pointer to the directory stream. The stream is positioned at the first entry in the directory.

    struct dirent *readdir(DIR *dir);

The readdir() function returns a pointer to a dirent structure representing the next directory entry in the directory stream pointed to by dir. It returns NULL on reaching the end of the directory or if an error occurred. The dirent structure is defined as follows:

    struct dirent {
        ino_t d_ino;        /* inode number */
        char  d_name[256];  /* filename */
        …
    };

Example: list the entries of the current directory.

    #include <dirent.h>
    #include <stdio.h>

    int main(void) {
        struct dirent *entry;
        DIR *dirp = opendir(".");           /* open the current directory */
        if (dirp == NULL) return 1;         /* minimal error handling only */
        while ((entry = readdir(dirp)) != NULL)
            printf("%s\n", entry->d_name);  /* one entry name per line */
        closedir(dirp);
        return 0;
    }

readdir() is not thread safe
- readdir() may return a pointer to a statically allocated structure; hence the returned pointer points to data that may be overwritten by another call to readdir() on the same directory stream.
- The data is not overwritten by another call to readdir() on a different directory stream.
- The reentrant version of readdir() is:

    int readdir_r(DIR *dirp, struct dirent *entry, struct dirent **result);

  Here entry is a pointer to storage allocated by the caller.

UNIX file structure implementation
- Each process has a "file descriptor table". Each entry in the file descriptor table is a pointer to an entry in the system-wide "open file table".
- Each entry in the open file table contains a file offset (file pointer) and a pointer to an entry in the "memory-resident inode table".
- If a process opens an already-open file, a new open file table entry is created (with a new file offset), pointing to the same entry in the memory-resident inode table.
- If a process forks, the child gets a copy of the file descriptor table. The copied entries point to the same open file table entries, so parent and child share the same file offset.
[figure from http://pages.cs.wisc.edu/~swift/classes/cs537-sp09/lectures/14-unix-fs.pdf]

File opening
- A call to open() creates a new entry (an open file description) in the system-wide table of open files. This entry records the file offset and the file status flags.
- A file descriptor is a reference to one of these entries.
- The new open file description is initially not shared with any other process, but sharing may arise via fork().
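The open file table rules above can be checked with a small experiment. The following is a minimal sketch, not part of the original slides: the function names offset_after_second_open and offset_after_child_read are hypothetical, and the file passed in is assumed to hold at least 4 bytes. The first function opens the same file twice and shows that a read through one descriptor leaves the other descriptor's offset untouched; the second shows that a read performed by a forked child moves the offset the parent sees.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Open `path` twice, read 4 bytes through the first descriptor, and
 * return the offset seen through the second one. Each open() creates
 * its own open file table entry, so the second offset should stay 0.
 * Returns (off_t)-1 on failure. */
off_t offset_after_second_open(const char *path) {
    assert(path != NULL);
    int fd1 = open(path, O_RDONLY);
    int fd2 = open(path, O_RDONLY);
    off_t off = (off_t)-1;
    char buf[4];
    if (fd1 >= 0 && fd2 >= 0 &&
        read(fd1, buf, sizeof buf) == (ssize_t)sizeof buf)
        off = lseek(fd2, 0, SEEK_CUR);   /* query offset without moving it */
    if (fd1 >= 0) close(fd1);
    if (fd2 >= 0) close(fd2);
    return off;
}

/* Open `path`, fork, let the child read 4 bytes, then return the offset
 * the parent sees. Parent and child share one open file table entry,
 * so the parent's offset should have advanced to 4.
 * Returns (off_t)-1 on failure. */
off_t offset_after_child_read(const char *path) {
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0) return (off_t)-1;
    pid_t pid = fork();
    if (pid < 0) { close(fd); return (off_t)-1; }
    if (pid == 0) {                      /* child: read through inherited fd */
        char buf[4];
        ssize_t n = read(fd, buf, sizeof buf);
        _exit(n == (ssize_t)sizeof buf ? 0 : 1);
    }
    int status = 0;
    waitpid(pid, &status, 0);            /* wait until the child has read */
    off_t off = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return off;
}
```

Called on a file that contains at least 4 bytes, the first function is expected to return 0 and the second 4, which matches the two cases in the slides: a second open() gets a new file offset, while fork() shares the existing one.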

File Allocation on Disk
Low-level access methods for a file depend on the disk allocation scheme used to store the file data:
- Contiguous
- Linked list
- Block or indexed

#1. Contiguous Allocation
[figure omitted]

Contiguous Allocation Issues
- The access method suits both sequential and direct access.
- The directory table maps each file to its starting physical address and length.
- Easy to recover in the event of a system crash.
- Fast: often requires no head movement, and when it does, the head moves only one track.
- A file is allocated in large contiguous chunks.
- Expanding a file requires copying.
- Needs dynamic storage allocation (first fit, best fit).
- External fragmentation occurs on disk.

External Fragmentation
Solution: linked allocation

#2. Linked Allocation
[figure omitted]

Linked List Allocation
- Each file is a linked list of nodes.
- Pointers in the list are not accessible to the user.
- The directory table maps each file to the head of its list.
- A node in the list can be a fixed-size physical block or a contiguous collection of blocks.
- Easy to use: no estimation of file size is necessary.
- A file can grow in the middle and at the ends.
- Space efficient, with little fragmentation.
- Slow: defies the principle of locality. The linked list of nodes must be read sequentially to find the needed blocks of data.
- Suited for sequential access but not direct access.

Linked List Allocation Issues
- Disk space must be used to store pointers (if a disk block is 512 bytes and a disk address requires 4 bytes, then the user sees blocks of 508 bytes).
- Not very reliable.
- System crashes can scramble files being updated.
- Important variation on the linked allocation method: the "file-allocation table" (FAT), used by OS/2 and MS-DOS.
- Summary: linked allocation solves the external fragmentation and size-declaration problems of contiguous allocation. However, it cannot support efficient direct access.

#3. Indexed Allocation
[figure omitted]

Indexed Allocation
- Solves external fragmentation.
- Supports sequential, direct, and indexed access.
- Reaching a data block requires at most one additional access, to the index block, which can be cached in main memory.
- Requires extra space for the index block, with possible wasted space.
- How to extend to big files? A file can be extended by using linked indexed files or multilevel indexed files.

Linked Indexed Files
Link full index blocks together using the last entry of each index block.

Multilevel Indexed File
Multiple levels of index blocks.

Layout of a UNIX file on disk
- File allocation is done on a block basis, and allocation is dynamic (as needed).
- UNIX uses a multilevel indexing mechanism for file allocation on disk.
- The inode holds the addresses of the first 10 data blocks plus 3 index blocks (first, second, and third level of indexing).
- The first ten addresses point to the first 10 data blocks of the file.
- The inode includes 39 bytes of address information, organized as thirteen 3-byte addresses.
- In UNIX System V the length of a block is 1 Kbyte, and each block can hold a total of 256 block addresses.
- With these parameters, the maximum size of a file is slightly over 16 Gbytes.
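
The 16 Gbyte figure follows from counting the data blocks reachable through each indexing level. The sketch below reproduces that arithmetic; the function names max_file_blocks and max_file_bytes are illustrative and not taken from the slides, and the code assumes exactly one index block per level, as described above.

```c
#include <assert.h>
#include <stdint.h>

/* Number of data blocks reachable from an inode that holds `direct`
 * direct block addresses plus one index block for each indexing level
 * 1..levels, where every index block holds `addrs_per_block` addresses. */
uint64_t max_file_blocks(uint64_t direct, uint64_t addrs_per_block,
                         unsigned levels) {
    assert(addrs_per_block > 0);
    uint64_t total = direct;
    uint64_t reach = 1;
    for (unsigned i = 0; i < levels; i++) {
        reach *= addrs_per_block;  /* blocks reachable through level i+1 */
        total += reach;
    }
    return total;
}

/* Maximum file size in bytes for the same layout. */
uint64_t max_file_bytes(uint64_t direct, uint64_t addrs_per_block,
                        unsigned levels, uint64_t block_size) {
    return max_file_blocks(direct, addrs_per_block, levels) * block_size;
}
```

With the System V parameters from the slides, max_file_blocks(10, 256, 3) gives 10 + 256 + 65,536 + 16,777,216 = 16,843,018 blocks, and max_file_bytes(10, 256, 3, 1024) gives 17,247,250,432 bytes, which is about 16.06 Gbytes when a Gbyte is taken as 2^30 bytes.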