Outline
• Announcements
• File Management
  – Structured files
  – Low-level file implementations
• Directories
• File Systems

Announcements
• Final exam will be cumulative
  – The split will be around 30% on before-midterm material and 70% on after-midterm material
• Lab3
  – I have made some of my code available to you
    • http://www.cs.fsu.edu/~liux/courses/cop4610/assignments/lab3_code
    • or on program or linprog at ~liux/public_html/courses/cop4610/assignments/lab3_code
  – A demo program that implements the functionality only partially is at ~liux/public_html/courses/cop4610/assignments/lab3 on program or linprog

5/29/2016 COP4610 2

Announcements – cont.
• If you have not signed up to demonstrate your lab2, please do so
  – If none of the available slots works for you, please contact Yong Chen for an appointment
  – You have to do the demonstration on or before November 26, 2003
    • After that, it will be graded solely on the source code and test results you turned in, and the worst case will be assumed if there are any questions

Announcements – cont.
• Schedule for the remainder of the semester
  – Homework #5 will be due on Nov. 25, 2003
  – Lab3 will be due on Dec. 3, 2003
  – Quiz #3 will be on Nov. 25, 2003 (about deadlocks and memory management)
  – Finish file system today
  – Finish memory management in three lectures
  – One lecture on protection and security
  – Dec. 2 will cover some advanced topics in distributed operating systems
  – Dec. 4 will have a final review

Announcements – cont.
• There will be no class on Nov. 11 and no recitation on Nov. 12
• There will be no class on Nov. 27 (Thanksgiving) and no recitation class on Nov. 26
• I also plan NOT to have the recitation class on Nov.
19
  – If there is a need, we will have one

Operating System Components

Why Programmers Need Files
[Figure: an HTML Editor and a Web Browser both access foo.html (<head> … </head> <body> … </body>) through the File Manager]
• Persistent storage
• Shared device
• Structured information
• Can be read by any application
• Accessibility
• Protocol

File system context

Fig 13-2: The External View of the File Manager
[Figure: an Application Program calls the File Mgr, which sits alongside the Memory Mgr, Process Mgr, and Device Mgr above the Hardware. UNIX calls: mount(), open(), read(), write(), lseek(), close(). Windows calls: CreateFile(), ReadFile(), WriteFile(), SetFilePointer(), CloseHandle().]

Levels in a file system

Information Structure

Logical structures in a file

Low-Level Files

File systems
• File system
  – A data structure on a disk that holds files
    • actually, a file system lives in a disk partition
    • this is a technical term, different from “file system” meaning the part of the OS that implements files
• File systems in different OSs have different internal structures

A file system layout

File system descriptor
• The data structure that defines the file system
• Typical fields
  – size of the file system (in blocks)
  – size of the file descriptor area
  – first block in the free block list
  – location of the file descriptor of the root directory of the file system
  – times the file system was created, last modified, and last used

File system layout variations
• MS-DOS uses a FAT (file allocation table) file system
  – so does the Macintosh OS (although the MacOS layout is different)
• New UNIX file systems use cylinder groups (mini-file systems) to achieve better locality of file data

Locating file data
• The logical file is divided into logical blocks
• Each logical
block is mapped to a physical disk block
• The file descriptor contains data on how to perform this mapping
  – there are many methods for performing this mapping
  – we will look at several of them

Dividing a file into blocks

Contiguous Allocation
• Each file occupies a set of contiguous blocks on the disk
  – Simple – only starting location and length are required
  – Random access
  – Wasteful of space (dynamic storage-allocation problem)
  – Files cannot grow
• Mapping from logical to physical: divide the logical address LA by 512 to get the quotient $Q = \lfloor LA / 512 \rfloor$ and the remainder $R = LA \bmod 512$
  – Block to be accessed = Q + starting address
  – Displacement into block = R

A contiguous file

A contiguous file – cont.

Allocation Strategies
• Best fit
  – Chooses the smallest contiguous block that is large enough
• First fit
  – Chooses the first contiguous block that is large enough
• Worst fit
  – Chooses the largest contiguous block that is large enough

Keeping a file in pieces
• We need a block pointer for each logical block, that is, an array of block pointers
  – block mapping indexes into this array
  – Each file is a linked list of disk blocks
• But where do we keep this array?
  – usually it is not kept as a contiguous array
  – the array of disk pointers is like a second related file (that is 1/1024 as big)

Block pointers in the file descriptor

Block pointers in contiguous disk blocks

Block pointers in the blocks

Block pointers in the blocks – cont.

Block pointers in an index block

Block pointers in an index block – cont.

Chained index blocks

Two-level index blocks

Two-level index blocks – cont.
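
As a rough illustration of the contiguous-allocation mapping and the three allocation strategies from the slides above, here is a minimal Python sketch. The 512-byte block size comes from the LA/512 mapping; the function names `contiguous_map` and `choose_hole` and the list of free extents are assumptions made for this example, not part of the course material.

```python
BLOCK_SIZE = 512  # bytes per block, taken from the LA/512 mapping on the slide


def contiguous_map(logical_address, start_block, length_blocks):
    """Map a byte offset in a contiguous file to (physical block, displacement)."""
    q, r = divmod(logical_address, BLOCK_SIZE)  # Q = LA // 512, R = LA % 512
    if logical_address < 0 or q >= length_blocks:
        raise ValueError("logical address is outside the file")
    return start_block + q, r  # block to access = Q + starting address


def choose_hole(holes, needed, strategy):
    """Pick a free extent that can hold `needed` contiguous blocks.

    `holes` is a hypothetical free-extent list of (start_block, length)
    pairs in disk order. Returns the chosen pair, or None if nothing fits.
    """
    candidates = [h for h in holes if h[1] >= needed]
    if not candidates:
        return None
    if strategy == "first":   # first extent that is large enough
        return candidates[0]
    if strategy == "best":    # smallest extent that is large enough
        return min(candidates, key=lambda h: h[1])
    if strategy == "worst":   # largest extent
        return max(candidates, key=lambda h: h[1])
    raise ValueError("unknown strategy: " + strategy)


if __name__ == "__main__":
    # Byte 1300 of a 4-block file that starts at disk block 20.
    print(contiguous_map(1300, start_block=20, length_blocks=4))
    holes = [(0, 5), (10, 3), (30, 12)]
    for strategy in ("first", "best", "worst"):
        print(strategy, choose_hole(holes, 3, strategy))
```

The bounds check in `contiguous_map` reflects the "files cannot grow" limitation: an address past the allocated length has no block to map to without relocating the file.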
[Figure: two-level index blocks, with a primary index pointing to a secondary index table that points to the data blocks]

UNIX Hybrid Method
[Figure: an inode holds mode, owner, …, direct blocks 0 through 11, and single, double, and triple indirect pointers. Direct pointers lead straight to data blocks; the indirect pointers lead to data blocks through one, two, and three levels of index blocks.]

Inverted disk block index (FAT)

DOS FAT Files
[Figure: a file descriptor pointing to block 43 and disk blocks 43, 254, …, 107, shown first with the links between the blocks and then with the links held in the File Allocation Table (FAT)]

Free-Space Management
• Bit vector (n blocks, numbered 0, 1, 2, …, n-1)
  – bit[i] = 1 if block[i] is free
  – bit[i] = 0 if block[i] is occupied
• First free block number = (number of bits per word) * (number of 0-value words) + offset of first 1 bit

Free-Space Management - cont.
• Bit map requires extra space. Example:
  – block size = $2^{12}$ bytes
  – disk size = $2^{30}$ bytes (1 gigabyte)
  – n = $2^{30} / 2^{12} = 2^{18}$ bits (or 32K bytes)
• Easy to get contiguous files
• Linked list (free list)
  – Cannot get contiguous space easily
  – No waste of space

Free list organization

Free-Space Management - cont.
• Need to protect:
  – Pointer to free list
  – Bit map
    • Must be kept on disk
    • Copy in memory and copy on disk may differ.
    • Cannot allow block[i] to be in a situation where bit[i] = 0 in memory and bit[i] = 1 on disk.
  – Solution:
    • Set bit[i] = 0 on disk.
    • Allocate block[i]
    • Set bit[i] = 0 in memory

Implementing Low Level Files
• Secondary storage device contains:
  – Volume directory (sometimes a root directory for a file system)
  – External file descriptor for each file
  – The file contents
• Manages blocks
  – Assigns blocks to files (descriptor keeps track)
  – Keeps track of available blocks
• Maps to/from byte stream

Disk Organization
[Figure: Boot Sector and Volume Directory, followed by blocks Blk 0, Blk 1, …, Blk k-1 on Track 0, Cylinder 0; Blk k, Blk k+1, …, Blk 2k-1 on Track 0, Cylinder 1; and so on through Track 1, Cylinder 0, …, Track N-1, Cylinder 0, …, Track N-1, Cylinder M-1]

Low-level File System Architecture
[Figure: Block 0 and b0, b1, b2, b3, …, bn-1 laid out on a Sequential Device and on a Randomly Accessed Device]

File Descriptors
• External name
• Current state
• Sharable
• Owner
• User
• Locks
• Protection settings
• Length
• Time of creation
• Time of last modification
• Time of last access
• Reference count
• Storage device details

An open() Operation
• Locate the on-device (external) file descriptor
• Extract info needed to read/write file
• Authenticate that process can access the file
• Create an internal file descriptor in primary memory
• Create an entry in a “per process” open file status table
• Allocate resources, e.g., buffers, to support file usage

File Manager Data Structures
[Figure: External File Descriptor, Open File Descriptor, and Process-File Session, with three numbered steps]
1. Copy info from the external to the open file descriptor
2. Keep the state of the process-file session
3. Return a reference to the data structure

Opening a UNIX File
fid = open("fileA", flags);
…
read(fid, buffer, len);
[Figure labels: per-process descriptor table entries 0 (stdin), 1 (stdout), 2 (stderr), 3 (...)]
[Figure labels: Open File Table, File structure, Internal File Descriptor, inode, On-Device File Descriptor]

Reading and Writing the Byte Stream
• Two stages
  – Reading bytes into or writing bytes out of the memory copy of the block
  – Reading the physical blocks into or writing them out of memory from/to storage devices
  – The unpacking (unmarshalling) procedure converts secondary storage blocks into a byte stream
  – The packing (marshalling) procedure converts a byte stream into blocks

Marshalling the Byte Stream
• Must read at least one buffer ahead on input
• Must write at least one buffer behind on output
• Seek requires flushing the current buffer and finding the correct one to load into memory
• Inserting/deleting bytes in the interior of the stream

File Block Buffering
• Storage devices use block I/O and files place an explicit order on the bytes
  – Therefore, it is possible to predict what is likely to be read after a byte
  – When a file is opened, the manager reads as many blocks ahead as feasible
  – After a block is logically written, it is queued for writing behind, whenever the disk is available
• Buffering has an enormous effect on the overall performance of the system
  – Buffer pool – usually variably sized, depending on virtual memory needs
    • Interaction with the device manager and memory manager

Supporting Other Storage Abstractions
• Low-level file systems avoid encoding record-level functionality
  – If applications use very large or very small records, a generic file manager may not be efficient
  – Some operating systems provide a higher-layer file system to support applications with large or small files
  – Database management systems and multimedia documents are examples

Structured Files

Record-Oriented Sequential Files

Electronic Mail Example

Indexed Sequential Files

Database Management Systems
• A database is
a very highly structured set of information
  – Stored across different files
  – Optimized to minimize access time
• DBMS implementation
  – Some DBMSs use the normal files provided by the OS for generic use
  – Some use their own storage device blocks

Disk compaction

Memory-mapped Files
• A file’s contents are mapped directly into the virtual address space
  – Files can be read from or written to by referencing the corresponding virtual addresses
• Memory-mapped files are very useful when a file is shared or accessed repeatedly

Memory-mapped Files – cont.

Directories
• A directory is a set of logically associated files and other directories of files
  – Directories are the mechanism we use to organize files
• The file manager provides a set of commands to manage directories
  – Traverse a directory
  – Enumerate a list of all files and nested directories

Directory Structures
• How should files be organized within a directory?
  – Flat name space
    • All files appear in a single directory
  – Hierarchical name space
    • Directory contains files and subdirectories
    • Each file/directory appears as an entry in exactly one other directory (a tree)
    • Popular variant: All directories form a tree, but a file can have multiple parents.

Directory Structures

Directory Structures – cont.

A directory tree

Directory Implementation
• Device Directory
  – A device can contain a collection of files
  – Easier to manage if there is a root for every file on the device (the device root directory)
• File Directory
  – Typical implementations have directories implemented as a file with a special format
  – Entries in a file directory are handles for other files (which can be files or subdirectories)

Directory Implementation
• Linear list of file names with pointers to the data blocks.
  – simple to program
  – time-consuming to execute
• Hash Table – linear list with hash data structure.
  – decreases directory search time
  – collisions – situations where two file names hash to the same location
  – fixed size

Mounting file systems
• Each file system has a root directory
• We can combine file systems by mounting
  – that is, link a directory in one file system to the root directory of another file system
• This allows us to build a single tree out of several file systems
• This can also be done across a network, mounting file systems on other machines

UNIX mount Command
[Figure: a root file system (/ with bin, usr, etc, foo, bill, nutt) and a separate file system FS (/ with abc, cde, xyz, blah), shown before and after "mount FS at foo"]

Mounting a file system

VFS-based File Manager
[Figure: the file manager exports an OS-specific API; below it sit the File System Independent Part of the File Manager and the Virtual File System Switch, which dispatches to the MS-DOS, ISO 9660, …, and ext2 parts of the file manager]

NFS Architecture

Summary
• File storage methods
  – Contiguous files
  – File pointers in the file descriptor
  – Contiguous file pointers
  – Chained data blocks
  – Chained single index blocks
  – Double index blocks
  – Triple index blocks
  – Hybrid solutions
• Directories
• File Systems
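
To make the mounting idea from the slides above more concrete (linking a directory in one file system to the root directory of another), here is a small Python sketch. It is only a toy model: nested dicts, which are hash tables, stand in for directories, strings stand in for file contents, and the `Namespace` class, its `mount` and `lookup` methods, and the sample trees are all hypothetical names chosen for this illustration.

```python
def split(path):
    """Split '/a/b/c' into ['a', 'b', 'c'], ignoring empty components."""
    return [part for part in path.split("/") if part]


class Namespace:
    """A single directory tree built from several toy file systems."""

    def __init__(self, root_dir):
        self.root_dir = root_dir  # dict: name -> dict (directory) or str (file)
        self.mounts = {}          # mount-point path (tuple) -> root dir of mounted FS

    def mount(self, mount_point, fs_root):
        """Link the directory at `mount_point` to the root of another file system."""
        if not isinstance(self.lookup(mount_point), dict):
            raise NotADirectoryError(mount_point)
        self.mounts[tuple(split(mount_point))] = fs_root

    def lookup(self, path):
        """Walk the path one component at a time, crossing mount points."""
        node = self.root_dir
        walked = []
        for name in split(path):
            if not isinstance(node, dict):
                raise NotADirectoryError("/" + "/".join(walked))
            if name not in node:
                raise FileNotFoundError(path)
            node = node[name]
            walked.append(name)
            # If this directory is a mount point, continue from the mounted root.
            node = self.mounts.get(tuple(walked), node)
        return node


if __name__ == "__main__":
    # Sample trees loosely named after the mount figure; the exact layout is assumed.
    root = {"bin": {}, "usr": {}, "etc": {}, "foo": {"old": "hidden after mount"}}
    fs = {"abc": "contents of abc", "cde": {}, "xyz": {"blah": "contents of blah"}}
    ns = Namespace(root)
    ns.mount("/foo", fs)
    print(ns.lookup("/foo/xyz/blah"))
```

In this sketch, whatever the mount-point directory held before the mount becomes unreachable while the other file system is mounted there, which matches the usual UNIX behavior; a real file manager would keep the mount table in the kernel rather than in a per-object dictionary.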