File Systems Examples

MS-DOS File System
• Naming: 8+3 in upper case
• Directories: hierarchical directory structure
  – No soft or hard links
  – 32-byte directory entry
  – Max file size: 2^32 = 4 GB (not possible due to other reasons)

FAT
• MS-DOS uses a File Allocation Table (FAT)
  – FAT-12, FAT-16, FAT-32 based on # bits in disk address
• Disk block is some multiple of 512 bytes
  – Block size also known as cluster size in Microsoft terminology
  – 1st MS-DOS version used FAT-12 with 512-byte blocks
    • Partition size: 2^12 * 512, actually 4086 * 512 ~ 2 MB
    • Memory space: 4096 entries of 2 bytes each
  – Later versions had variable disk block sizes → 16 MB partitions
• MS-DOS supported 4 disk partitions → 64 MB disks
• MS-DOS uses FAT to track free disk blocks
  – Mark free blocks with special code
  – Does not need a free list or bitmap

FAT-16 and FAT-32
• FAT-16: 16-bit disk pointers, with varying block sizes
  – Block sizes from 512 bytes to 32 KB were supported
  – Memory requirement: 2^16 * 2 bytes = 128 KB
  – Largest disk partition: 2^16 * 32 KB ~ 2 GB
    • Total of 8 GB disk space
• FAT-32: 28-bit disk pointers
  – Introduced with 2nd version of Windows 95
  – Theoretical partition size: 2^28 * 32 KB
    • Internal representation using 32 bits and 512-byte sectors → 2 TB
  – Advantages over FAT-16, other than larger disks:
    • 8 GB can be a single partition
    • Smaller block size can be used for same disk partition

FAT Comparison
(figure)

Windows 98 FS
• Uses FAT-32 and long file names
  – Also used in Windows Me
• New directory structure: (figure)
• How to store long file names?
  – Challenge: compatibility with earlier DOS versions
  – Solution: 2 names: DOS compliant (8 + 3), and original name
    • SampleFile is also known as SAMPLE~1

Storing Long File Names
• Use long file name fields to store the longer file name
  – Attributes field ensures that MS-DOS ignores these entries
(figure; one labeled field: Checksum)

NTFS
• Partition also called a volume
  – Cluster size from 512 bytes to 64 KB, usually 4 KB used
• Addressing: uses logical cluster numbers
• File structure: object with certain attributes
  – Opposed to stream of bytes in MS-DOS and UNIX
  – User data in data attributes
• NTFS disk structure: (figure)
  – 12% allocated for MFT area

Master File Table (MFT)
• Every system component is a file, and the MFT is the most important
  – Has information on every other file on the disk
• MFT divided into records of fixed size (1 to 4 KB)
  – Each record corresponds to a file
  – Each file described by one or more records
• Small (resident) attributes are stored in MFT record
  – For small files, even data might be stored in MFT record
• Large (nonresident) attributes stored on disk
  – Pointer stored in MFT
• For files with large # of attributes, or high fragmentation
  – Base file record has info on other records with file info

MFT Record Attributes
(figure)

MFT Record for 3-run, 9-block file
(figure)

Storing Large Files
(figure)

Other Details
• Each file in NTFS has unique ID called File Reference
  – 64 bits in length with 48-bit file number, 16-bit sequence number
  – File number is array slot in MFT for that file
  – Sequence number incremented on every MFT reuse
    • Used for internal consistency checks
• Directory structure
  – As a B+ tree: no tree reorganization, height of all leaves same
  – Index root of dir contains top level of B+ tree
    • Might point to disk extents (sequences of contiguous blocks) for large dirs
  – Each dir entry has name, file ref., copy of update timestamp, size

NTFS vs FAT directory structure
(figure)

NTFS Metadata
• First 16 NTFS files are system files, called metafiles
• Disk structure:
  – The first NTFS file is the MFT file
  – Second file contains copy of 1st 16 entries of MFT
  – Next few files are also special
    • $LogFile: metadata updates to FS
    • $Volume: housekeeping info
    • $Attrdef: list of attributes on volume
    • $. : root directory
    • $Bitmap: volume free space bitmap
    • $Boot: boot sector
    • $Quota: user rights on disk space (from NT5)

NTFS Journaling
• All FS data structure updates done within transactions
  – Before altering a data structure, write redo and undo information
  – Write commit record to log after a successful update
• Restore FS data structure after a crash by processing log
  – Redo committed transactions, and undo unsuccessful ones
• Periodically write a checkpoint record to log
  – Log records before checkpoint not required
• Note: journaling does not ensure file data consistency

WinFS
• Bridge worlds of file systems, relational DBs, objects, XML
  – E.g., store your data as object relations, which is also useful to other apps
  – Just too much work to be done!
    • Rewrite OS (apps), compatibility with NTFS, …

Unix V7 FS
• Structured as a tree, starting at the root
  – File names are 14 ASCII chars, other than / and NUL
• Directories: one entry for each file
  – Each entry has 2 fields: file name (14 bytes) & i-node # (2 bytes)
    • Number of files is 2^16 = 64 K

Unix I-nodes
• I-node attributes: size, times (creation, modification, last access), owner, group, protection, num. of link pointers
(figure)

Steps for /usr/ast/mbox
(figure)

Linux ext2 File System
• Very common on Linux systems
• Extends the Unix FS
  – Generic code in /usr/src/linux/fs
  – ext2-specific code in /usr/src/linux/fs/ext2
• Disk partitioned into “block groups”
  – Ext2fs block structure: (figure)
    • Similar to FFS cylinder groups

The Linux Ext2fs File System
• File allocation: first select a block group
  – For data blocks, tries to allocate block in the group with file’s i-node
  – For i-node allocations, selects group of file’s parent dir
  – Directory files are dispersed across all block groups
• Within block groups, tries to allocate contiguous blocks
  – Uses a bitmap for free blocks in the group
  – For first blocks of a file, searches for a free block from the beginning
  – For extending a file, searches for a free byte in bitmap from last alloc
  – If no free byte, then chooses any free bit
  – After choosing block, backtracks to not leave holes!

Ext2fs Block-Allocation Policies
(figure)

Linux Virtual File System
• Operations differ on various OSes
  – Linux provides abstraction called virtual file system (VFS)
• The Linux VFS is designed around OO principles and is composed of two components:
  – Set of definitions defining what a file object is allowed to look like
    • inode-object & file-object structures represent individual files
    • the file system object represents an entire file system
  – A layer of software to manipulate those objects
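
The two VFS components can be illustrated with a small Python sketch. It is not the Linux kernel interface: the class names FileSystem, MemFS, UpperCaseFS and VFS are hypothetical, and a real VFS also defines inode and file objects, which are omitted here. The abstract class plays the role of the "set of definitions", and the VFS class plays the role of the "layer of software" that dispatches to whichever file system is mounted.

```python
from abc import ABC, abstractmethod


class FileSystem(ABC):
    """Hypothetical 'file system object': the operations every back end must offer."""

    @abstractmethod
    def read(self, path):
        ...

    @abstractmethod
    def write(self, path, data):
        ...


class MemFS(FileSystem):
    """Toy back end that keeps file contents in a dict."""

    def __init__(self):
        self._files = {}

    def read(self, path):
        return self._files[path]

    def write(self, path, data):
        self._files[path] = bytes(data)


class UpperCaseFS(MemFS):
    """Toy back end that folds names to upper case, in the spirit of MS-DOS."""

    def read(self, path):
        return super().read(path.upper())

    def write(self, path, data):
        super().write(path.upper(), data)


class VFS:
    """Generic layer: picks the file system by mount point, then delegates."""

    def __init__(self):
        self._mounts = {}

    def mount(self, prefix, fs):
        # Store prefixes with a trailing slash so "/mnt/dos" cannot match "/mnt/dosx".
        self._mounts[prefix.rstrip("/") + "/"] = fs

    def _resolve(self, path):
        # The longest matching mount prefix wins, so "/mnt/dos/" beats "/".
        best = max((p for p in self._mounts if path.startswith(p)), key=len)
        return self._mounts[best], "/" + path[len(best):]

    def read(self, path):
        fs, rest = self._resolve(path)
        return fs.read(rest)

    def write(self, path, data):
        fs, rest = self._resolve(path)
        fs.write(rest, data)
```

Callers only ever talk to VFS; mounting a MemFS at "/" and an UpperCaseFS at "/mnt/dos" lets the same read and write calls reach two back ends with different naming rules.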
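
The ext2 rule for extending a file (look for a free byte from the last allocation, fall back to any free bit, then backtrack) can be sketched as follows. This is a simplified illustration, not the kernel code: the function name allocate_block is hypothetical, bit 0 of each byte is assumed to stand for the lowest-numbered block, a set bit is assumed to mean "in use", and the backtracking here runs all the way to the previous used block.

```python
def allocate_block(bitmap, goal):
    """Allocate one block in a group's bitmap (bytearray); return its index or -1.

    goal is the block number of the file's last allocation.
    """
    def is_free(block):
        return ((bitmap[block // 8] >> (block % 8)) & 1) == 0

    chosen = -1

    # 1. Prefer a completely free byte (8 contiguous free blocks) at or after the goal.
    for byte_index in range(goal // 8, len(bitmap)):
        if bitmap[byte_index] == 0:
            chosen = byte_index * 8
            break

    # 2. Otherwise settle for any free bit in the group.
    if chosen < 0:
        for block in range(len(bitmap) * 8):
            if is_free(block):
                chosen = block
                break

    if chosen < 0:
        return -1  # the group is full

    # 3. Backtrack over free blocks just before the choice so no hole is left behind.
    while chosen > 0 and is_free(chosen - 1):
        chosen -= 1

    bitmap[chosen // 8] |= 1 << (chosen % 8)  # mark the block as used
    return chosen
```

For example, with blocks 0 to 3 used and the goal at block 8, step 1 picks the free byte starting at block 8, and step 3 then walks back to block 4, directly after the used run.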
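
The size limits quoted on the FAT slides all follow from one product: number of addressable blocks times block size. The short Python check below reproduces those figures; the helper names max_partition_bytes and fat_table_bytes are made up for this illustration, and the binary units (1 KB = 2^10 bytes) are an assumption consistent with the numbers on the slides.

```python
KB, MB, GB, TB = 2**10, 2**20, 2**30, 2**40


def max_partition_bytes(pointer_bits, block_bytes):
    """Largest partition addressable with pointer_bits-bit block numbers."""
    return (2 ** pointer_bits) * block_bytes


def fat_table_bytes(pointer_bits, entry_bytes):
    """Memory needed to hold the whole FAT, one entry per block."""
    return (2 ** pointer_bits) * entry_bytes


# FAT-12: 512-byte blocks give about 2 MB; 4 KB blocks give 16 MB per partition.
fat12_small = max_partition_bytes(12, 512)
fat12_large = max_partition_bytes(12, 4 * KB)

# FAT-16: the table costs 128 KB; 32 KB blocks give 2 GB per partition.
fat16_table = fat_table_bytes(16, 2)
fat16_partition = max_partition_bytes(16, 32 * KB)

# FAT-32: 28-bit pointers with 32 KB blocks, versus the 32-bit sector count limit.
fat32_theoretical = max_partition_bytes(28, 32 * KB)
fat32_sector_limit = (2 ** 32) * 512
```

With four partitions per disk, the per-partition limits scale to 64 MB for FAT-12 with 4 KB blocks and 8 GB for FAT-16, matching the slides. The FAT-32 product comes to 8 TB, which is why the 2 TB sector-count limit is the one that applies in practice.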