File System Implementation
CISC3595, Spring 2015
Objectives for a File Management System
• Meet the data management needs of the user
• Provide I/O support for a variety of storage device types
• Provide a standardized set of I/O interface routines to user processes
• Provide I/O support for multiple users (if needed)
• Guarantee that the data in the file are valid
• Minimize lost or destroyed data
• Optimize performance
Requirements for a general purpose system
1. User should be able to create, delete, read, write and modify files
2. User may have controlled access to other users’ files
3. User may control what type of accesses are allowed to his/her files
4. User should be able to restructure his/her files
5. User should be able to move data between files
6. User should be able to back up and recover files in case of damage
7. User should be able to access files using symbolic names
Virtual File Systems
• Virtual File Systems (VFS): the same system call interface (API) is used for different types of concrete file systems
• Support numerous file system types:
  • ext2, ufs, fat, vfat, hpfs, minix, isofs, sysv, hfs, affs, NTFS
  • /proc file system
  • NFS, Coda, AFS, ncpfs
  • umsdos, userfs
Virtual File Systems (1)
Figure 4-18. Position of the virtual file system.
Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved. 0-13-6006639
Mount
• Various file systems are mounted at different directories (mount points) in the file system name space

$ mount
/dev/sda3 on / type ext3 (rw,relatime,errors=remount-ro)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
varrun on /var/run type tmpfs (rw,nosuid,mode=0755)
varlock on /var/lock type tmpfs (rw,noexec,nosuid,nodev,mode=1777)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
fusectl on /sys/fs/fuse/connections type fusectl (rw)
lrm on /lib/modules/2.6.28-11-generic/volatile type tmpfs (rw,mode=755)
securityfs on /sys/kernel/security type securityfs (rw)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,noexec,nosuid,nodev)
gvfs-fuse-daemon on /home/zhang/.gvfs type fuse.gvfs-fuse-daemon (rw,nosuid,nodev,user=zhang)
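One way to see how mount points partition the name space is to ask which mounted file system serves a given path. The sketch below picks the longest mount point that is a prefix of the path. The `MOUNTS` table and the `find_mount` helper are hypothetical names for illustration; the entries are a subset of the mount listing above, and a real kernel resolves this during path traversal rather than by string matching.

```python
# Hypothetical mount table: mount point -> file system type
# (a subset of the entries shown in the mount listing).
MOUNTS = {
    "/": "ext3",
    "/proc": "proc",
    "/sys": "sysfs",
    "/dev": "tmpfs",
    "/dev/shm": "tmpfs",
    "/dev/pts": "devpts",
}

def find_mount(path, mounts=MOUNTS):
    """Return (mount_point, fs_type) for the longest mount point covering path."""
    best = "/"
    for mp in mounts:
        # A mount point covers the path if the path equals it or lies beneath it.
        if path == mp or path.startswith(mp.rstrip("/") + "/"):
            if len(mp) > len(best):
                best = mp
    return best, mounts[best]
```

For example, `find_mount("/dev/pts/0")` resolves to the devpts file system, while a path such as `/home/zhang/a.txt` falls through to the root ext3 file system.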
Hard Disk
• Accessing a hard disk:
  • Seek time: in ms (moving the read head to the track)
  • Rotation time: in ms (waiting until the sector rotates under the read head)
  • Transmission time: (# of bytes) / (transfer speed), with transfer speeds in the hundreds of MB/s
• vs. accessing RAM: 10 ns for every 32 bits
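To make the gap between disk and RAM concrete, the three disk components can be added up and compared with the RAM figure from the slide. This is a rough sketch: the function names are made up for illustration, the rotational delay is assumed to average half a rotation, and the sample numbers in the usage note (5 ms seek, 8.33 ms per rotation, 100 MB/s) are assumed values, not measurements.

```python
def disk_access_ms(nbytes, seek_ms, rotation_ms, transfer_mb_per_s):
    """Estimated time (ms) to read nbytes: seek + rotational delay + transfer."""
    rotational_delay_ms = rotation_ms / 2      # assumption: wait half a rotation on average
    transfer_ms = nbytes / (transfer_mb_per_s * 1e6) * 1000
    return seek_ms + rotational_delay_ms + transfer_ms

def ram_access_ms(nbytes, ns_per_word=10, word_bytes=4):
    """Time (ms) to read nbytes from RAM at 10 ns per 32-bit word."""
    words = -(-nbytes // word_bytes)           # ceiling division
    return words * ns_per_word / 1e6
```

With the assumed numbers, `disk_access_ms(4096, 5, 8.33, 100)` is a little over 9 ms, almost all of it seek and rotation, while `ram_access_ms(4096)` is about 0.01 ms.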
Block
• File system block: the allocation unit of disk storage space
  • Also the transfer unit when reading/writing the disk
  • Similar concept in paged memory management: the page
  • Always a power of two: 512, 1024, 2048, 4096, …
• Choosing block size:
  • Small block size => ?
  • Large block size => ?
• In this class (chapter of the book), assume an abstract view of the disk:
  • A disk is nothing but an “array” of blocks
  • We can read block k, write block k
• We want to support the file system services we learnt last week
  • Hierarchical structure
Disk Space Management Block Size (1)
Figure 4-20. Percentage of files smaller than a given size
(in bytes).
Disk Space Management Block Size (2)
For more info, see book p. 263.
Data rate = block size / (seek time + rotational delay + transfer time)
Figure 4-21. The solid curve (left-hand scale) gives the data rate of
a disk. The dashed curve (right-hand scale) gives the disk
space efficiency. All files are 4 KB.
File System Layout
• Boot block: used to boot the operating system (i.e., load OS code into RAM)
• Superblock: keeps file system parameters
• Free space mgmt: records which blocks in this partition are free
• i-nodes: metadata and addresses of the blocks allocated to each file/directory
Figure 4-9. A possible file system layout.
Outline
• Abstract view of hard disk
• Disk space allocation and management
• Efficiency and Performance
• Recovery
• NFS
Allocation Methods
• Allocate disk space to files:
  • Contiguous allocation
  • Linked allocation
  • Indexed allocation
Contiguous Allocation
• Each file occupies a set of contiguous blocks on the disk
• Pros:
  • Simple: only starting location (block #) and length (number of blocks) are required
  • Random access
• Cons:
  • Wasteful of space: dynamic storage-allocation problem (how to satisfy a request from a list of non-contiguous free holes)
  • External fragmentation
  • Files cannot grow
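The dynamic storage-allocation problem can be illustrated with a first-fit search over the free holes. First fit is only one possible policy (best fit and worst fit are alternatives), and the `first_fit` helper and its `(start, length)` hole representation are assumptions made for this sketch.

```python
def first_fit(holes, nblocks):
    """Allocate nblocks contiguous blocks from the first hole that is large enough.

    holes: list of (start, length) free extents, modified in place.
    Returns the starting block number, or None if no single hole fits.
    """
    for i, (start, length) in enumerate(holes):
        if length >= nblocks:
            if length == nblocks:
                del holes[i]                              # hole used up exactly
            else:
                holes[i] = (start + nblocks, length - nblocks)
            return start
    return None  # external fragmentation: free space exists, but not contiguously
```

With holes `[(0, 3), (10, 5), (20, 2)]`, a request for 4 blocks is placed at block 10. A second request for 4 blocks then fails even though 6 blocks remain free in total, which is the external fragmentation listed under the cons.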
Linked Allocation
• Each file is a linked list of disk blocks: blocks may be scattered anywhere on the disk.
• Directory contains pointers to the first and last blocks of the file.
  • The pointer to the next block is stored in the block itself
  • The amount of data stored in each block is no longer a power of two
Linked List Allocation
Figure 4-11. Storing a file as a linked list of disk blocks.
File-Allocation Table
• File-allocation table (FAT)
  • Collectively stores the “next block” pointers for the entire file system in one table
  • Used in MS-DOS and OS/2 (FAT12, FAT16, FAT32)
• One FAT per partition
  • One entry for each disk block
  • Each entry stores the pointer to the next block in the file/directory
• For each file, the directory only needs to store the block # of the first block
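Reading a file under this scheme means following "next block" entries in the table, starting from the first block recorded for the file. The sketch below uses a Python dict as the table; the `EOF` marker value and the block numbers are illustrative assumptions.

```python
EOF = -1  # assumed end-of-chain marker; real FAT variants reserve special values

def file_blocks(fat, first_block):
    """Return the list of disk blocks of a file by following the FAT chain."""
    blocks = []
    block = first_block
    while block != EOF:
        blocks.append(block)
        block = fat[block]      # the table entry holds the number of the next block
    return blocks

# Illustrative table holding two files, one starting at block 4, one at block 6.
fat = {4: 7, 7: 2, 2: 10, 10: 12, 12: EOF,
       6: 3, 3: 11, 11: 14, 14: EOF}
```

Here `file_blocks(fat, 4)` walks the chain 4, 7, 2, 10, 12. Random access to the n-th block still requires n table lookups, but they happen in the table rather than by reading every data block.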
Linked List Allocation Using a Table in Memory
Figure 4-12. Linked list allocation using a file allocation table
in main memory.
MS-DOS File System
Maximum partition size for different block sizes, by # of bits per FAT entry (table not reproduced). The empty boxes represent forbidden combinations.
Indexed Allocation (i-node, index-node)
• Brings all pointers (block #s) belonging to a file/directory together into an index block.
• Each file has its own index block, or i-node (used in Unix file systems)
Indexed Allocation
entry for a file
Combined Scheme: UNIX inode (4K bytes per block)
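A quick calculation shows how far the combined scheme reaches with 4 KB blocks. The figure's exact layout is not reproduced here, so the sketch assumes 12 direct pointers, one single, one double and one triple indirect pointer, and 4-byte block pointers; all three parameters are assumptions and can be changed.

```python
def max_file_bytes(block_size=4096, ptr_size=4, n_direct=12):
    """Largest file addressable with direct + single/double/triple indirect pointers."""
    ptrs = block_size // ptr_size               # pointers that fit in one index block
    return (n_direct + ptrs + ptrs**2 + ptrs**3) * block_size

def pointer_level(block_index, block_size=4096, ptr_size=4, n_direct=12):
    """Return which kind of pointer reaches the given logical block of a file."""
    ptrs = block_size // ptr_size
    if block_index < n_direct:
        return "direct"
    block_index -= n_direct
    span = ptrs                                  # blocks covered by the current level
    for level in ("single indirect", "double indirect", "triple indirect"):
        if block_index < span:
            return level
        block_index -= span
        span *= ptrs
    raise ValueError("block index beyond maximum file size")
```

Under these assumptions the first 48 KB of a file are reachable through direct pointers, and the triple indirect pointer accounts for almost all of the roughly 4 TB maximum.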
Free-Space Management (Cont.)
• Linked list (free list)
  • Cannot get contiguous space easily
  • No waste of space
• Grouping
  • First free block contains the addresses of n free blocks
  • The n-th block therein contains the addresses of another n free blocks, …
• Counting
  • Free blocks might be contiguous
  • Keep starting block # and length
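The counting approach can be sketched by collapsing a list of free block numbers into (starting block, length) pairs. The `to_runs` helper is a hypothetical name used only for this illustration.

```python
def to_runs(free_blocks):
    """Collapse free block numbers into (start, length) runs of contiguous blocks."""
    runs = []
    for b in sorted(free_blocks):
        if runs and runs[-1][0] + runs[-1][1] == b:
            # b directly follows the last run, so extend that run by one block
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            runs.append((b, 1))
    return runs
```

For the free blocks 2, 3, 4, 5, 8, 9 and 17 this yields three entries instead of seven, which is where the space saving comes from when free blocks tend to be contiguous.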
Implementing Directories (1)
A UNIX V7 directory entry.
Figure 4-14. (a) A simple directory containing fixed-size entries with
the disk addresses and attributes in the directory entry. (b) A
directory in which each entry just refers to an i-node.
Shared Files (1)
Figure 4-16. File system containing a shared file.
Shared Files (2)
Figure 4-17. (a) Situation prior to linking. (b) After the link is
created. (c) After the original owner removes the file.
Caching (1)
Figure 4-28. The buffer cache data structures.
Caching (2)
• Some blocks, such as i-node blocks, are rarely referenced two times within a short interval.
• Consider a modified LRU scheme, taking two factors into account:
  • Is the block likely to be needed again soon?
  • Is the block essential to the consistency of the file system?
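A minimal sketch of such a modified LRU policy, under stated assumptions: blocks unlikely to be reused soon are placed at the eviction end of the list, and blocks critical to consistency are written to disk immediately instead of waiting for eviction. The class name, the `write_block` callback and the two flags are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict

class BufferCache:
    def __init__(self, capacity, write_block):
        self.capacity = capacity
        self.write_block = write_block   # callback(block_no, data): write to disk
        self.blocks = OrderedDict()      # front = next to evict, back = most recent
        self.dirty = set()               # modified blocks not yet written to disk

    def put(self, block_no, data, critical=False, reuse_soon=True):
        self.blocks[block_no] = data
        # Factor 1: blocks not needed again soon go to the front, so they are reused first.
        self.blocks.move_to_end(block_no, last=reuse_soon)
        if critical:
            # Factor 2: blocks essential to consistency are written through at once.
            self.write_block(block_no, data)
            self.dirty.discard(block_no)
        else:
            self.dirty.add(block_no)
        while len(self.blocks) > self.capacity:
            victim, vdata = self.blocks.popitem(last=False)
            if victim in self.dirty:     # write back a modified block before dropping it
                self.write_block(victim, vdata)
                self.dirty.discard(victim)

    def get(self, block_no):
        data = self.blocks.get(block_no)
        if data is not None:
            self.blocks.move_to_end(block_no)   # mark as most recently used
        return data
```

In this sketch an i-node block stored with `critical=True, reuse_soon=False` reaches the disk right away and is the first candidate for eviction.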
The MS-DOS File System (1)
Figure 4-31. The MS-DOS directory entry.
The UNIX V7 File System (1)
Figure 4-33. A UNIX V7 directory entry.
The UNIX V7 File System (2)
Figure 4-34. A UNIX i-node.
The UNIX V7 File System (3)
Figure 4-35. The steps in looking up /usr/ast/mbox.
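The lookup proceeds one path component at a time: read the current directory's i-node to find its data block, search that block for the next name, and continue from the i-node number found there. The sketch below models this with two dicts; the i-node and block numbers, the root i-node number and the single-block directories are illustrative assumptions.

```python
ROOT_INODE = 1  # assumed i-node number of the root directory

# i-node number -> disk block holding that directory's entries (illustrative)
inodes = {1: 1, 6: 132, 26: 406}

# disk block -> directory entries stored in it (name -> i-node number)
blocks = {
    1: {"bin": 4, "usr": 6},
    132: {"ast": 26},
    406: {"mbox": 60},
}

def lookup(path):
    """Return the i-node number for an absolute path, one component at a time."""
    inode = ROOT_INODE
    for name in path.strip("/").split("/"):
        if not name:
            continue                         # path was just "/"
        directory = blocks[inodes[inode]]    # read the directory's block via its i-node
        inode = directory[name]              # KeyError if the name does not exist
    return inode
```

With this data, `lookup("/usr/ast/mbox")` visits three directories and returns i-node 60.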
File system structures in memory
• Mount table: contains info about each mounted volume
• In-memory directory-structure cache: contains recently accessed directory info
  • For a directory that is a mount point, contains a flag indicating it’s a mount point, and a pointer to an entry in the mount table
• System-wide open-file table: a copy of the i-node for each open file
• Per-process open-file table: contains a pointer to the appropriate entry in the system-wide open-file table
• Buffer: holds file system blocks being read from disk or written to disk
Supporting file system interface: open a file
• Program issues the system call open(), passing a file name
• Logical file system (handler of open()) searches the system-wide open-file table for the file; if not found, it searches the directory structure for the file, caches the directory info, and copies the file’s FCB (file control block) into the system-wide open-file table
• In the per-process open-file table, it creates an entry for the file, storing a pointer to the system-wide open-file table entry, the current location pointer, and access mode info
• Returns a pointer to the per-process open-file table entry, i.e., a file descriptor in Unix, or a file handle in Windows
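The steps of open() can be sketched with two tables. This is a toy model, not kernel code: the class name, the dict-based file control block, and the use of the list index as the file descriptor are all assumptions made for illustration.

```python
class OpenFileTables:
    def __init__(self, directory):
        self.directory = directory   # hypothetical on-disk directory: name -> control block
        self.system_wide = {}        # name -> {"fcb": ..., "open_count": n}
        self.per_process = {}        # pid -> list of entries; list index = file descriptor

    def open(self, pid, name, mode="r"):
        entry = self.system_wide.get(name)
        if entry is None:
            # Not open anywhere yet: search the directory and copy the control block.
            fcb = self.directory[name]
            entry = {"fcb": dict(fcb), "open_count": 0}
            self.system_wide[name] = entry
        entry["open_count"] += 1
        # Per-process entry: pointer to the shared entry, current offset, access mode.
        table = self.per_process.setdefault(pid, [])
        table.append({"sys_entry": entry, "offset": 0, "mode": mode})
        return len(table) - 1        # the "file descriptor"
```

Two processes opening the same file get separate per-process entries (each with its own offset and mode) that point to one shared system-wide entry.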
Open/Read a file
Outline
• File system introduction
• File system implementation
• Disk space allocation and management
• Efficiency and Performance
• Recovery
• NFS
Directory:
• Contains information about files:
  • File name
  • File type
  • File organisation (for systems that support different organizations)
  • Attributes, ownership
  • Location:
    • Volume: indicates the device on which the file is stored
    • Starting address
    • Size used: current size of the file in bytes, words, or blocks
    • Size allocated: the maximum size of the file
Operations Performed on a Directory
• A directory system should support a number of operations including:
  • Searching
  • Creating files
  • Deleting files
  • Listing the directory
  • Updating the directory
Directory Implementation
• Linear list of file names with pointers to the data blocks
  • Simple to program
  • Time-consuming to search
  • Sorted list? Tree structure?
• Hash table: linear list with a hash table
  • Hash table takes a value computed from the file name and returns a pointer to the file name in the linear list
  • Decreases directory search time
  • Collisions: situations where two file names hash to the same location
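The hash-table variant can be sketched as a linear list of entries plus buckets that hold indices into that list. Chaining is used here to handle collisions, which is one option among several; the class name and the deliberately weak byte-sum hash are assumptions chosen so that a collision is easy to demonstrate.

```python
class HashedDirectory:
    def __init__(self, nbuckets=8):
        self.entries = []                               # linear list of (name, first_block)
        self.buckets = [[] for _ in range(nbuckets)]    # each bucket: indices into entries

    def _bucket(self, name):
        # Toy hash: sum of the name's bytes. Anagrams such as "ab" and "ba" collide.
        return sum(name.encode()) % len(self.buckets)

    def add(self, name, first_block):
        self.entries.append((name, first_block))
        self.buckets[self._bucket(name)].append(len(self.entries) - 1)

    def lookup(self, name):
        # Only the entries in one bucket are compared, not the whole list.
        for i in self.buckets[self._bucket(name)]:
            if self.entries[i][0] == name:
                return self.entries[i][1]
        return None
```

Names that collide end up in the same bucket, and the final string comparison in `lookup` tells them apart.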
Efficiency and Performance
• Efficiency dependent on:
  • Disk allocation and directory algorithms
  • Types of data kept in the file’s directory entry
• Performance
  • Disk cache: separate section of main memory for frequently used blocks
  • Free-behind and read-ahead: techniques to optimize sequential access
  • Improve PC performance by dedicating a section of memory as a virtual disk, or RAM disk
Page Cache
• A page cache caches pages rather than disk blocks using virtual memory techniques
• Memory-mapped I/O uses a page cache
• Routine I/O through the file system uses the buffer (disk) cache
• This leads to the following figure
I/O Without a Unified Buffer Cache
Recovery
• Consistency checking: compares data in the directory structure with data blocks on disk, and tries to fix inconsistencies
• Use system programs to back up data from disk to another storage device (floppy disk, magnetic tape, other magnetic disk, optical)
• Recover a lost file or disk by restoring data from backup
Log Structured File Systems
• Log-structured (or journaling) file systems record each update to the file system as a transaction
• All transactions are written to a log
  • A transaction is considered committed once it is written to the log
  • However, the file system may not yet be updated
• Transactions in the log are asynchronously written to the file system
  • When the file system is modified, the transaction is removed from the log
• If the file system crashes, all remaining transactions in the log must still be performed
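The commit, apply and recover cycle can be sketched with an in-memory log. This is a toy model under stated assumptions: the class and method names are hypothetical, each transaction is a dict of block updates, and a crash is simulated by stopping the checkpoint early. Replaying the log after a crash is safe here because applying a transaction twice gives the same result.

```python
class JournaledStore:
    def __init__(self):
        self.log = []     # committed transactions not yet applied to the file system
        self.disk = {}    # block number -> data (stands in for the file system)

    def commit(self, updates):
        """A transaction counts as committed once it is appended to the log."""
        self.log.append(dict(updates))

    def checkpoint(self, crash_after=None):
        """Apply logged transactions in order; optionally simulate a crash."""
        applied = 0
        while self.log:
            if crash_after is not None and applied == crash_after:
                return False                      # simulated crash: log keeps the rest
            for block, data in self.log[0].items():
                self.disk[block] = data
            self.log.pop(0)                       # remove only after it is fully applied
            applied += 1
        return True

    def recover(self):
        """After a crash, perform every transaction still in the log."""
        return self.checkpoint()
```

A transaction leaves the log only after all of its updates reach the file system, so whatever remains in the log after a crash is exactly the work that still has to be performed.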
The Sun Network File System (NFS)
• An implementation and a specification of a software system for accessing remote files across LANs (or WANs)
• The implementation is part of the Solaris and SunOS operating systems running on Sun workstations, using an unreliable datagram protocol (UDP/IP) and Ethernet
NFS (Cont.)
• Interconnected workstations are viewed as a set of independent machines with independent file systems, which allows sharing among these file systems in a transparent manner
• A remote directory is mounted over a local file system directory
  • The mounted directory looks like an integral subtree of the local file system, replacing the subtree descending from the local directory
  • Specification of the remote directory for the mount operation is nontransparent: the host name of the remote directory has to be provided
  • Files in the remote directory can then be accessed in a transparent manner
• Subject to access-rights accreditation, potentially any file system (or directory within a file system) can be mounted remotely on top of any local directory
NFS (Cont.)
• NFS is designed to operate in a heterogeneous environment of different machines, operating systems, and network architectures; the NFS specification is independent of these media
• This independence is achieved through the use of RPC primitives built on top of an External Data Representation (XDR) protocol used between two implementation-independent interfaces
• The NFS specification distinguishes between the services provided by a mount mechanism and the actual remote-file-access services
NFS Protocol
• Provides a set of remote procedure calls for remote file operations. The procedures support the following operations:
  • Searching for a file within a directory
  • Reading a set of directory entries
  • Manipulating links and directories
  • Accessing file attributes
  • Reading and writing files
• NFS servers are stateless; each request has to provide a full set of arguments (NFS V4 is just coming available: very different, stateful)
• Modified data must be committed to the server’s disk before results are returned to the client (lose advantages of caching)
• The NFS protocol does not provide concurrency-control mechanisms
Three Major Layers of NFS Architecture
• UNIX file-system interface (based on the open, read, write, and close calls, and file descriptors)
• Virtual File System (VFS) layer: distinguishes local files from remote ones, and local files are further distinguished according to their file-system types
  • The VFS activates file-system-specific operations to handle local requests according to their file-system types
  • Calls the NFS protocol procedures for remote requests
• NFS service layer: bottom layer of the architecture
  • Implements the NFS protocol
Schematic View of NFS Architecture
NFS Path-Name Translation
• Performed by breaking the path into component names and performing a separate NFS lookup call for every pair of component name and directory vnode
• To make lookup faster, a directory name lookup cache on the client’s side holds the vnodes for remote directory names
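The effect of the client-side cache can be sketched by counting lookup calls. This is a toy model: the class name, the dict standing in for the server, and the string vnode identifiers are hypothetical, and as a simplification the cache here remembers every component it resolves, not only directories.

```python
class NfsClient:
    def __init__(self, server_dirs, root_vnode):
        self.server_dirs = server_dirs   # stand-in for the server: dir vnode -> {name: vnode}
        self.root = root_vnode
        self.dnlc = {}                   # (directory vnode, name) -> vnode
        self.rpc_count = 0               # number of simulated lookup RPCs sent

    def _lookup_rpc(self, dir_vnode, name):
        """Simulated NFS lookup call: one (directory vnode, component name) pair."""
        self.rpc_count += 1
        return self.server_dirs[dir_vnode][name]

    def translate(self, path):
        vnode = self.root
        for name in filter(None, path.split("/")):
            key = (vnode, name)
            if key not in self.dnlc:                 # cache miss: ask the server
                self.dnlc[key] = self._lookup_rpc(vnode, name)
            vnode = self.dnlc[key]
        return vnode
```

Translating `/usr/ast/mbox` on a cold cache costs three lookup calls; a later translation of `/usr/ast/notes` costs only one, because the two directory components are already cached.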
NFS Remote Operations
• Nearly one-to-one correspondence between regular UNIX system calls and the NFS protocol RPCs (except opening and closing files)
• NFS adheres to the remote-service paradigm, but employs buffering and caching techniques for the sake of performance
• File-blocks cache: when a file is opened, the kernel checks with the remote server whether to fetch or revalidate the cached attributes
  • Cached file blocks are used only if the corresponding cached attributes are up to date
• File-attribute cache: the attribute cache is updated whenever new attributes arrive from the server
• Clients do not free delayed-write blocks until the server confirms that the data have been written to disk