Chapter 11: File System Implementation Objectives

advertisement
Chapter 11: File System
Implementation
Objectives
„ To describe the details of implementing local file systems and
directory structures
„ To describe the implementation of remote file systems
„ To discuss block allocation and free-block algorithms and trade-offs
Operating System Concepts – 7th Edition, Jan 1, 2005
11.2
Silberschatz, Galvin and Gagne ©2005
1
File Concept
„ The file system consists of two distinct parts: collection of files and
a directory structure.
„ The OS provides abstraction of physical properties of disks by
defining a logical storage unit, the file.
„ A file is a collection of related information that is recorded on
secondary storage.
„ Why files, why we don’t save info in the process itself?
z
Limited space
z
When the process terminates, the information is lost
z
Multiple processes must be able to access the information
concurrently.
Operating System Concepts – 7th Edition, Jan 1, 2005
11.3
Silberschatz, Galvin and Gagne ©2005
File Attributes
„ Name – only information kept in human-readable form, For the
convenience of human users.
„ Identifier – unique tag (number) identifies file within file system
„ Type – needed for systems that support different types
„ Location – Pointer to a device and the file location on the device.
„ Size – current file size
„ Protection – controls who can do reading, writing, executing
„ Time, date, and user identification – data for protection, security,
and usage monitoring
„ Information about files are kept in the directory structure, which is
maintained on the disk
z
Usually, a directory entry consists of a filename and its ID. The
ID locates the other attributes.
Operating System Concepts – 7th Edition, Jan 1, 2005
11.4
Silberschatz, Galvin and Gagne ©2005
2
File Operations
„ A file is an abstract data type. System calls in the OS implement basic
file operations:
„ Create: The file is created with no data (just to initialize attributes)
„ Open: Before using a file, a process must open it.
z
to allow the system to fetch the attributes and list of disk addresses
into main memory for rapid access on later calls.
„ Close: to free up internal table space
„ Read: read from the current pointer position
„ Write: Data are written to the file usually at the current position
„ Reposition within file (file seek)
„ Delete
„ Truncate
„ Other possible operations: append, rename, get and set file attributes
„ Combining the above operations to perform other file operation (copying)
Operating System Concepts – 7th Edition, Jan 1, 2005
Silberschatz, Galvin and Gagne ©2005
11.5
Directory Structure
„
A collection of nodes containing information about all files in a file system
(name. location, size)
„
A directory is a system file for maintaining the structure of the file system
Directory
Files
F1
F2
F3
F4
Fn
„
„
„
Both the directory structure and the files reside on disk (or a partition)
A disk can contain multiple file systems
Directories can be organized in several different ways.
Operating System Concepts – 7th Edition, Jan 1, 2005
11.6
Silberschatz, Galvin and Gagne ©2005
3
Operations Performed on Directory
„ Search for a file
„ Create a file
„ Delete a file
„ List a directory
„ Rename a file
Operating System Concepts – 7th Edition, Jan 1, 2005
11.7
Silberschatz, Galvin and Gagne ©2005
Issues for Directory Organization
„ Efficiency – locating a file quickly
„ Naming – convenient to users
z
two users can have same name for different files
z
the same file can have several different names
„ Grouping – logical grouping of files by properties, (e.g., all
Java programs, all games, …)
Operating System Concepts – 7th Edition, Jan 1, 2005
11.8
Silberschatz, Galvin and Gagne ©2005
4
Directory Structure
Single-Level Directory
„ A single directory for all users, in which all files are contained.
„ Was common in early days when there was only one user.
„ Simple and easy to find files (one place to look in)
„ Naming problem
„ Grouping problem
Operating System Concepts – 7th Edition, Jan 1, 2005
11.9
Silberschatz, Galvin and Gagne ©2005
Two-Level Directory
„ Separate directory for each user
„ Path name: defined by the username and the file name.
„ Each OS has specific format for the path name.
„ System files are searched through search path rather than copied to each user directory
„ Different users can have files with the same name
„ Efficient searching
„ No grouping capability
Operating System Concepts – 7th Edition, Jan 1, 2005
11.10
Silberschatz, Galvin and Gagne ©2005
5
Tree-Structured Directories
Operating System Concepts – 7th Edition, Jan 1, 2005
11.11
Silberschatz, Galvin and Gagne ©2005
Tree-Structured Directories (Cont)
„ Efficient searching
„ Grouping Capability (subdirectories)
„ Current directory (working directory): should contain most of the
files of interest to the process, else a user specify new directory
z
Special system calls (cd) are provided to change directories.
„ Absolute or relative path names.
z
Absolute – specify the name starting from the top (root)
directory.
Example:
/users/smith/project1/main.c
z
Relative – specify the name starting from the current directory.
z
Search path for command names.
Example:
main.c
Operating System Concepts – 7th Edition, Jan 1, 2005
11.12
Silberschatz, Galvin and Gagne ©2005
6
Tree-Structured Directories (Cont)
„
„
„
Creating a new file is done in current directory
Deleting a directory – two options:
z Require the directory to be empty before it can be deleted.
z Provide an option to delete the directory along with all of its
files and subdirectories.Delete a file
rm <file-name>
Creating a new subdirectory is done in current directory
mkdir <dir-name>
Example: if in current directory /mail
mkdir count
mail
prog
copy prt exp count
Deleting “mail” ⇒ deleting the entire subtree rooted by “mail”
Operating System Concepts – 7th Edition, Jan 1, 2005
11.13
Silberschatz, Galvin and Gagne ©2005
File-System Structure
„ File system resides permanently on secondary storage (disks)
„ For efficiency, I/O transfer between memory and a disk is
performed on the units of blocks
z
Each block has one or more sector, a sector is usually 512
bytes
„ OS imposes a file system to store and access data. 2 problems:
z
Define how the file system should look to the user. It involves
defining a file, its attributes, operations on a file, directory
structure. Î USER VIEW
z
Creating algorithms and data structure to map the logical file
system into the physical secondary-storage device. Î
SYSTEM VIEW
Operating System Concepts – 7th Edition, Jan 1, 2005
11.14
Silberschatz, Galvin and Gagne ©2005
7
File-System Structure (contd)
„ A file system is generally organized into layers:
z
I/O control – device drivers and interrupt handlers to transfer
info between main memory and the disk system
The
device driver translates a high level commands ( such
as get block 123) into hardware-specific instructions.
z
Basic file system – implements basic operations (commands).
z
File-organization module – knows about files and their logical
and physical blocks (location).
z
Logical file system – manages metadata, i.e. information
about the files.
Tracks
also the free space
Manages
the directory structure via a file control block
(FCB), which contains info about the file including
ownership, permissions, and location.
Operating System Concepts – 7th Edition, Jan 1, 2005
11.15
Silberschatz, Galvin and Gagne ©2005
Layered File System
Operating System Concepts – 7th Edition, Jan 1, 2005
11.16
Silberschatz, Galvin and Gagne ©2005
8
On-Disk File System Structures
„ File system structures that reside on disk generally include:
z
A boot control block (boot block, boot sector)
contains
information needed to boot the OS. If no OS, this
block is empty
Usually
z
at the beginning of each partition (disk)
A volume control block contains volume (partition)
information such as the number of blocks, size of blocks, freeblock count and free-block pointers.
(UFS:
superblock. NTFS: master file table.)
z
A directory structure per file system is used to organize files.
z
A file control block for each file contains details about the file,
including permissions, ownership, size, and location of data
blocks. (UFS: inode. NTFS: stored within the master file table.)
Operating System Concepts – 7th Edition, Jan 1, 2005
11.17
Silberschatz, Galvin and Gagne ©2005
A Typical File Control Block
Operating System Concepts – 7th Edition, Jan 1, 2005
11.18
Silberschatz, Galvin and Gagne ©2005
9
In-Memory File System Structures
„ Information about files that is kept in memory by the operating
system may include:
z
A mount table with information about each mounted volume.
z
A directory-structure cache.
z
The system-wide open-file table, with a copy of the FCB for
each open file, along with other information.
z
The per-process open-file tables, which contain pointers to
appropriate entries in the system-wide open-file table, and
other information.
„ The following slide illustrates some of the in-memory structures
kept by the operating system.
z
Figure (a) refers to opening a file.
z
Figure (b) refers to reading a file.
Operating System Concepts – 7th Edition, Jan 1, 2005
11.19
Silberschatz, Galvin and Gagne ©2005
In-Memory File System Structures
Operating System Concepts – 7th Edition, Jan 1, 2005
11.20
Silberschatz, Galvin and Gagne ©2005
10
Directory Implementation
„
„
Linear list of file names with pointer to the data blocks.
z
simple to program
z
time-consuming to execute Î requires linear search
To create a new file, the directory must be searched first to make
sure that no existing file has the same name.
Also the same for deleting a file.
Hash Table – linear list stores the directory entries but with hash data
structure. (hash function takes filename as input and returns a pointer to the
filename in the list)
z
decreases directory search time
z
collisions – situations where two file names hash to the same location
z
Another problem is fixed size of hash tables. If the directory size must
increase, a new hash function is needed. (ex: mod 64 to mod 128)
Operating System Concepts – 7th Edition, Jan 1, 2005
11.21
Silberschatz, Galvin and Gagne ©2005
File Implementation
(Allocation Methods)
„ Disk blocks must be allocated to files. The main problem is how to
allocate space to files so that disk space is utilized effectively and
files can be accessed quickly.
z
keeping track of which disk blocks go with which file
„ There are three major methods of allocating disk space to files:
z
Contiguous allocation
z
Linked allocation
z
Indexed allocation
Operating System Concepts – 7th Edition, Jan 1, 2005
11.22
Silberschatz, Galvin and Gagne ©2005
11
Contiguous Allocation
„
„
„
„
„
„
„
„
Each file occupies a set of contiguous blocks on the disk
Provides efficient direct access.
z Accessing block b+1 after block b requires no head movement, so the number
of head seeks is minimal.
Simple addressing scheme: only starting location (block #) and length (number of
blocks) are required
z The directory entry for each file indicates the address of the starting block and
the number of blocks required
Both sequential and direct access can be supported
A dynamic storage-allocation problem.
z First fit and best fit are the most common used strategies.
z External fragmentation can be a problem.
Another problem is knowing how much space is needed for the file (file size) when it
is created
Files cannot grow, so if the allocated space is not enough any more:
z Terminate the user program, with an appropriate error message
z Find larger hole and copy the file into it.
Pre-allocation of space is generally required if the file size is known in advance,
However, the file might reach its final size over long period (may be years) which
can cause significant internal fragmentation.
Operating System Concepts – 7th Edition, Jan 1, 2005
11.23
Silberschatz, Galvin and Gagne ©2005
Contiguous Allocation of Disk Space
Operating System Concepts – 7th Edition, Jan 1, 2005
11.24
Silberschatz, Galvin and Gagne ©2005
12
Linked Allocation
„ Each file is a linked list
of fixed-size disk blocks.
„ Blocks may be scattered
anywhere on the disk.
„ Allocate blocks as
needed as the file
grows, wherever they
are available on the disk
„ The directory contains a
pointer to the first and
last blocks of the file
z
Each block contains
a pointer to the next
block (not accessible
by users)
Operating System Concepts – 7th Edition, Jan 1, 2005
11.25
Silberschatz, Galvin and Gagne ©2005
Linked Allocation (Cont.)
„ Simple – only need to keep the starting address of each file.
„ No external fragmentation. Any free block can be used to satisfy a
request for more space.
„ Creating a file is easy, declared of size zero (null pointer to the first
block) then grow easily as long as free blocks are available
„ No efficient direct access.
z
Effective for sequential access
„ Pointers saved in the blocks consume some space
„ Use of clusters (a set of blocks which the system deals with as a unit)
z
Decreases overhead of pointers (4 blocks = 1 cluster needs 1 ptr)
z
Increases throughput (fewer head seeks).
z
Increases internal fragmentation.
„ Reliability: if a pointer is lost, blocks (clusters) can’t be traversed
Operating System Concepts – 7th Edition, Jan 1, 2005
11.26
Silberschatz, Galvin and Gagne ©2005
13
Linked Allocation with FAT
Variation: File-Allocation Table (FAT)
A section of disk at the beginning is
set a side to store the table.
„ FAT: One entry per block or cluster,
and is indexed by block no. Link
pointers only are kept in the FAT.
„
„
z
taking the pointer word from each
disk block and putting it in the FAT
table in memory
The directory entry contains the block
no. of the first block
„ Special end-of-file is saved in the
last pointer.
„
„
The FAT should be cached in
memory to reduce head seeks.
„
Used by MS-DOS
z
Random access is much easier
Operating System Concepts – 7th Edition, Jan 1, 2005
11.27
Silberschatz, Galvin and Gagne ©2005
Indexed Allocation (i-node)
„ Linked allocation solves the external fragmentation and size-
declaration problem, but it can’t support efficient direct access with
the absence of FAT
„ Indexed allocation brings all pointers together into the index block.
„ Each file has its own index block, which is an array of disk-block
addresses
„ The directory contains the address of the index block, from which
the ith block can be accessed directly
„ Logical view.
5
7
13
9
6
Block 5
Block7
Block 13
Block 9
Block 6
index block
Operating System Concepts – 7th Edition, Jan 1, 2005
11.28
Silberschatz, Galvin and Gagne ©2005
14
Example of Indexed Allocation
Operating System Concepts – 7th Edition, Jan 1, 2005
11.29
Silberschatz, Galvin and Gagne ©2005
Indexed Allocation (cont.)
„ More efficient direct access than linked allocation.
„ No external fragmentation. Any free block can be used to satisfy a
request for more space.
„ Suffers from a wasted space due to the use of index block. (pointer
overhead is larger)
z
How large the index block should be? What if a file is too large
for one index block?
Linked
scheme: use one block. To allow larger files, link
several index blocks
Multilevel
index: use a first-level index block to point to a
second-level index blocks
Combined
scheme: used in UNIX.
Operating System Concepts – 7th Edition, Jan 1, 2005
11.30
Silberschatz, Galvin and Gagne ©2005
15
Indexed Allocation (Cont.) – Multilevel Index
M
outer-index
index table
Operating System Concepts – 7th Edition, Jan 1, 2005
11.31
file
Silberschatz, Galvin and Gagne ©2005
Combined Scheme: UNIX (4K bytes per block)
Operating System Concepts – 7th Edition, Jan 1, 2005
11.32
Silberschatz, Galvin and Gagne ©2005
16
Efficiency and Performance
„
Disks are the main bottleneck in a system performance, so they need to be
improved in terms of efficiency and performance
„
Efficiency depends on:
z
disk allocation and directory algorithms
z
types of data kept in file’s directory entry
„
Clustering reduces head seeks and improve file transfer
Do we need to keep last access, last write dates for example. If yes we
need to update these fields whenever a file is read or written
Pointer size.
Performance: it can be improved even after the algorithms are selected
z
A buffer (disk) cache can be used to keep frequently used file blocks in mem.
z
Page cache: cache file data as pages rather than as system blocks.
z
Synchronous and asynchronous writes
z
free-behind and read-ahead – techniques to optimize sequential access
Memory mapped files can improve performance through page caching
Operating System Concepts – 7th Edition, Jan 1, 2005
11.33
Silberschatz, Galvin and Gagne ©2005
Recovery
„ Care must be taken to ensure that system failure does not result in data
loss or data inconsistency.
„ Consistency checking – compares data in directory structure kept in
memory with data blocks on disk, and tries to fix inconsistencies
z
performing a backup on an active file system might lead to
inconsistency. (If files and directories are being added, deleted, and
modified during the dumping process, the resulting backup may be
inconsistent )
z
Example: fsck on Unix or chkdsk in DOS
„ Use system programs to back up data from disk to another storage
device (floppy disk, magnetic tape, other magnetic disk, optical)
z
Full backup: should the entire file system be backed up or only part
of it?
it
is usually desirable to back up only specific directories and
everything in them rather than the entire file system.
z
Incremental backup: back up files that have not changed since the
last backup
„ Recover lost file or disk by restoring data from backup
Operating System Concepts – 7th Edition, Jan 1, 2005
11.34
Silberschatz, Galvin and Gagne ©2005
17
Log Structured File Systems
„
„
„
„
„
„
„
We have faster CPUs, faster and larger memories, larger but SLOW disks
writes are done in very small chunks Î very inefficient due to high seek time
z Ex: creating a file requires the i-node for the directory, the directory block,
the i-node for the file, and the file itself must all be written
The goal of LFS is to achieve the full bandwidth of the disk, even in the face of a
workload consisting in large part of small random writes
Log structured (or journaling) file systems record each small update to the file
system as a transaction
z Used in NTFS, optional in Solaris UNIX
All operations for a transactions are written to a log (the disk is viewed as a log)
z A transaction is considered committed once all of its operations have been
successfully written to the log. Then the system call can return to the user
z However, the file system may not yet be updated
The transactions in the log are asynchronously written to the file system
z When the file system has been successfully updated, the transactions are
removed from the log.
z After a system crash, all remaining committed transactions in the log are
performed. Uncommitted transactions are rolled back.
Summary: all writes are initially buffered in memory, and periodically all the
buffered writes are written to the disk in a single segment, at the end of the log
Operating System Concepts – 7th Edition, Jan 1, 2005
11.35
Silberschatz, Galvin and Gagne ©2005
RAID Structure
„
RAID – multiple disk drives provides reliability via redundancy.
„
distributes data across several physical disks which look to the operating
system and the user like a single logical disk
„
Idea: Use many disks in parallel to increase storage bandwidth, improve
reliability
„
z
Files are striped across disks
z
Each stripe portion is read/written in parallel
z
Bandwidth increases with more disks, but more error prone
RAID is arranged into six different levels.
z
RAID 0 distributes data across several disks in a way which gives improved
speed and full capacity, but all data on all disks will be lost if any one disk fails.
z
RAID 1 (mirrored disks) uses two (possibly more) disks which each store the
same data, so that data is not lost so long as one disk survives
z
RAID 5 combines three or more disks in a way that protects data against loss of
any one disk
Operating System Concepts – 7th Edition, Jan 1, 2005
11.36
Silberschatz, Galvin and Gagne ©2005
18
RAID (cont)
„ Several improvements in disk-use techniques involve the use of
multiple disks working cooperatively.
„ key concepts in RAID:
z
mirroring, the copying of data to more than one disk;
z
striping, the splitting of data across more than one disk;
Used
for reliability purposes
Used
for performance. More disks in the array means
higher bandwidth, but greater risk of data loss
z
error correction, where redundant data is stored to allow
problems to be detected and possibly fixed.
Operating System Concepts – 7th Edition, Jan 1, 2005
11.37
Silberschatz, Galvin and Gagne ©2005
RAID Levels
„
RAID 0: Striped set without parity:
z
„
Provides improved performance and additional
storage but no fault tolerance. Any disk failure
destroys the array, which becomes more likely
with more disks in the array
RAID1: Mirrored set without parity.
z
Provides fault tolerance from disk errors and
single disk failure
„
RAID2: Redundancy through Hamming code
RAID3: Striped with interleaved parity
„
RAID4: striped with Block level parity.
„
RAID5: Striped with distributed parity
„
z
z
z
z
„
Dedicated disk for parityÎ writing bottleneck
Dedicated disk for parityÎ writing bottleneck
Each disk have some blocks that hold the
corresponding blocks parity
Upon drive failure, any subsequent reads can be
calculated from the distributed parity such that the
drive failure is masked from the end user
RAID6: Striped set with dual parity
z
each group of blocks have two distributed parity
blocks that are distributed across the disks
z
Provides fault tolerance from two drive failures
Operating System Concepts – 7th Edition, Jan 1, 2005
11.38
Silberschatz, Galvin and Gagne ©2005
19
RAID (0 + 1) and (1 + 0)
Operating System Concepts – 7th Edition, Jan 1, 2005
11.39
Silberschatz, Galvin and Gagne ©2005
Fast File System
„ The original Unix file system had a simple, straightforward
implementation
z
Easy to implement and understand
z
But very poor utilization of disk bandwidth (lots of seeking)
„ BSD Unix folks did a redesign (mid 80s) that they called the Fast
File System (FFS)
z
„
Improved disk utilization, decreased response time
Now the FS from which all other Unix FS’s have been compared
„ Good example of being device-aware for performance
Operating System Concepts – 7th Edition, Jan 1, 2005
11.40
Silberschatz, Galvin and Gagne ©2005
20
Data and Inode Placement
„ Original Unix FS had two placement problems:
1.
2.
z
Data blocks allocated randomly in aging file systems
Blocks for the same file allocated sequentially when FS is
new
As FS “ages” and fills, need to allocate into blocks freed up
when other files are deleted
Problem: Deleted files essentially randomly placed
So, blocks for new files become scattered across the disk
Inodes allocated far from blocks
All inodes at beginning of disk, far from data
Traversing file name paths, manipulating files, directories
requires going back and forth from inodes to data blocks
Both of these problems generate many long seeks
Operating System Concepts – 7th Edition, Jan 1, 2005
11.41
Silberschatz, Galvin and Gagne ©2005
Cylinder Groups
„ BSD FFS addressed these problems using the notion of a cylinder
group
z
Disk partitioned into groups of cylinders
z
Data blocks in same file allocated in same cylinder
z
Files in same directory allocated in same cylinder
z
Inodes for files allocated in same cylinder as file data blocks
„ Free space requirement
z
To be able to allocate according to cylinder groups, the disk
must have free space scattered across cylinders
z
10% of the disk is reserved just for this purpose
Operating System Concepts – 7th Edition, Jan 1, 2005
11.42
Silberschatz, Galvin and Gagne ©2005
21
Other Problems
„ Small blocks (1K) caused two problems:
z
Low bandwidth utilization
z
Small max file size (function of block size)
„ Fix using a larger block (4K)
z
Very large files, only need two levels of indirection for 2^32
z
Problem: internal fragmentation
z
Fix: Introduce “fragments” (1K pieces of a block)
„ Problem: Media failures
z
Replicate master block (superblock)
Operating System Concepts – 7th Edition, Jan 1, 2005
11.43
Silberschatz, Galvin and Gagne ©2005
End of Chapter 11
22
Download