More on File Systems
Chapter 12: File Management
• provide a file abstraction for data storage
• guarantee, to the extent possible, that data in a file is valid
• performance: throughput and response time
• minimize the potential for lost or destroyed data: reliability
• provide protection
• API: create, delete, read, write files
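The API bullets above can be sketched with the POSIX-style calls exposed by Python's `os` module; the file name and helper below are illustrative, not part of any real kernel interface.

```python
import os
import tempfile

def roundtrip(data: bytes) -> bytes:
    """Exercise create, write, read, delete on one file."""
    path = os.path.join(tempfile.mkdtemp(), "demo.txt")
    fd = os.open(path, os.O_CREAT | os.O_WRONLY)   # create
    os.write(fd, data)                              # write
    os.close(fd)
    fd = os.open(path, os.O_RDONLY)
    out = os.read(fd, len(data))                    # read
    os.close(fd)
    os.unlink(path)                                 # delete
    return out
```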
File Naming
• files must be referable by unique names
• external names: symbolic
• in a hierarchical file system (UNIX) external names are given as
pathnames (path from the root to the file)
• internal names: i-node in UNIX (an index into an array of file
descriptors/headers for a volume)
• directory: translation from external to internal names (more than one
external name for an internal name is allowed)
• information about a file is split between the directory and the file
descriptor (in UNIX all of it is stored in the file descriptor): size,
location on disk, owner, permissions, creation date, last-modified
date, last-access date, link count (in UNIX)
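A minimal in-memory sketch of the split above: the directory only maps external (symbolic) names to internal names (i-node numbers), while all per-file metadata lives in the i-node; field names here are illustrative, not a real kernel structure.

```python
# One i-node holds the metadata; the directory holds only name -> i-node.
inodes = {87: {"size": 4096, "owner": "alice", "perm": "rw-r--r--", "links": 0}}
directory = {}

def link(name, ino):
    """Add an external name; several names may share one internal name."""
    directory[name] = ino
    inodes[ino]["links"] += 1       # UNIX tracks this as the link count

link("file1", 87)
link("file1_alias", 87)             # second external name, same i-node
```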
Protection Mechanisms
• files are OS objects: unique names and a finite set of operations that
processes can perform on them
• protection domain is a set of {object,rights} where right is the
permission to perform one of the operations
• at every instant in time, each process runs in some protection domain
• in Unix, a protection domain is {uid, gid}
• protection domain in Unix is switched when running a program with
SETUID/SETGID set or when the process enters the kernel mode by
issuing a system call
• how to store all the protection domains?
Protection Mechanisms (cont’d)
• Access Control List (ACL): associate with each object a list of all
the protection domains that may access the object and how
• in Unix ACL is reduced to three protection domains: owner, group
and others
• Capability List (C-list): associate with each process a list of objects
that may be accessed along with the operations
• C-list implementation issues: where/how to store them (hardware,
kernel, encrypted in user space) and how to revoke them
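The UNIX reduction of an ACL to three protection domains can be sketched as a check against the usual 9-bit rwxrwxrwx mode; the function and argument names are illustrative.

```python
R, W, X = 4, 2, 1                  # rights: read, write, execute

def allowed(uid, gid, f_uid, f_gid, mode, right):
    """Pick the owner, group, or others bits, then test one right."""
    if uid == f_uid:
        bits = (mode >> 6) & 7     # owner domain
    elif gid == f_gid:
        bits = (mode >> 3) & 7     # group domain
    else:
        bits = mode & 7            # others
    return bool(bits & right)
```

For a file with mode 0o640 owned by uid 1, gid 1: the owner may read and write, a group member may only read, and others may do nothing.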
Secondary Storage Management
• Space must be allocated to files
• Must keep track of the space available for allocation
Preallocation
• Need the maximum size for the file at the time of creation
• Difficult to reliably estimate the maximum potential size of the file
• Tends to overestimate file size so as not to run out of space
Methods of File Allocation
• Contiguous allocation
– Single set of blocks is allocated to a file at
the time of creation
– Only a single entry in the file allocation
table
• Starting block and length of the file
• External fragmentation will occur
Methods of File Allocation
• Chained allocation
– Allocation on basis of individual block
– Each block contains a pointer to the next block in
the chain
– Only single entry in the file allocation table
• Starting block and length of file
• No external fragmentation
• Best for sequential files
• No accommodation of the principle of locality
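Chained allocation as described above can be sketched with a FAT-style table of next-pointers; the block numbers and the `-1` end-of-chain marker are illustrative.

```python
# next_block[i] is the pointer stored in block i; -1 marks end of chain.
next_block = [-1] * 16

def chain(blocks):
    """Link the given blocks; return the single FAT entry (starting block)."""
    for a, b in zip(blocks, blocks[1:]):
        next_block[a] = b
    return blocks[0]

def read_chain(start):
    """Sequential access: follow pointers from the starting block."""
    out = []
    while start != -1:
        out.append(start)
        start = next_block[start]
    return out

# Any free blocks will do -- hence no external fragmentation.
start = chain([3, 9, 4, 12])
```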
Methods of File Allocation
• Indexed allocation
– File allocation table contains a separate one-level index for each file
– The index has one entry for each portion allocated to the file
– The file allocation table contains the block number of the index
File Allocation
• contiguous: a contiguous set of blocks is allocated to a file at the
time of file creation
  – good for sequential files
  – file size must be known at the time of file creation
  – external fragmentation
• chained allocation: each block contains a pointer to the next one in
the chain
  – consolidation to improve locality
• indexed allocation: good both for sequential and direct access (UNIX)
Free Space Management
• bitmap: one bit for each block on the disk
  – good to find a contiguous group of free blocks
  – small enough to be kept in memory
• chained free portions: {pointer to the next one, length}
• index: treats free space as a file
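A minimal sketch of the bitmap scheme: one bit per block (here `1` = free), scanned for a contiguous run of free blocks; the bitmap contents are illustrative.

```python
# Blocks 0..7; 1 means free, 0 means allocated.
bitmap = [1, 0, 1, 1, 1, 0, 1, 1]

def find_contiguous(n):
    """Return the first block of a run of n free blocks, or -1."""
    run = 0
    for i, free in enumerate(bitmap):
        run = run + 1 if free else 0
        if run == n:
            return i - n + 1
    return -1
```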
UNIX File System
• Naming
– External/Internal names, Directories
• Lookup
– File blocks -> Disk blocks
• Protection
• Free Space Management
File Naming
• External names (used by the application)
– Pathname: /usr/users/file1
• Internal names (used by the OS kernel)
– I-node: file number/index on disk
[Figure: file system on disk — the superblock at the start of the disk,
followed by the I-node area (one I-node per file) and the file-block area]
Directories
• Files which store translation tables (external names to internal names)

[Figure: the root directory (always I-node 2) maps "usr" to I-node 23;
directory 23 maps "users" to I-node 41; directory 41 maps "file1" to
I-node 87]

• /usr/users/file1 corresponds to I-node 87
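Pathname resolution over such translation tables can be sketched as a walk from the root directory (i-node 2); the i-node numbers follow the /usr/users/file1 example on this slide, and the lookup function name echoes UNIX's internal `namei`.

```python
# Each directory is a table from component name to i-node number.
dirs = {2: {"usr": 23}, 23: {"users": 41}, 41: {"file1": 87}}
ROOT = 2

def namei(path):
    """Translate an external pathname into an internal name (i-node)."""
    ino = ROOT
    for comp in path.strip("/").split("/"):
        ino = dirs[ino][comp]      # one directory read per component
    return ino
```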
File Content Lookup
• address table used to translate logical file blocks into disk blocks
• address table stored in the I-node

[Figure: the file with I-node 87 has an address table mapping logical
blocks 0, 1, 2 to disk blocks 45, 65, 85]
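The address-table translation amounts to an array lookup indexed by logical block number; the values below follow the I-node 87 example on this slide.

```python
# Address table stored in the I-node of file 87 (logical block -> disk block).
inode_87_addr = [45, 65, 85]

def file_block_to_disk(logical):
    """Translate a logical file block into a disk block."""
    return inode_87_addr[logical]
```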
File Protection
• ACL with three protection domains (file
owner, file owner group, others)
• Access rights: read/write/execute
• Stored in the I-node
Free Space Management
• Free I-nodes
– Marked as free on disk
– An array of 50 free I-nodes stored in the
superblock
• Free file blocks
– Stored as a chained list of arrays of 50 free blocks
– First array stored in the superblock
In-Kernel File System Data Structures

Application:
  fd = open(pathname, mode);  /* fd = index in per-process OFT */
  for (...) read(fd, buf, size);
  close(fd);

[Figure: OS kernel data structures — each PCB points to a per-process
Open File Table; its entries point into the per-OS Open File Table
(offset in file, ptr to I-node), which refers to the I-node cache; data
passes through the buffer cache to the file system on disk]
File System Consistency
• a file system uses the buffer cache for performance reasons
• two copies of a disk block (buffer cache, disk) -> consistency
problem if the system crashes before all the modified blocks are
written back to disk
• the problem is critical especially for the blocks that contain control
information (meta-data): directory blocks, i-node, free-list
• Solution:
– write-through for meta-data blocks (expensive), or the order of write-back becomes important
– ordinary file data blocks written back periodically (sync)
– utility programs for checking block and directory consistency
after crash
More on File System Consistency
• Example 1: create a new file
  – Two updates: (1) allocate a free I-node; (2) create an entry in the directory
  – (1) and (2) must be write-through (expensive), or (1) must be written back
before (2)
  – If (2) is written back first and a crash occurs before (1) is written back,
the directory structure is inconsistent and cannot be recovered
• Example 2: write a new block to a file
  – Two updates: (1) allocate a free block; (2) update the address table of the
I-node
  – (1) and (2) must be write-through, or (1) must be written back before (2)
  – If (2) is written back first and a crash occurs before (1) is written back,
the I-node structure is inconsistent and cannot be recovered
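Why the order matters in Example 1 can be sketched by simulating a crash between the two meta-data writes; the disk model and labels are illustrative.

```python
def crash_between(first, second):
    """Apply only the first write, then 'crash' before the second."""
    disk = {"inode_allocated": False, "dir_entry": False}
    disk[first] = True             # only this write reaches the disk
    # -- crash: the write named by `second` is lost --
    if disk["dir_entry"] and not disk["inode_allocated"]:
        return "inconsistent"      # entry points at an unallocated I-node
    return "recoverable"           # at worst a leaked I-node, fixable by fsck
```

Writing the I-node first loses at most one free I-node; writing the directory entry first can leave a dangling entry that cannot be repaired.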
Log-Structured File System (LFS)
• as memory gets larger, buffer cache size increases -> increase the
fraction of read requests which can be satisfied from the buffer
cache with no disk access
• conclusion: in the future most disk accesses will be writes
• but writes are usually done in small chunks in most file systems
(meta-data, for instance), which makes the file system highly
inefficient
• LFS idea (Berkeley): to structure the entire disk as a log
• periodically, or when required, all the pending writes (data and
metadata together) being buffered in memory are collected and
written as a single contiguous segment at the end of the log
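The LFS write path above can be sketched as buffering pending writes and flushing them as one contiguous segment appended to the log; the block labels and structures are illustrative.

```python
log = []            # the disk, structured as a log of segments
pending = []        # writes (data and meta-data) buffered in memory

def lfs_write(block, data):
    """Buffer a write instead of issuing a small disk I/O."""
    pending.append((block, data))

def flush_segment():
    """One large contiguous write at the end of the log."""
    log.append(list(pending))
    pending.clear()

lfs_write("inode:87", "...")        # meta-data and data mixed together
lfs_write("data:87/0", "...")
flush_segment()
```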
LFS segment
• contains i-nodes, directory blocks and data blocks, all mixed together
• each segment starts with a segment summary
• segment size: 512 KB - 1 MB
• two key issues:
  – how to retrieve information from the log
  – how to manage the free space on disk
File location in LFS
• the i-node contains the disk addresses of the file blocks, as in the
standard UNIX file system
• but there is no fixed location for the i-node
• an i-node map is used to maintain the current location of each i-node
• i-node map blocks can also be scattered but a fixed checkpoint
region on the disk identifies the location of all the i-node map blocks
• usually i-node map blocks are cached in main memory most of the
time, thus disk accesses for them are rare
Segment cleaning in LFS
• the LFS disk is divided into segments which are written sequentially
• live data must be copied out of a segment before the segment can be
re-written
• the process of copying data out of a segment: cleaning
• a separate cleaner thread moves along the log, removes old segments
from the end and puts live data into memory for rewriting in the next
segment
• as a result, an LFS disk appears like a big circular buffer, with the
writer thread adding new segments to the front and the cleaner thread
removing old segments from the end
• book-keeping is not trivial: the i-node must be updated when blocks
are moved to the current segment
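The cleaner's circular-buffer behavior can be sketched as follows; which blocks count as "live" is illustrative (in a real LFS it is decided from the segment summary and i-node map).

```python
def clean_oldest(log, live):
    """Copy live blocks out of the oldest segment into a new one."""
    oldest = log.pop(0)                        # remove from the log's tail
    survivors = [b for b in oldest if b in live]
    if survivors:
        log.append(survivors)                  # rewritten in the next segment
    return log

# Segment ["a", "b"] is oldest; only "b" and "c" are still live.
log = [["a", "b"], ["c"]]
log = clean_oldest(log, live={"b", "c"})
```

The dead block "a" is reclaimed for free space, while the live block "b" migrates to the front of the log, which is why its i-node must be updated.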