A Fast File System for Unix
Marshall K. McKusick, William N. Joy,
Samuel J. Leffler and Robert S. Fabry
Computer Systems Research Group, UCB
CS 5204: Operating Systems, Virginia Tech
Presented By:
Parang Saraf
About the Paper
• Considered one of the most fundamental papers in
operating systems
• Has been cited around 930 times
• Describes a new file system
Traditional File System
• File System developed at Bell Laboratories
• A file system is described by its Super-Block
o Number of Data Blocks
o Count of maximum number of files
o Pointer to free list (linked list to all free blocks)
• Disk drive is divided into partitions
o Each disk partition may contain one file system
o A file system never spans multiple partitions
Traditional File System – Inode
• Each file has a descriptor associated with it – Inode.
• Information includes:
o Ownership of the file
o Time stamps marking last modification and access time
o Array of indices pointing to the data blocks
 Direct Blocks – 8
 Indirect Blocks – Singly, Doubly and Triply
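As a rough worked example, the sketch below counts how many data blocks such an inode can address with 8 direct pointers plus one singly, one doubly and one triply indirect block. The 512-byte block size is that of the traditional file system; the 4-byte block pointer size is an assumption made only for illustration.

```python
def max_file_blocks(block_size: int, pointer_size: int, direct: int = 8) -> int:
    """Count the data blocks reachable from a single inode."""
    per_indirect = block_size // pointer_size  # pointers stored in one indirect block
    return direct + per_indirect + per_indirect ** 2 + per_indirect ** 3


# pointer_size=4 is assumed, not taken from the slides
blocks = max_file_blocks(block_size=512, pointer_size=4)
print(blocks, "blocks =", blocks * 512, "bytes addressable")
```

Almost all of the addressable space comes from the triply indirect block, which is why larger blocks reduce the number of indirect blocks a large file needs.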
Traditional File System – Problem
• Inode information segregated from Data
o Long seek time from inode to its data
• Files in a single directory are typically not allocated consecutive
slots for inode information
o Many non-consecutive blocks of inodes are accessed when executing
operations on inodes of several files in a directory
• Sub-optimum allocation of data blocks
o Small Block size – 512 bytes
o Many Seeks – Next sequential block is not on the same cylinder
o Limited read-ahead
Old File System
• Developed at Berkeley
• Increased Throughput
o Changed the basic block size from 512 bytes to 1024 bytes
o Each disk transfer accessed twice as much data
o Fewer indirect blocks were needed
• Increased Reliability
o Staged modifications to critical file system information so that they could
either be completed or repaired cleanly after a crash
Old File System – Problem
• Old file system was still using just 4% of disk bandwidth
• Main problem – Scrambled Free List
o Initially ordered for optimal access
o Scrambled because files were created and removed
o Eventually becomes entirely random – blocks allocated randomly
o On creation provides transfer rates up to 175 KB/s
o Rate deteriorates to 30 KB/s after a few weeks of moderate use
• Possible Solutions – Dump, rebuild and restore / Periodic reorganization (defragmentation)
New File System
• Each disk drive contains one or more file systems
• A File System is described by its super-block, located at
the beginning of the disk partition
• Super-block is replicated to protect against catastrophic
loss
• Block size is any power of two >= 4096 bytes
o Decided at the time of file system creation and can’t be changed
o File Systems can have different block sizes
New File System – Cylinder Groups
• Comprises one or more consecutive cylinders
• Disk partition is divided into one or more cylinder groups
• Has associated book-keeping information:
o A redundant copy of super-block
o Space for inodes
o A bit map describing available blocks – replaces free list
o Summary information describing usage of data blocks
New File System – Cylinder Groups
• Contains a static number of inodes:
o Allocated at file system creation time
o Default policy – one inode for each 2048 bytes of space in the cylinder group
• Book-keeping information begins at varying offset from the
beginning of the cylinder group
o Redundant information spirals down into the cylinder
o Any single track, cylinder or platter can be lost without losing all copies of the
super-block
New File System – Structure
New File System – Key Contributions
• Optimizing storage utilization
• File System Parameterization
• Layout Policies
Optimizing Storage Utilization
• New 4096-byte blocks – each transfer moves four times as much data as with 1024-byte blocks
• Problem with large blocks:
o Wasted space due to small files
Optimizing Storage Utilization
• Solution:
o Divide the 4096-byte block into 2, 4 or 8 fragments to accommodate small files
o Fragment size is specified at the time the file system is created
o Block map records the space available at fragment level
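A minimal sketch of a fragment-level block map, assuming 4 fragments per block. The names `free_map`, `block_is_free` and `find_fragments` are hypothetical and the map contents are made up; the real on-disk layout is not shown in these slides.

```python
from typing import List, Optional

FRAGS_PER_BLOCK = 4  # e.g. a 4096-byte block divided into 1024-byte fragments


def block_is_free(free_map: List[bool], block_no: int) -> bool:
    """A whole block is free only if every one of its fragments is free."""
    start = block_no * FRAGS_PER_BLOCK
    return all(free_map[start:start + FRAGS_PER_BLOCK])


def find_fragments(free_map: List[bool], count: int) -> Optional[int]:
    """Return the index of the first run of `count` free fragments that lies
    within a single block, or None if there is no such run."""
    for start in range(0, len(free_map), FRAGS_PER_BLOCK):
        run = 0
        for i in range(start, min(start + FRAGS_PER_BLOCK, len(free_map))):
            run = run + 1 if free_map[i] else 0
            if run == count:
                return i - count + 1
    return None


# True = free fragment; three blocks of four fragments each (made-up state)
free_map = [False, False, True, True,
            True, True, True, True,
            False, True, False, True]
print(find_fragments(free_map, 2))   # a small file needing two fragments
print(block_is_free(free_map, 1))    # the second block is entirely free
```

Runs are never allowed to cross a block boundary, since the fragments of a small file must sit inside one block.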
Optimizing Storage Utilization
• Free List vs Bitmap
Optimizing Storage Utilization
• Space allocation:
o Space is allocated when a program does a write system call
o Three possible conditions:
 Enough space left in an already allocated block or fragment
 File contains no fragmented blocks – allocate new blocks and fragments
 File contains one or more fragmented blocks but has insufficient space
to hold new data – new block is allocated, old fragments are copied and
new fragments are appended
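The three conditions can be sketched as a small classifier. This is a simplification, not the kernel's allocation code: it assumes a 4096/1024 file system and that the tail of a file always occupies the minimum number of fragments. `plan_write` and its return strings are hypothetical names.

```python
BLOCK = 4096  # block size in bytes (assumed)
FRAG = 1024   # fragment size in bytes (assumed)


def plan_write(file_size: int, new_bytes: int) -> str:
    """Classify a write that appends `new_bytes` to a file of `file_size` bytes."""
    tail = file_size % BLOCK          # bytes stored past the last full block
    frags = -(-tail // FRAG)          # fragments currently holding that tail
    slack = frags * FRAG - tail       # unused bytes inside those fragments
    if new_bytes <= slack:
        return "fits in space already allocated"
    if frags == 0:
        return "no fragments: allocate full blocks, then fragments for the rest"
    return "has fragments: allocate new space, copy old fragments, append new data"


print(plan_write(1500, 500))   # tail fragments still have room
print(plan_write(4096, 100))   # file ends on a block boundary
print(plan_write(1500, 600))   # existing fragments must be copied
```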
Optimizing Storage Utilization
• Free space reserve
o Minimum acceptable percentage of file system blocks that should
be free – 10%
o Once free space drops below the reserve, only the system administrator can allocate blocks
o Important for the layout policies to be effective
o With too little free space, file system throughput tends to be cut in half because of the
inability to localize blocks in a file
Optimizing Storage Utilization
• Wasted space comparison
o Space wasted by the 4096/1024-byte new file system is the same as by the 1024
byte old file system
o New file system uses less space for indexing large files
o Uses same amount of space for small files
o Free space reserve should also be counted as wasted space
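The first point can be checked with a little arithmetic. The sketch below compares data-space waste for a handful of made-up file sizes; it ignores inodes, indirect blocks and the free space reserve, and `allocated` and `wasted` are hypothetical helper names.

```python
def allocated(size: int, block: int, frag: int) -> int:
    """Bytes of disk space used by a file: full blocks plus tail fragments."""
    full, tail = divmod(size, block)
    return full * block + -(-tail // frag) * frag


def wasted(sizes, block: int, frag: int) -> int:
    """Total allocated-but-unused bytes over a set of files."""
    return sum(allocated(s, block, frag) - s for s in sizes)


sizes = [100, 1500, 5000, 70000]  # hypothetical file sizes in bytes
for label, block, frag in [("1024 / no fragments", 1024, 1024),
                           ("4096 / no fragments", 4096, 4096),
                           ("4096 / 1024", 4096, 1024)]:
    print(label, wasted(sizes, block, frag))
```

Because every block is a whole number of fragments, the 4096/1024 layout rounds each file up to the same 1024-byte boundary as the old 1024-byte file system.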
File System Parameterization
• Optimum block allocation based on hardware parameters
o Speed of Processor
o Hardware support for mass storage transfers
o Characteristics of the mass storage devices
• Blocks are allocated on the same cylinder
• Block allocation depends on whether the processor has
an input/output channel or not
File System Parameterization
Accessing which data is faster?
Depends on whether the processor has an I/O channel or not
File System Parameterization
• Rotationally Optimal Blocks
o Processors without I/O channels must field an interrupt and then prepare for a
new disk transfer
o Disk rotates during this time
o Place blocks such that disk rotation is taken into account before the start of a
new disk transfer operation
• Cylinder group summary information includes a count of available
blocks at different rotational positions – 8 positions
• Super-block contains a vector of lists called
Rotational Layout Tables – used by the system when
allocating new blocks
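The rotational delay described on this slide can be turned into a block offset. The sketch below is illustrative only: the rotation speed, blocks per track and interrupt service time are made-up parameters, and `blocks_to_skip` is a hypothetical helper, not part of the file system.

```python
import math


def blocks_to_skip(rpm: float, blocks_per_track: int, service_ms: float) -> int:
    """Number of block positions that pass under the head while the processor
    services an interrupt and sets up the next transfer."""
    ms_per_revolution = 60_000 / rpm
    ms_per_block = ms_per_revolution / blocks_per_track
    return math.ceil(service_ms / ms_per_block)


# Assumed values: 3600 rpm disk, 8 blocks per track, 4 ms to start the next transfer
print(blocks_to_skip(rpm=3600, blocks_per_track=8, service_ms=4.0))
```

With an I/O channel the service time is effectively zero, so consecutive blocks can be laid out back to back.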
Layout Policies
• Layout policies divided into two distinct parts:
o Global Policies
o Local Allocation Routines
• Two allocable resources:
o Inodes
o Data Blocks
Layout Policies
• Global Policies
o Uses file system wide summary information to make decisions
regarding the placement of new inodes and data blocks
o Tries to localize data that is concurrently accessed while spreading
out unrelated data
o Inodes:
 Places all inodes of files in a directory in the same cylinder group
 A new directory is placed in a cylinder group that has a greater than
average number of free inodes and the smallest number of directories
already in it – ensures that files are distributed throughout the disk
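A minimal sketch of the new-directory placement rule, assuming per-group summary counts are available in memory. `GroupSummary` and `pick_group_for_directory` are hypothetical names, and the fallback used when no group is above average is an assumption, not something stated in the slides.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GroupSummary:
    free_inodes: int
    directories: int


def pick_group_for_directory(groups: List[GroupSummary]) -> int:
    """Among groups with more free inodes than average, pick the one that
    holds the fewest directories."""
    average = sum(g.free_inodes for g in groups) / len(groups)
    candidates = [i for i, g in enumerate(groups) if g.free_inodes > average]
    if not candidates:  # assumed fallback: all groups are exactly average
        candidates = list(range(len(groups)))
    return min(candidates, key=lambda i: groups[i].directories)


groups = [GroupSummary(10, 5), GroupSummary(40, 3),
          GroupSummary(50, 7), GroupSummary(45, 2)]
print(pick_group_for_directory(groups))
```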
Layout Policies
• Global Policies
o Data Blocks:
 Tries to place all data blocks for a file in the same cylinder group
 None of the cylinder groups should ever become completely full
 Heuristic Solution – redirect block allocation to a different cylinder group
when a file exceeds 48 KB and at every MB thereafter
 Ensures that cost of one long seek per MB is small
 New cylinder groups are chosen from those cylinder groups that have a
greater than average number of free blocks left
 Finally it calls Local Allocation Routines for block allocation
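The 48 KB / 1 MB heuristic can be sketched as a list of file offsets at which allocation moves to another cylinder group. This assumes "every MB thereafter" is counted from the 48 KB mark, which is one reading of the rule; `switch_points` is a hypothetical helper.

```python
KB = 1024
MB = 1024 * KB


def switch_points(file_size: int) -> list:
    """Offsets within a file at which block allocation is redirected to a
    different cylinder group (assumed: 48 KB, then every 1 MB after that)."""
    points = []
    offset = 48 * KB
    while offset < file_size:   # the file must exceed the offset
        points.append(offset)
        offset += MB
    return points


print(switch_points(3 * MB))   # a 3 MB file changes groups three times
```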
Layout Policies
• Local Allocation Routines
o Allocates a free block as requested by the Global layout policies
o Uses a four level allocation
o First Level – use the next free block that is rotationally closest to the requested
block on the same cylinder
Layout Policies
• Local Allocation Routines
o Second Level – if there are no free blocks on the same cylinder, a free block
in the same cylinder group is selected
Layout Policies
• Local Allocation Routines
o Third Level – if the cylinder group is full, use the quadratic hash function to
hash the cylinder group number to find another cylinder group to look for a
free block
o Fourth Level – if the hash fails, use an exhaustive search on all cylinder
groups
o Quadratic hash is used because of its speed in finding unused slots in nearly full hash tables
o File systems parameterized to maintain 10% free space rarely use this strategy
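A minimal sketch of the fallback order across cylinder groups. The slides do not give the hash function, so textbook quadratic probing is used here as a stand-in; levels one and two (same cylinder, same group) are collapsed into a single check, and `choose_cylinder_group` is a hypothetical name.

```python
from typing import List, Optional


def choose_cylinder_group(preferred: int, free_blocks: List[int]) -> Optional[int]:
    """Pick a cylinder group with free blocks, starting from `preferred`."""
    ncg = len(free_blocks)
    # Levels 1 and 2 collapsed: the preferred group still has a free block.
    if free_blocks[preferred] > 0:
        return preferred
    # Level 3: quadratic probing over the cylinder group numbers.
    for i in range(1, ncg):
        cg = (preferred + i * i) % ncg
        if free_blocks[cg] > 0:
            return cg
    # Level 4: exhaustive search of every other group.
    for offset in range(1, ncg):
        cg = (preferred + offset) % ncg
        if free_blocks[cg] > 0:
            return cg
    return None  # the file system is completely full


print(choose_cylinder_group(0, [0, 0, 0, 0, 5, 0, 0, 0]))  # found by probing
print(choose_cylinder_group(0, [0, 0, 0, 1, 0, 0, 0, 0]))  # found by exhaustive search
```

The second call shows why the exhaustive pass is needed: quadratic probing does not visit every group.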
Performance
• Measured Throughput
Performance
• List Directory command performance
o For large directories containing many directories, disk access for inodes is cut
by a factor of two
o For large directories containing only files, disk access for inodes is cut by a
factor of eight
• Both reads and writes are faster in new file system
o Because larger block sizes are used
o The overhead of allocating blocks is higher, but the cost per byte allocated is the same
o Reading rate is always at least as fast as writing rate
 Writes are slower for 4096-byte blocks than for 8192-byte blocks
 In the old file system, writing was 50% faster than reading
New File System - Limitations
• Limited by memory to memory copy operations required
to move data from disk buffers in the system’s address
space to data buffers in the user’s address space
o Possible remedy – align the buffers in both address spaces so that the copy can be avoided
• One block is allocated to a file at a time
o Possible remedy – pre-allocate several blocks at once and release unused ones when the file is closed
Functional Enhancements
• Long File Name
• File Locking
• Symbolic Links
• Rename
• Quotas
Long File Name
• Maximum length of file name is 255 characters
• Directories are allocated in 512-byte units called chunks
• Chunks are broken into Directory Entries:
o Contain the information necessary to map the name of a file to its inode
o First three fields are fixed length – inode number, size of entry and length of
file name
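A sketch of such a variable-length entry. The field widths (4-byte inode number, 2-byte entry size, 2-byte name length), the little-endian byte order and the NUL padding of the name to a 4-byte boundary are assumptions made for illustration; `pack_entry` and `unpack_entry` are hypothetical helpers.

```python
import struct

# Three fixed-length fields: inode number, size of entry, length of file name.
# The widths and byte order are assumed, not taken from the slides.
HEADER = struct.Struct("<IHH")
MAX_NAME = 255


def pack_entry(inode: int, name: str) -> bytes:
    """Build one directory entry: fixed header followed by the padded name."""
    raw = name.encode("ascii")
    if len(raw) > MAX_NAME:
        raise ValueError("file name longer than 255 characters")
    padded_len = (len(raw) + 1 + 3) & ~3   # NUL terminator, rounded up to 4 bytes
    entry_size = HEADER.size + padded_len
    return HEADER.pack(inode, entry_size, len(raw)) + raw.ljust(padded_len, b"\0")


def unpack_entry(buf: bytes):
    """Return (inode, entry_size, name) for the entry at the start of `buf`."""
    inode, entry_size, name_len = HEADER.unpack_from(buf)
    name = buf[HEADER.size:HEADER.size + name_len].decode("ascii")
    return inode, entry_size, name


entry = pack_entry(42, "README")
print(len(entry), unpack_entry(entry))
```

Storing the entry size alongside the name length lets a reader step from one entry to the next within a chunk without parsing the name.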
File Locking
• Hard Lock – always enforced when a program tries to
access a file
• Advisory shared or exclusive locks – applied only when requested by
programs
• System administrator privilege can override locks
• No deadlock detection is attempted
Symbolic Links
• A symbolic link is implemented as a file that contains a
pathname
• Pathname can be relative or absolute
• On encountering a symbolic link while interpreting a
component of a pathname, the contents of the symbolic
link are prepended to the rest of the pathname
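The prepend step can be sketched with plain string handling. This is a simplified model: `expand_symlink` is a hypothetical helper, and it does not touch a real file system or guard against link loops.

```python
def expand_symlink(link_dir: str, link_target: str, rest: str) -> str:
    """Return the pathname to keep interpreting after reaching a symbolic link.

    link_dir    -- directory containing the link (already resolved)
    link_target -- pathname stored in the link
    rest        -- remaining, not yet interpreted part of the original pathname
    """
    combined = link_target + ("/" + rest if rest else "")
    if link_target.startswith("/"):
        return combined                              # absolute: restart at the root
    return link_dir.rstrip("/") + "/" + combined     # relative: continue from the link's directory


print(expand_symlink("/usr", "local/lib", "libc.a"))
print(expand_symlink("/usr", "/var/tmp", "scratch"))
```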
Rename
• Old file system required three system calls for renaming
• Target file could be left with a temporary name after a crash
• New rename system call added that guarantees the
existence of the target name
• Renaming works on both directories and files
Quotas
• Old file system – any single user can allocate all the
available space in the file system
• Quota restricts the amount of file system resources that a
user can obtain
• Sets limits on both the number of inodes and the number of disk blocks
• Hard and soft limits
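A minimal sketch of how a hard and a soft limit might interact, assuming that exceeding the soft limit only produces a warning while the hard limit is never exceeded. `check_quota` and its return values are hypothetical; the same check would apply to inodes and to disk blocks.

```python
def check_quota(used: int, requested: int, soft: int, hard: int) -> str:
    """Decide whether an allocation of `requested` units may proceed."""
    total = used + requested
    if total > hard:
        return "deny"    # the hard limit may never be exceeded
    if total > soft:
        return "warn"    # allowed, but the user is over the soft limit
    return "allow"


print(check_quota(used=900, requested=50, soft=1000, hard=1200))
print(check_quota(used=900, requested=200, soft=1000, hard=1200))
print(check_quota(used=900, requested=400, soft=1000, hard=1200))
```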
Key Take-Away points
• Substantially higher throughput rates – large block size
• Flexible allocation policies
o Better locality of reference
o Less wastage
• Adapted to wide range of peripheral and processor
characteristics
References
• Presentation on “A Fast File System” by:
o Zhifei Wang : www.cs.pdx.edu/~walpole/class/cs533/spring2006/slides/191.ppt
o pdc-amd01.poly.edu/~wein/cs6243/ppts/fastfile.ppt
o Sean Mondesire and Subramanian Kasi :
www.cs.ucf.edu/courses/cop5611/spring05/item/FFS.ppt
o www.scs.ryerson.ca/~aabhari/File_System.ppt
• http://flylib.com/books/en/3.224.1.79/1/
• http://osr507doc.sco.com/en/HANDBOOK/graphics/harddisk.gif