Chapter 11 slides

COMPUTER SYSTEMS
An Integrated Approach to Architecture and Operating Systems
Chapter 11
File System
©Copyright 2008 Umakishore Ramachandran and William D. Leahy Jr.
11.1 Attributes
• Attributes associated with a file are referred to as
metadata
• Metadata represents space overhead
• Typical attributes
– Name
– Alias
– Owner
– Creation time
– Last write time
– Access rights
– Privileges: Read, Write, Execute
– Size
11.1 Attributes - Name
• Initially each file had a name and a list of
names was kept in a directory file
• As file system capacity increased a second
level was added
– Top level directory containing names of bottom
level directories
– Bottom level directories containing names of files
11.1 Attributes - Name
• Eventually, a multi-level hierarchy of directories
Users (directory)
    Eche’s music folder (directory)
        Bruce Springsteen (directory): Secret garden, Born to run, I’m on fire (music files)
        Billy Joel (directory): We didn’t start the fire, Piano Man, Uptown girl (music files)
        Tupac Shakur (directory): Changes, California love (music files)
11.1 Attributes - Name
• Implemented as a tree structure
[Figure: directory tree rooted at / containing the nodes users, students, staff, faculty, rama, and foo]
11.1 Attributes - Name
• Filename Extensions
– Sometimes mandatory e.g. DEC TOPS-10
– Sometimes typical but not mandatory e.g.
Windows
– Sometimes optional
• System uses extension to know what
application to launch to appropriately handle
file in normal case
11.1 Attributes - Alias
• Aliases
– May be at actual file level (hard links)
• Linux: ln foo bar
• Creates a new directory entry 'alias' with same status as 'original'
i-node   access rights  hard links  owner  size  creation time  name
3193357  -rw-------     2           rama   80    Jan 23 18:30   bar
3193357  -rw-------     2           rama   80    Jan 23 18:30   foo
– May be at level of names (same as shortcuts)
• Linux: ln -s fox box
• Creates a link entry named box which contains the name fox
i-node   access rights  hard links  owner  size  creation time  name
3193495  lrwxrwxrwx     1           rama   3     Jan 23 18:52   box -> fox
3193357  -rw-------     1           rama   80    Jan 23 18:30   fox
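The two kinds of alias can be observed from a program. The following is a minimal sketch, assuming a file system that supports both hard and symbolic links; the function name link_report and the file names are illustrative, not from the slides. It reports the i-node number and hard-link count of a file, of a hard link to it, and of a symbolic link to it.

```python
import os
import tempfile


def link_report(workdir):
    """Create foo, a hard link bar, and a symbolic link box; return their metadata."""
    foo = os.path.join(workdir, "foo")
    bar = os.path.join(workdir, "bar")
    box = os.path.join(workdir, "box")
    with open(foo, "w") as f:
        f.write("hello")
    os.link(foo, bar)        # like: ln foo bar
    os.symlink("foo", box)   # like: ln -s foo box
    report = {}
    for path in (foo, bar, box):
        st = os.lstat(path)  # lstat describes the link itself, not its target
        report[os.path.basename(path)] = (st.st_ino, st.st_nlink, os.path.islink(path))
    return report


with tempfile.TemporaryDirectory() as d:
    for name, (ino, nlink, is_sym) in link_report(d).items():
        print(name, ino, nlink, "sym-link" if is_sym else "regular")
```

The names foo and bar should show the same i-node number with a link count of 2, while box gets an i-node of its own.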
11.1 Attributes - Links
Hard Links
• Efficient: goes right to the file
• Do not contain other file name(s)
• Hard link to a directory may lead to circular lists
• For this reason, Linux does not allow creating hard links to directories
Soft Links
• Improve usability by indicating the actual file name
• Less efficient: the link must first be read from its own directory entry, and the name it holds must then be resolved through another directory lookup to reach the file
11.1 Attributes - Versions
• Versions
– Today, in typical operating systems, writing to an
existing file overwrites the file
– Some operating systems allow versioning
• Requires purge mechanism
• Very useful
• Sometimes annoying
• Example: test.data;4 (version 4 of test.data)
11.1 Attributes – Access Rights
• Access Rights specify who can access a file and
what they can do to the file
• Ideally one should be able to specify access
rights by user but this requires a lot of
metadata
• As a compromise Linux breaks down rights by
user, group and other
• Typical privileges include: read, write, execute,
change ownership, change privileges
11.1 Attributes
• Visible metadata for a Linux file
rwxrw-r-- 1 rama fac 2364 Apr 18 19:13 foo
– User permissions: rwx
– Group permissions: rw-
– Other permissions: r--
– Hard links: 1
– Owner: rama
– Group: fac
– Size: 2364
– Creation date and time: Apr 18 19:13
– File name: foo
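The permission field can be decoded mechanically. The sketch below uses two helper functions (split_permissions and to_octal) that are illustrative names, not part of any library. It splits the nine permission characters into the user, group, and other fields and converts them to the octal mask that chmod accepts.

```python
import stat


def split_permissions(mode_string):
    """Split a 9-character permission string into user, group, and other fields."""
    if len(mode_string) != 9:
        raise ValueError("expected 9 permission characters")
    return {
        "user": mode_string[0:3],
        "group": mode_string[3:6],
        "other": mode_string[6:9],
    }


def to_octal(mode_string):
    """Convert a permission string such as 'rwxrw-r--' to its numeric mask."""
    value = 0
    for ch in mode_string:
        value = (value << 1) | (0 if ch == "-" else 1)
    return value


perms = "rwxrw-r--"
print(split_permissions(perms))
print(oct(to_octal(perms)))                           # 0o764
print(stat.filemode(stat.S_IFREG | to_octal(perms)))  # -rwxrw-r--
```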
11.1 Attributes
Attribute | Meaning | Elaboration
Name | Name of the file | Attribute set at the time of creation or renaming
Alias | Other names that exist for the same physical file | Attribute gets set when an alias is created; systems such as Unix provide explicit commands for creating aliases for a given file; Unix supports aliasing at two different levels (physical or hard, and symbolic or soft)
Owner | Usually the user who created the file | Attribute gets set at the time of creation of a file; systems such as Unix provide a mechanism for the file’s ownership to be changed by the superuser
Creation time | Time when the file was created first | Attribute gets set at the time a file is created or copied from some other place
Last write time | Time when the file was last written to | Attribute gets set at the time the file is written to or copied; in most file systems the creation time attribute is the same as the last write time attribute; note that moving a file from one location to another preserves the creation time of the file
Privileges (Read, Write, Execute) | The permissions or access rights to the file; specifies who can do what to the file | Attribute gets set to default values at the time of creation of the file; usually, file systems provide commands for the owner of the file to modify the privileges; modern file systems such as NTFS provide an access control list (ACL) to give different levels of access to different users
Size | Total space occupied on the file system | Attribute gets set every time the size changes due to modification to the file
11.1 Attributes
• Windows and some versions of UNIX have an
access control list for each file
• Such a practice allows for more flexibility
• The tradeoff is an increase in the amount of
metadata stored
11.1 Attributes
Unix command | Semantics | Elaboration
touch <name> | Create a file with the name <name> | Creates a zero byte file with the name <name> and a creation time equal to the current wall clock time
mkdir <sub-dir> | Create a sub-directory <sub-dir> | The user must have write privilege to the current working directory (if <sub-dir> is a relative name) to be able to successfully execute this command
rm <name> | Remove (or delete) the file named <name> | Only the owner of the file (and/or the superuser) can delete a file
rmdir <sub-dir> | Remove (or delete) the subdirectory named <sub-dir> | Only the owner of the <sub-dir> (and/or the superuser) can remove the named sub-directory
ln -s <orig> <new> | Create a name <new> and make it symbolically equivalent to the file <orig> | This is name equivalence only; so if the file <orig> is deleted, the storage associated with <orig> is reclaimed, and hence <new> will be a dangling reference to a non-existent file
ln <orig> <new> | Create a name <new> and make it physically equivalent to the file <orig> | Even if the file <orig> is deleted, the physical file remains accessible via the name <new>
chmod <rights> <name> | Change the access rights for the file <name> as specified in the mask <rights> | Only the owner of the file (and/or the superuser) can change the access rights
chown <user> <name> | Change the owner of the file <name> to be <user> | Only the superuser can change the ownership of a file
chgrp <group> <name> | Change the group associated with the file <name> to be <group> | Only the owner of the file (and/or the superuser) can change the group associated with a file
cp <orig> <new> | Create a new file <new> that is a copy of the file <orig> | The copy is created in the same directory if <new> is a file name; if <new> is a directory name, then a copy with the same name <orig> is created in the directory <new>
mv <orig> <new> | Renames the file <orig> with the name <new> | Renaming happens in the same directory if <new> is a file name; if <new> is a directory name, then the file <orig> is moved into the directory <new> preserving its name <orig>
cat/more/less <name> | View the file contents |
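The difference between the two ln forms shows up when the original name is deleted. The sketch below is illustrative only: the function delete_original and the file names orig, hard, and soft are assumptions, and it needs a file system that supports both link types. It removes the original and then checks what each alias still refers to.

```python
import os
import tempfile


def delete_original(workdir):
    """Delete a file that has one hard link and one symbolic link; report what survives."""
    orig = os.path.join(workdir, "orig")
    hard = os.path.join(workdir, "hard")
    soft = os.path.join(workdir, "soft")
    with open(orig, "w") as f:
        f.write("payload")
    os.link(orig, hard)      # ln orig hard
    os.symlink(orig, soft)   # ln -s orig soft
    os.remove(orig)          # rm orig
    with open(hard) as f:
        hard_contents = f.read()
    return {
        "hard_contents": hard_contents,              # data still reachable via the hard link
        "soft_is_link": os.path.islink(soft),        # the link entry itself still exists
        "soft_target_exists": os.path.exists(soft),  # exists() follows the link
    }


with tempfile.TemporaryDirectory() as d:
    print(delete_original(d))
```

The hard link keeps the physical file alive, while the symbolic link remains as a directory entry whose target no longer exists.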
11.2 Design Choices in implementing a
File System on a Disk Subsystem
• Some design constraints
– Four components of latency in doing I/O
operations to and from disk
• Seek time to a specific cylinder
• Rotational latency to get specific sector under
read/write head of disk
• Transfer time from/to disk controller buffer
• DMA transfer from/to controller buffer to/from system
memory
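The four components simply add up for a single request. The sketch below is a rough model under stated assumptions: rotational latency is taken as half a revolution on average, transfer time is the time for the requested sectors to pass under the head, and all parameter values in the example are made up for illustration.

```python
def disk_io_time_ms(seek_ms, rpm, sectors_per_track, sectors, dma_ms=0.0):
    """Estimate the time for one disk request as the sum of the four latency components."""
    rotation_ms = 60_000.0 / rpm                  # time for one full revolution
    rotational_latency_ms = rotation_ms / 2       # average wait: half a revolution
    transfer_ms = rotation_ms * sectors / sectors_per_track
    return seek_ms + rotational_latency_ms + transfer_ms + dma_ms


# Hypothetical drive: 8 ms seek, 6000 RPM, 100 sectors per track, 4-sector request.
print(disk_io_time_ms(seek_ms=8.0, rpm=6000, sectors_per_track=100, sectors=4))
```

With these illustrative numbers, seek and rotational latency dominate the total, which is why the allocation schemes that follow try to keep a file's blocks close together.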
11.2 Design Choices in implementing a
File System on a Disk Subsystem
• Some design constraints
– Files are of arbitrary size
– Files may be accessed sequentially or randomly
– Files need to be allocated initially
– Files need to be able to grow
– Space should be used efficiently
11.2.1 Contiguous Allocation
• At file creation time a
set amount of space
is allocated (may
depend on file type)
• File cannot grow
beyond that size
• Fragmentation a
problem
11.2.1 Contiguous Allocation
• Free list
– Allocation may be
by first or best fit
– Requires periodic
compaction
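First fit and best fit differ only in which free extent they pick. The following is a minimal sketch, assuming the free list is a Python list of (start block, length) pairs; the function names and the sample free list are illustrative.

```python
def first_fit(free_list, size):
    """Return the index of the first free extent large enough, or None."""
    for i, (_start, length) in enumerate(free_list):
        if length >= size:
            return i
    return None


def best_fit(free_list, size):
    """Return the index of the smallest free extent large enough, or None."""
    best = None
    for i, (_start, length) in enumerate(free_list):
        if length >= size and (best is None or length < free_list[best][1]):
            best = i
    return best


def allocate(free_list, size, chooser):
    """Carve `size` blocks out of the extent picked by `chooser`; return the start block."""
    i = chooser(free_list, size)
    if i is None:
        return None  # no single hole is big enough; compaction would be needed
    start, length = free_list[i]
    if length == size:
        del free_list[i]
    else:
        free_list[i] = (start + size, length - size)
    return start


holes = [(0, 10), (20, 4), (40, 6)]
print(allocate(list(holes), 4, first_fit))  # 0
print(allocate(list(holes), 4, best_fit))   # 20
```

First fit splits the large hole at block 0, whereas best fit consumes the exactly matching hole at block 20 and leaves the large hole intact.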
11.2.2 Contiguous Allocation with
Overflow Area
• Modification of previous scheme to allow files
to expand into a designated overflow area
• Random access suffers due to overflow area
• Despite limitations has been used extensively
due to fast file access times
11.2.3 Linked Allocation
[Figure: an in-memory free list pointing to a chain of free blocks, and a directory with entries foo.txt, bar.jpg, and baz, each pointing to a chain of disk blocks linked by next pointers; 0 marks the end of a chain]
• Files not stored
contiguously
• No compaction
required
• Sequential and
random access poor
• Susceptible to errors
11.2.4 File Allocation Table (FAT)
[Figure: the directory maps each file name (/foo, /bar) to a start index in the FAT; each FAT entry holds a free/busy bit and a next index, with -1 marking the end of a file's chain]
11.2.4 File Allocation Table (FAT)
• Divide disk into partitions
• Each partition has a FAT
• The directory just has a pointer into the
starting sector entry in the FAT for each file.
• Less chance for errors than linked allocation
• FAT becomes big so clustering and partitioning
may be necessary leading to other problems
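Finding the blocks of a file means following next indices through the table. The sketch below assumes a FAT modeled as a dictionary from block index to a (busy, next) pair, with -1 as the end marker; the table contents and the directory are made-up values, not those of the slide's figure.

```python
END_OF_FILE = -1


def file_blocks(fat, start):
    """Follow the next indices in the FAT from `start` and list the file's blocks in order."""
    blocks = []
    current = start
    while current != END_OF_FILE:
        busy, nxt = fat[current]
        if not busy:
            raise ValueError(f"block {current} is marked free but is part of a chain")
        blocks.append(current)
        current = nxt
    return blocks


# Hypothetical table: block index -> (busy bit, next index)
fat = {30: (1, 70), 70: (1, 50), 50: (1, END_OF_FILE), 10: (0, 0)}
directory = {"/foo": 30}
print(file_blocks(fat, directory["/foo"]))  # [30, 70, 50]
```

Because the chain lives in the table rather than in the data blocks, a random access walks table entries in memory instead of reading each data block from disk.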
11.2.5 Indexed Allocation
• Essentially breaks up FAT into one data
structure per file
• Allocate an index disk block for each file called
an i-node
• Directory entries now point to the i-node for
that file
• Maintain free list as bit vector
11.2.5 Indexed Allocation
[Figure: the directory maps /foo to the i-node at address 30 and /bar to the i-node at address 50; the i-node for /foo points to data blocks 100 and 201, and the i-node for /bar points to data block 99]
11.2.5 Indexed Allocation
• Problem: a single index block has to hold the pointers
for a file of any size
• Since the i-node is a fixed size, there is a
maximum file size
11.2.6 Multilevel Indexed Allocation
• Make the i-node point to index blocks, which in turn
point to the data blocks (first-level indirection)
• This concept may be extended to two-level
(and beyond) indirection
• Problem: Accessing even a small file requires a
lot of indirection
11.2.6 Multilevel Indexed Allocation
[Figure: the directory entry /foo points to the i-node for /foo at address 30; the i-node points to first-level index blocks 40 and 45, which in turn point to data blocks 100, 201, and 299]
11.2.7 Hybrid Indexed Allocation
• Combine the previous two concepts
– Two direct pointers for small files
– One single indirect
– One double indirect
– One triple indirect
11.2.7 Hybrid Indexed Allocation
[Figure: the directory entry /foo points to the i-node for /foo at address 30, which holds two direct pointers (data blocks 100 and 201), a single indirect pointer (index block 40), a double indirect pointer (index block 45), and a triple indirect pointer]
11.2.7 Hybrid Indexed Allocation
• Given the following:
– Size of index block = 512 bytes
– Size of Data block = 2048 bytes
– Size of pointer = 8 bytes (to index or data blocks)
a) What is the maximum size (in bytes) of a file
that can be stored in this file system?
b) How many data blocks are needed for storing a
data file of 266 KB?
c) How many index blocks are needed for storing a
data file of size 266 KB?
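One way to work the three parts is sketched below. It assumes the i-node layout listed earlier for the hybrid scheme (two direct pointers plus one single, one double, and one triple indirect pointer), takes 1 KB as 1024 bytes, and does not count the i-node itself as an index block; the function names are illustrative.

```python
INDEX_BLOCK = 512   # bytes
DATA_BLOCK = 2048   # bytes
POINTER = 8         # bytes
DIRECT = 2          # direct pointers in the i-node (assumed layout)

PTRS = INDEX_BLOCK // POINTER  # pointers per index block = 64


def ceil_div(a, b):
    return -(-a // b)


def max_file_bytes():
    """(a) Every pointer at every level is in use."""
    blocks = DIRECT + PTRS + PTRS**2 + PTRS**3
    return blocks * DATA_BLOCK


def data_blocks_needed(size_bytes):
    """(b) Number of data blocks for a file of the given size."""
    return ceil_div(size_bytes, DATA_BLOCK)


def index_blocks_needed(size_bytes):
    """(c) Index blocks used, filling direct, single, double, then triple indirect."""
    remaining = data_blocks_needed(size_bytes) - DIRECT
    if remaining <= 0:
        return 0
    count = 1                              # the single indirect block
    remaining -= PTRS
    if remaining <= 0:
        return count
    in_double = min(remaining, PTRS**2)
    count += 1 + ceil_div(in_double, PTRS)  # top-level block + second-level blocks
    remaining -= in_double
    if remaining <= 0:
        return count
    in_triple = min(remaining, PTRS**3)
    third = ceil_div(in_triple, PTRS)
    second = ceil_div(third, PTRS)
    return count + 1 + second + third


size = 266 * 1024
print(max_file_bytes())           # 545394688
print(data_blocks_needed(size))   # 133
print(index_blocks_needed(size))  # 4
```

For the 266 KB file, 2 data blocks are reached directly, 64 through the single indirect block, and the remaining 67 through the double indirect block and two second-level index blocks.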
11.2.8 Comparison of allocation
strategies
Allocation Strategy | File representation | Free list maintenance | Sequential Access | Random Access | File growth | Allocation Overhead | Space Efficiency
Contiguous | Contiguous blocks | complex | Very good | Very good | messy | Medium to high | Internal and external fragmentation
Contiguous With Overflow | Contiguous blocks for small files | complex | Very good for small files | Very good for small files | OK | Medium to high | Internal and external fragmentation
Linked List | Non-contiguous blocks | Bit vector | Good but dependent on seek time | Not good | Very good | Small to medium | Excellent
FAT | Non-contiguous blocks | FAT | Good but dependent on seek time | Good but dependent on seek time | Very good | Small | Excellent
Indexed | Non-contiguous blocks | Bit vector | Good but dependent on seek time | Good but dependent on seek time | limited | Small | Excellent
Multilevel Indexed | Non-contiguous blocks | Bit vector | Good but dependent on seek time | Good but dependent on seek time | Good | Small | Excellent
Hybrid | Non-contiguous blocks | Bit vector | Good but dependent on seek time | Good but dependent on seek time | Good | Small | Excellent
11.3 Putting it all together
• UNIX uses a hybrid allocation approach with
hierarchical naming, i.e., no central directory
• Each part of the file name corresponds to an i-node; together these form a tree-like structure
in which all but the leaf nodes are directory files
(which are themselves represented by i-nodes)
• Each directory entry contains a type which
indicates if it is a directory or a data file
11.3 Putting it all together
Data blocks not shown
11.3 Putting it all together
• Given
– Current directory: /tmp
– I-node for /tmp: 20
– The following Unix commands are executed in the current
directory:
touch foo
ln foo bar
ln -s /tmp/foo baz
ln baz gag
• Note: Type of i-node can be one of directory-file, data-file,
sym-link
• Show the file structure
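One way to reason about the exercise is to model the directory and the i-nodes directly. The toy model below is a sketch, not the book's solution: the classes Inode and ToyFS are hypothetical, the i-node numbers handed out after 20 are arbitrary, and it assumes that a hard link to a symbolic link shares the sym-link's own i-node rather than following it.

```python
class Inode:
    def __init__(self, number, kind, content=None):
        self.number = number
        self.kind = kind          # "directory-file", "data-file", or "sym-link"
        self.content = content    # directory: {name: i-node number}; sym-link: target path
        self.refcount = 0         # number of directory entries naming this i-node


class ToyFS:
    def __init__(self, cwd_inode):
        self.inodes = {cwd_inode: Inode(cwd_inode, "directory-file", {})}
        self.cwd = cwd_inode
        self.next_inode = cwd_inode + 1   # hypothetical numbering scheme

    def _new_inode(self, kind, content=None):
        inode = Inode(self.next_inode, kind, content)
        self.inodes[inode.number] = inode
        self.next_inode += 1
        return inode

    def _add_entry(self, name, inode):
        self.inodes[self.cwd].content[name] = inode.number
        inode.refcount += 1

    def touch(self, name):
        self._add_entry(name, self._new_inode("data-file"))

    def ln(self, orig, new):
        # Hard link: the new name shares the existing i-node (sym-links are not followed).
        number = self.inodes[self.cwd].content[orig]
        self._add_entry(new, self.inodes[number])

    def ln_s(self, target, new):
        # Symbolic link: a new i-node of type sym-link that stores the target name.
        self._add_entry(new, self._new_inode("sym-link", target))


fs = ToyFS(cwd_inode=20)
fs.touch("foo")
fs.ln("foo", "bar")
fs.ln_s("/tmp/foo", "baz")
fs.ln("baz", "gag")
for name, number in fs.inodes[20].content.items():
    inode = fs.inodes[number]
    print(name, number, inode.kind, inode.refcount, inode.content)
```

In this model foo and bar name one data-file i-node with a reference count of 2, and baz and gag name one sym-link i-node, also with a reference count of 2, whose content is the path /tmp/foo.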
11.3.1 i-node
• Unix files have a unique number known as the i-node number
• Each file on a disk is represented by an i-node
structure that occupies an entire disk block
• The i-node number is just the address of the
block for that file
• The file system reserves enough blocks in a
contiguous group to hold the i-nodes
• There is also a bit vector which indicates which i-nodes are in use; a similar bit vector may track free data blocks
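A bit vector of this kind can be sketched in a few lines. The class below is illustrative (the name BitVector and its methods are assumptions): one bit per i-node, set when the i-node is in use, with allocation returning the lowest-numbered free slot.

```python
class BitVector:
    """One bit per i-node (or block): 1 means in use, 0 means free."""

    def __init__(self, size):
        self.size = size
        self.bits = bytearray((size + 7) // 8)

    def is_set(self, i):
        return bool(self.bits[i // 8] & (1 << (i % 8)))

    def set(self, i):
        self.bits[i // 8] |= 1 << (i % 8)

    def clear(self, i):
        self.bits[i // 8] &= ~(1 << (i % 8)) & 0xFF

    def allocate(self):
        """Mark the lowest-numbered free slot as in use and return it, or None if full."""
        for i in range(self.size):
            if not self.is_set(i):
                self.set(i)
                return i
        return None


inodes_in_use = BitVector(16)
first = inodes_in_use.allocate()
second = inodes_in_use.allocate()
inodes_in_use.clear(first)          # the file using the first i-node is deleted
print(first, second, inodes_in_use.allocate())
```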
11.4 Components of the File System
11.4.1 Anatomy of creating and writing
files
• Program makes an I/O call to create a file on hard disk
– API routine for creating a file validates the call by checking
the permissions, access rights, and other related
information for call. After such validation, it calls the name
resolver.
– Name resolver contacts storage allocation module to
allocate an i-node for new file.
– Storage allocation module gets a disk block from free list
and returns it to name resolver. Storage allocation module
will fill in i-node commensurate with allocation scheme.
– Name resolver creates a directory entry and records name
to i-node mapping information for new file in directory.
11.4.1 Anatomy of creating and writing
files
• Program writes to the file just created.
– API routine for file write will validate the request.
– Name resolver passes memory buffer to storage allocation
module along with i-node information for file.
– Storage allocation module allocates data blocks from free list
commensurate with size of file write. It then creates a request
for disk write and hands request to device driver.
– Device driver adds request to its request queue. In concert
with the disk-scheduling algorithm, device driver completes
write of file to disk.
– Upon completion of file write, device driver gets an interrupt
from disk controller that is passed back up to file system, which
in turn communicates with CPU scheduler to continue execution
of your program from point of file write.
11.5 Interaction among the various
subsystems
11.6 Layout of the file system on the
physical media
Partition | Start address {platter, track, sector} | End address {platter, track, sector} | OS
1 | {1, 10, 0} | {1, 600, 0} | Linux
2 | {1, 601, 0} | {1, 2000, 0} | MS Vista
3 | {1, 2001, 0} | {1, 5000, 0} | None
4 | {2, 10, 0} | {2, 2000, 0} | None
5 | {2, 2001, 0} | {2, 3000, 0} | None
11.6.1 In memory data structures
• Typically for performance reasons critical data
structures are kept in memory
• Eventually they will be written back to disk
• Why?
– The file they describe may be short-lived
– Convenience and efficiency
• Some risk exists especially with removable
media
11.7 Dealing with System Crashes
• Systems sometimes crash due to bugs,
deadlocks or even power failure
• File system is critical, thus the OS takes care to keep
the file system healthy
• Upon failure, the system will try to write a crash
image
• Upon boot, the system checks for evidence of a
crash image and checks for file system
consistency (on UNIX, fsck)
11.8 File systems for other physical
media
• File system for a CD-ROM and CD-R
– Files can never be erased in such media, which significantly
reduces the complexity of the file system.
– CD-ROM: No question of a free list
– CD-R : All the free space is at the end of the CD where the
media may be appended with new files.
– CD-RW (rewritable CD) is more complex in that space of deleted
files needs to be added to free space on media.
– DVD: Similar
• Solid State Drives
– Seek time to disk blocks – a primary concern in disk-based file
systems – is less of a concern in SSD-based file systems.
– Allocation strategies can be simpler (e.g., there is no need to ensure that data
blocks of a file map to contiguous blocks on the drive).
11.9 A summary of modern file
systems
• Linux
• Windows
11.9.1 Linux
• File system API still the same as UNIX
• Numerous internal changes to accommodate
things like multiple file system partitions,
longer file names, larger files, and hide
distinction between files present on local
media vs. network.
• Virtual File System (VFS): A system to allow
multiple file systems to be used in a way that
is transparent to the user
11.9.1.1 ext2
• Linux started with Minix file system
• Quickly moved to ext (extended file system)
• This was enhanced and improved and is now
ext2
• The layout of an ext2 file partition is close to
what we have already described
• Some useful things to mention…
11.9.1.2 Journaling File Systems
• Overhead of writing small quantities of data is
high
• System may crash before changes in memory
resident disk data structures have been
committed to physical disk
• For these reasons Linux uses a journaled file
system
11.9.1.2 Journaling File Systems
• Instead of actually performing each operation
on the actual disk a record is kept of each
transaction. These records may be in a simple
sequential data structure
11.9.1.2 Journaling File Systems
• Journal data structure is a finite size (e.g.,
1 MB)
• Once the data structure is full, it is written to disk
and the logs are deleted
• Journaling fixes the small-write problem
• When a system crashes, it may be possible to
write information about journals which will
allow recovery
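The batching idea can be illustrated with a toy model. The sketch below is not how any real journaling file system is implemented: the class Journal is hypothetical, a dictionary stands in for the physical disk, and the journal is committed in one batch when the next record would not fit.

```python
class Journal:
    """Toy journal: buffer small writes in memory, commit them to 'disk' in one batch."""

    def __init__(self, capacity_bytes, disk):
        self.capacity = capacity_bytes
        self.disk = disk      # dict standing in for the physical disk: block -> data
        self.records = []     # sequential log of (block, data) transactions
        self.used = 0
        self.commits = 0

    def write(self, block, data):
        """Record the write; flush the journal first if this record would not fit."""
        if self.used + len(data) > self.capacity:
            self.commit()
        self.records.append((block, data))
        self.used += len(data)

    def commit(self):
        """Apply all logged writes to disk in order, then delete the log."""
        for block, data in self.records:
            self.disk[block] = data
        self.records.clear()
        self.used = 0
        self.commits += 1


disk = {}
journal = Journal(capacity_bytes=8, disk=disk)
journal.write(1, b"aaaa")
journal.write(2, b"bbbb")
print(disk)                # {} : nothing has reached the disk yet
journal.write(3, b"cc")    # journal is full, so the first two writes are committed
print(disk, journal.commits)
```

Many small writes thus turn into a few larger ones, which is the sense in which journaling addresses the small-write problem.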
11.9.2 Microsoft Windows
• FAT-16: 2 GB per partition (still used for small
removable media)
• FAT-32: 2 TB per partition (still used for
interoperability with 95/98)
• NTFS: (64 bit addresses) Fundamental unit of
structuring is the Volume
11.9.2 Microsoft Windows
• View of a File
– UNIX: Stream of bytes
– NTFS: Object composed of typed attributes
• Attributes
– May be created or deleted at will
– Examples
• name
• creation date
• raw image
• thumbnail
• etc.
11.9.2 Microsoft Windows
• NTFS
– File names up to 255 Unicode characters
– / replaced by \
– Aliasing through hard and soft links is a recent
addition
– On the fly compression and decompression
– Optional encryption feature
– Journaling
11.9.2 Microsoft Windows
• Similar to i-node: Master File Table
• MFT contains
– File name
– Timestamp
– Security information
– Data or pointers to disk blocks containing data
– Optional pointer to another MFT
• File system tries to maximize use of contiguous
blocks
11.10 Summary
• Attributes associated with a file
• Allocation strategies and associated data structure for
storage management on disk
• Meta-data managed by file system
• Implementation details of a file system
• Interaction among various subsystems of operating system
• Layout of files on disk
• Data structures of file system and their efficient
management
• Dealing with system crashes
• File system for other physical media
• Examples of modern file systems
Questions?