CENG334
Introduction to Operating Systems
Disks and Filesystems
Topics:
Disks
Erol Sahin
Dept of Computer Eng.
Middle East Technical University
Ankara, TURKEY
URL: http://kovan.ceng.metu.edu.tr/ceng334
© 2006 Matt Welsh – Harvard University
Today: Disks and Filesystems
Physical operation of modern disk drives
Operating system access to raw disk and disk I/O scheduling
Overview of filesystem design
Next lecture: Detailed look at filesystem implementation
Adapted from Matt Welsh’s (Harvard University) slides.
A Disk Primer
Disks consist of one or more platters divided into tracks



Each platter may have one or two heads that perform read/write operations
Each track consists of multiple sectors
The set of tracks at the same radius across all platters is a cylinder
[Figure: disk anatomy, labeling the platter, aperture, sectors, heads, and tracks]
Hard Disk Evolution
IBM 305 RAMAC (1956)


First commercially produced hard drive
5 Mbyte capacity, 50 platters each 24” in diameter!
Disk access time-1
Command overhead:
 Time to issue I/O, get the HDD to start responding, select appropriate head
Seek time:
 Time to move disk arm to the appropriate track
 Depends on how fast you can physically move the disk arm
 These times are not improving rapidly
Settle time:
 Time for head position to stabilize on the selected track
Disk access time-2
Rotational latency:
 Time for the appropriate sector to rotate under the disk head
 Depends on the rotation speed of the disk (e.g., 7200 RPM)
Transfer time
 Time to transfer a sector to/from the disk controller
 Depends on density of bits on disk and RPM of disk rotation
 Faster for tracks near the outer edge of the disk – why?
 Modern drives have more sectors on the outer tracks!
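As a rough illustration of how the access-time components listed above combine, the sketch below adds them up for a single one-sector request. The function names and the sample overhead, seek, settle, and sectors-per-track values are illustrative assumptions, not datasheet figures; only the 7200 RPM spindle speed comes from the slides.

```python
def rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: on average the sector is half a rotation away."""
    ms_per_rotation = 60_000.0 / rpm
    return ms_per_rotation / 2.0


def transfer_time_ms(rpm: float, sectors_per_track: int, sectors: int = 1) -> float:
    """Time for `sectors` consecutive sectors to pass under the head."""
    ms_per_rotation = 60_000.0 / rpm
    return ms_per_rotation * sectors / sectors_per_track


def access_time_ms(overhead_ms: float, seek_ms: float, settle_ms: float,
                   rpm: float, sectors_per_track: int) -> float:
    """Sum of the five components for a one-sector request."""
    return (overhead_ms + seek_ms + settle_ms
            + rotational_latency_ms(rpm)
            + transfer_time_ms(rpm, sectors_per_track))


if __name__ == "__main__":
    print(f"avg rotational latency at 7200 RPM: {rotational_latency_ms(7200):.2f} ms")
    # Hypothetical values: 0.5 ms overhead, 8 ms seek, 0.1 ms settle, 500 sectors/track.
    print(f"one-sector access: {access_time_ms(0.5, 8.0, 0.1, 7200, 500):.2f} ms")
```

At a fixed RPM, a track holding more sectors gives a shorter per-sector transfer time, which is why outer tracks transfer faster. Seek and rotational latency dominate the total for small requests.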
Example disk characteristics
Seagate Barracuda 7200.10 320GB Hard Drive
Capacity (GB): 320
Interface: Serial ATA-300
Spindle Speed (RPM): 7200
Buffer Memory: 16MB
Average Latency (msec): 4.16
Maximum External Transfer Rate (Mbytes/sec): 300
Data Transfer Rate on Serial ATA: Up to 3000 Mbits/sec
Logical Cylinders/Heads/Sectors per Track: 16,383/16/63
Bytes Per Sector: 512
Form Factor: 3.5”
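A quick calculation shows that the logical cylinders/heads/sectors geometry quoted above cannot describe the full 320 GB. The sketch below multiplies out the listed values. The reading that this geometry is a legacy addressing convention, and that such drives are addressed by logical block number instead, is an assumption based on common practice and is not stated in the slides.

```python
# Logical geometry as listed for the example drive.
CYLINDERS = 16_383
HEADS = 16
SECTORS_PER_TRACK = 63
BYTES_PER_SECTOR = 512


def chs_capacity_bytes(cylinders: int, heads: int,
                       sectors_per_track: int, bytes_per_sector: int) -> int:
    """Capacity addressable through a cylinder/head/sector geometry."""
    return cylinders * heads * sectors_per_track * bytes_per_sector


logical_bytes = chs_capacity_bytes(CYLINDERS, HEADS, SECTORS_PER_TRACK, BYTES_PER_SECTOR)

if __name__ == "__main__":
    print(f"CHS-addressable capacity: {logical_bytes / 1e9:.2f} GB")
    print(f"fraction of 320 GB: {logical_bytes / 320e9:.1%}")
```

The product is roughly 8.5 GB, far below the advertised capacity, so the physical layout of the drive must differ from the reported geometry.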
Disk interface speeds




SCSI: From 5 MB/sec to 320 MB/sec
ATA: from 33 MB/sec to 100 MB/sec
Serial ATA (single wire): Starting at 150 MB/sec
Firewire: 50 MB/sec
Disk I/O Scheduling
Given multiple outstanding I/O requests, what order to issue them?
Why does it matter?
Major goals of disk scheduling:
1) Minimize latency for small transfers

Primarily: Avoid long seeks by ordering accesses according to disk head locality
2) Maximize throughput for large transfers

Large databases and scientific workloads often involve enormous files and datasets
Note that disk block layout also has a large impact on performance


Where we place file blocks, directories, file system metadata, etc.
This will be covered in future lectures
Disk I/O Scheduling
Given multiple outstanding I/O requests, what order to issue them?
FIFO: Just schedule each I/O in the order it arrives

What's wrong with this? Potentially lots of seek time!
SSTF: Shortest seek time first


Issue I/O with the nearest cylinder to the current one
Favors middle tracks: Head rarely moves to edges of disk, so requests near the edges can wait a long time
SCAN (or Elevator) Algorithm:



Head has a current direction and current cylinder
Sort I/Os according to the track # in the current direction of the head
If no more I/Os in the current direction, reverse direction
CSCAN Algorithm:


Always move in one direction, “wrap around” to beginning of disk when moving off the end
Idea: Reduce variance in seek times, avoid discriminating against the highest and lowest tracks
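The four policies above can be sketched as functions that take the pending cylinder numbers and the current head position and return a service order. This is a minimal sketch, not a real I/O scheduler: the function names and the request queue are hypothetical, requests are assumed not to arrive while the queue is being served, and `scan` reverses at the last pending request, as the slide describes, rather than travelling to the physical edge of the disk.

```python
from typing import List


def fifo(requests: List[int], head: int) -> List[int]:
    """Serve requests in arrival order, ignoring the head position."""
    return list(requests)


def sstf(requests: List[int], head: int) -> List[int]:
    """Repeatedly pick the pending cylinder closest to the current head."""
    pending = list(requests)
    order = []
    while pending:
        nearest = min(pending, key=lambda cyl: abs(cyl - head))
        pending.remove(nearest)
        order.append(nearest)
        head = nearest
    return order


def scan(requests: List[int], head: int, moving_up: bool = True) -> List[int]:
    """Elevator: serve everything in the current direction, then reverse."""
    above = sorted(cyl for cyl in requests if cyl >= head)
    below = sorted((cyl for cyl in requests if cyl < head), reverse=True)
    return above + below if moving_up else below + above


def cscan(requests: List[int], head: int) -> List[int]:
    """Always sweep upward; wrap around to the lowest pending cylinder."""
    above = sorted(cyl for cyl in requests if cyl >= head)
    wrapped = sorted(cyl for cyl in requests if cyl < head)
    return above + wrapped


if __name__ == "__main__":
    queue = [98, 183, 37, 122, 14, 124, 65, 67]  # hypothetical request queue
    for policy in (fifo, sstf, scan, cscan):
        print(f"{policy.__name__:>5}: {policy(queue, 53)}")
```

Comparing the printed orders shows the trade-off: SSTF keeps the head near its current position, while SCAN and CSCAN sweep across the whole queue so that no cylinder is skipped indefinitely.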
SCAN example
[Figure: pending requests along the track axis, with the current track and head direction marked]
What is the overhead of the SCAN algorithm?
Count the total seek distance needed to service all I/O requests
In this case, 12 tracks in --> direction
15 tracks for long seek back
5 tracks in <-- direction
 Total: 12+15+5 = 32 tracks
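The overhead count above is just the sum of head movements between consecutive requests. The sketch below reproduces the 12 + 15 + 5 = 32 arithmetic. The starting track and request positions are hypothetical values chosen to give the same three distances; they are not read from the original figure.

```python
from typing import List


def total_head_movement(start_track: int, service_order: List[int]) -> int:
    """Total number of tracks the head crosses when serving requests in order."""
    total = 0
    position = start_track
    for track in service_order:
        total += abs(track - position)
        position = track
    return total


if __name__ == "__main__":
    # Hypothetical queue: head at track 10 moving upward, SCAN service order.
    order = [12, 15, 22, 7, 5, 2]
    print(total_head_movement(10, order))
```

Here the head moves 12 tracks upward (10 to 22), 15 tracks back (22 to 7), and 5 more tracks downward (7 to 2).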
ATA and IDE Interfaces
IDE stands for Integrated (or “Intelligent”) Drive Electronics



Same as “ATA” (Advanced Technology Attachment)
Standard interface to hard drives that integrate a drive controller in the drive itself
1 or 2 drives on a chain
Enhanced IDE (EIDE) and ATA-2

Faster version of ATA/IDE that supports Direct Memory Access (DMA) transfers
Ultra ATA: Speed enhancements to ATA standard

Versions running at 33, 66, and 100 Mbytes/sec
Serial ATA: Emerging standard using a serial (not parallel) interface


Speeds starting at 150 Mbyte/sec
Can drive longer cables at much higher clock speeds than parallel cable
[Figure: rounded parallel ATA, Serial ATA, and parallel ATA cables]
SCSI Interface
Standard hardware interface to wide range of I/O devices


Disks, CDs, DVDs, tapes, etc.
Bus-based design: single shared set of I/O lines that all devices connect to
Access model using logical blocks on disk

On-disk controller maps logical block # to sector/track/head combination
SCSI-1: 8-bit bus, 5 MHz = 5 Mbytes/sec max speed
Supported up to 8 devices on a single bus
Lots of problems with termination: required physical connector on end of cable to avoid signal reflection!
SCSI-2: The next generation
Fast SCSI: 10 MHz clock speed
Wide SCSI: 16 bit bus width
Fast wide SCSI: 10 MHz + 16 bit bus = 20 MB/sec throughput
SCSI-3: Ramping up on speed and bus width
Highest speed now is “Ultra320 SCSI”: 160 MHz x 16 bits = 320 MB/sec max speed
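The peak speeds quoted for each SCSI generation follow from one formula: transfers per second times bus width in bytes. The sketch below applies it to the figures in the slide. The function name is hypothetical, and the result is a theoretical bus maximum, not a sustained disk throughput.

```python
def bus_throughput_mb_per_s(clock_mhz: float, width_bits: int) -> float:
    """Peak parallel-bus throughput: transfers per second times bytes per transfer."""
    return clock_mhz * width_bits / 8


if __name__ == "__main__":
    for name, clock_mhz, width_bits in [
        ("SCSI-1", 5, 8),
        ("Fast wide SCSI", 10, 16),
        ("Ultra320 SCSI", 160, 16),
    ]:
        print(f"{name}: {bus_throughput_mb_per_s(clock_mhz, width_bits):.0f} MB/sec")
```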
Relative Interconnect Speeds
(from macspeedzone.com)
Filesystems
A filesystem provides applications with high-level access to disk
As well as CD, DVD, tape, floppy, etc...
Masks the details of low-level sector-based I/O operations
Provides structured access to data (files and directories)
Caches recently-accessed data in memory




Hierarchical filesystems

Organized as a tree of directories and files
Byte-oriented vs. record-oriented files


UNIX, Windows, etc. all provide byte-oriented file access
 May read and write files a byte at a time
Many older OS's provided only record-oriented files
 File composed of a set of records; may only read and write a record at a time
Versioning filesystems


Keep track of older versions of files
e.g., VMS filesystem: Could refer to specific file versions: foo.txt;1, foo.txt;2
Filesystem Operations
Filesystems provide a standard interface to files and directories:






Create a file or directory
Delete a file or directory
Open a file or directory – allows subsequent access
Read, write, append to file contents
Add or remove directory entries
Close a file or directory – terminates access
What other features do filesystems provide?






Accounting and quotas – prevent your classmates from hogging the disks
Backup – some filesystems have a “$HOME/.backup” containing automatic snapshots
Indexing and search capabilities
File versioning
Encryption
Automatic compression of infrequently-used files
Should this functionality be part of the filesystem or built on top?

Classic OS community debate: Where is the best place to put functionality?
Filesystem Block Layout
Filesystem defines a block size (typically 4KB) for all I/O operations

Must be at least 1 sector
 Why ever use more than one sector?
Each file on disk has an associated inode



On-disk data structure defining location, access rights, etc. for a file
A directory is just a list of (file name, inode number) entries, one for each file in the directory
The root directory (“/”) has a fixed, well-known inode number (for example, inode 2 in ext2), and is stored in a well-known location
 All other files and directories are accessed through the root directory
Simplest allocation policy: contiguous files




FS maintains a free-block map (e.g., a bitmap), stored on disk and cached in memory
Must allocate all blocks of a file in a single contiguous chunk
Growing a file may require relocating the whole thing to a new area of disk
Simple design but clearly not very efficient!
 Lots of external fragmentation!
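The contiguous policy and its fragmentation problem can be seen in a few lines. This is a minimal sketch assuming a first-fit search over an in-memory bitmap; the class and method names are hypothetical, and a real filesystem would also persist the bitmap to disk.

```python
from typing import List, Optional


class ContiguousAllocator:
    """First-fit allocator over a free-block bitmap (True means free)."""

    def __init__(self, num_blocks: int) -> None:
        self.free: List[bool] = [True] * num_blocks

    def allocate(self, length: int) -> Optional[int]:
        """Return the first block of a free run of `length` blocks, or None."""
        run_start, run_length = 0, 0
        for block, is_free in enumerate(self.free):
            if not is_free:
                run_length = 0
                continue
            if run_length == 0:
                run_start = block
            run_length += 1
            if run_length == length:
                for used in range(run_start, run_start + length):
                    self.free[used] = False
                return run_start
        return None

    def release(self, start: int, length: int) -> None:
        """Mark a previously allocated run as free again."""
        for block in range(start, start + length):
            self.free[block] = True


if __name__ == "__main__":
    disk = ContiguousAllocator(10)
    a = disk.allocate(3)   # blocks 0-2
    b = disk.allocate(3)   # blocks 3-5
    c = disk.allocate(4)   # blocks 6-9
    disk.release(a, 3)
    disk.release(c, 4)
    # Seven blocks are free in total, but no run of five is contiguous.
    print(sum(disk.free), disk.allocate(5))
```

The final call fails even though enough blocks are free in total, which is the external fragmentation the slide warns about.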
Block Layout cont'd
Somewhat better policy: Linked blocks


Inode points to the first block of the file
Each block points to the next block in the file (just a linked list on disk)
 What are the advantages and disadvantages??
inode
Indexed files


Inode contains a list of block numbers containing the file
Array is allocated when the file is created
 What are the advantages and disadvantages??
inode
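One way to approach the advantages-and-disadvantages question is to count how many pointers must be followed to reach block n of a file under each scheme. The sketch below models the on-disk pointers with plain Python containers. The block numbers and function names are hypothetical, and the inode is assumed to be already in memory.

```python
from typing import Dict, List, Optional, Tuple


def nth_block_linked(first_block: int,
                     next_pointer: Dict[int, Optional[int]],
                     n: int) -> Tuple[Optional[int], int]:
    """Linked blocks: walk the chain, reading one block per hop to find its pointer."""
    block: Optional[int] = first_block
    hops = 0
    for _ in range(n):
        block = next_pointer[block]
        hops += 1
    return block, hops


def nth_block_indexed(index: List[int], n: int) -> Tuple[int, int]:
    """Indexed file: the inode's block list gives the answer with no chain walk."""
    return index[n], 0


if __name__ == "__main__":
    # Hypothetical file occupying disk blocks 7 -> 2 -> 9.
    chain = {7: 2, 2: 9, 9: None}
    print(nth_block_linked(7, chain, 2))
    print(nth_block_indexed([7, 2, 9], 2))
```

Linked blocks make random access cost grow with the offset, while an index gives constant-time lookup at the price of a block list whose size must be planned when the file is created.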
Multilevel Indexed Files
Inode contains a list of 10-15 direct blocks

First few blocks of file
Also contains pointers to single-indirect, double-indirect, and triple-indirect blocks

Allows file to grow to be incredibly large!!!
direct blocks
inode
single-indirect blocks
double-indirect blocks
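To make "incredibly large" concrete, the sketch below computes the largest file a multilevel index can address. The parameters are assumptions chosen for illustration: 4 KB blocks (the typical size mentioned earlier), 4-byte block numbers, and 12 direct pointers, which falls within the 10-15 range above.

```python
def max_file_blocks(direct_pointers: int, block_size: int, pointer_size: int) -> int:
    """Blocks reachable through direct, single-, double-, and triple-indirect pointers."""
    pointers_per_block = block_size // pointer_size
    return (direct_pointers
            + pointers_per_block            # single indirect
            + pointers_per_block ** 2       # double indirect
            + pointers_per_block ** 3)      # triple indirect


BLOCK_SIZE = 4096     # assumed block size in bytes
POINTER_SIZE = 4      # assumed size of one block number in bytes
DIRECT_POINTERS = 12  # assumed number of direct pointers in the inode

max_bytes = max_file_blocks(DIRECT_POINTERS, BLOCK_SIZE, POINTER_SIZE) * BLOCK_SIZE

if __name__ == "__main__":
    print(f"maximum file size: {max_bytes / 2**40:.2f} TiB")
```

Under these assumptions the limit is roughly 4 TiB, almost all of it contributed by the triple-indirect level, while small files need only the direct pointers.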
Disks are messy and slow
Low-level interface for reading and writing sectors




Generally allow OS to read/write an entire sector at a time
No notion of “files” or “directories” -- just raw sectors
So, what do you do if you need to write a single byte to a file?
Disk may have numerous bad blocks – OS may need to mask this from filesystem
Access times are still very slow



Disk seek times are around 10 ms
 Although raw throughput has increased dramatically
Compare to tens of nanoseconds to access main memory
Requires careful scheduling of I/O requests
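The single-byte write question above has a standard answer: read the whole sector, change one byte in memory, and write the whole sector back. The sketch below simulates this on an in-memory disk. `RawDisk` and `write_byte` are hypothetical names, and the 512-byte sector size matches the example drive.

```python
SECTOR_SIZE = 512


class RawDisk:
    """In-memory stand-in for a device that only transfers whole sectors."""

    def __init__(self, num_sectors: int) -> None:
        self.sectors = [bytes(SECTOR_SIZE) for _ in range(num_sectors)]

    def read_sector(self, number: int) -> bytes:
        return self.sectors[number]

    def write_sector(self, number: int, data: bytes) -> None:
        if len(data) != SECTOR_SIZE:
            raise ValueError("the device only accepts whole sectors")
        self.sectors[number] = bytes(data)


def write_byte(disk: RawDisk, offset: int, value: int) -> None:
    """Read-modify-write: fetch the sector, patch one byte, write it back."""
    sector_number, within_sector = divmod(offset, SECTOR_SIZE)
    buffer = bytearray(disk.read_sector(sector_number))
    buffer[within_sector] = value
    disk.write_sector(sector_number, bytes(buffer))


if __name__ == "__main__":
    disk = RawDisk(4)
    write_byte(disk, 700, 0x41)          # byte 700 lives in sector 1, offset 188
    print(disk.read_sector(1)[188])
```

Caching recently used sectors in memory lets the filesystem skip the read half of this cycle when the sector is already present.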
Filesystem Interface
File Concept
A file is a logical storage unit, an abstraction provided by the operating system.
Files are mapped by the operating system onto physical devices (disks, tapes, CDs, etc.)
From the user's point of view, a file is the only unit through which data can be written onto storage devices.
The information in a file, as well as the attributes of the file, is determined by its creator. A file may hold:
Data (numeric, character, or binary)
Program
When a file is created, it becomes independent of the process, the user, and even the system that created it.
File Attributes
Name – only information kept in human-readable form
Identifier – unique tag (number) identifies file within file system
Type – needed for systems that support different types
Location – pointer to file location on device
Size – current file size
Protection – controls who can do reading, writing, executing
Time, date, and user identification – data for protection, security, and usage monitoring
Information about files is kept in the directory structure, which is maintained on the disk
File Operations
A file is an abstract data type, and the OS provides a minimal set of operations on it.
Create
 Allocate space and then make an entry in the directory
Write
 Requires the name of the file and the information to be written
 Search the directory to find the file’s location
 Keep a write pointer to the location in the file
 Update the pointer after each write
Read
 Requires the name of the file and where in memory the data read should be placed
 Search the directory to find the file’s location
 Keep a read pointer to the location in the file
 Update the pointer after each read
Reposition within file
 Change the value of the file-position pointer
Delete
 Deallocate the space and remove the directory entry
Truncate
 Reset the file length to zero and deallocate its space, keeping its attributes
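The operations above map closely onto the POSIX-style calls exposed by Python's `os` module. The sketch below runs each of them once against a temporary file. The file name and contents are arbitrary, and the function name is hypothetical.

```python
import os
import tempfile


def demo_file_operations():
    """Create, write, reposition, read, truncate, and delete one file."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "assignment3.c")

    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)  # create and open
    os.write(fd, b"hello, disk")                       # write advances the file pointer
    os.lseek(fd, 7, os.SEEK_SET)                       # reposition within the file
    tail = os.read(fd, 4)                              # read advances the pointer too
    os.ftruncate(fd, 0)                                # truncate: length reset to zero
    size_after_truncate = os.fstat(fd).st_size
    os.close(fd)

    os.remove(path)                                    # delete the directory entry
    os.rmdir(directory)
    return tail, size_after_truncate


if __name__ == "__main__":
    print(demo_file_operations())
```

Note that the calls after `os.open` take a file descriptor rather than a name, which anticipates the open-file mechanism: the directory is searched once, at open time.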
Open Files
Most file operations require searching the directory for the entry associated with the file. To avoid this constant search, most systems require that a file be opened before its use.
Open(Fi) – search the directory structure on disk for entry Fi, and move the content of the entry to memory
Close(Fi) – move the content of entry Fi in memory to the directory structure on disk
Several pieces of data are needed to manage open files:
File pointer: pointer to last read/write location, per process that has the file open
File-open count: counter of the number of times a file is open, to allow removal of data from the open-file table when the last process closes it
Disk location of the file: cache of data access information
Access rights: per-process access mode information
File Types – Name, Extension
The file type tells the OS what can be done with the file.
Typically implemented as the extension of the filename.
In UNIX systems, a crude “magic number” is stored at the beginning of some files to indicate the type of the file (executable, shell script, ...)
In Mac OS X, each file has a type (e.g., TEXT or APPL). Each file also has a creator attribute that is set to the program that created it.
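A magic-number check amounts to comparing the first few bytes of a file against known signatures. The sketch below recognizes two well-known ones: the `\x7fELF` prefix of ELF binaries and the `#!` interpreter line of scripts. The function name and the returned labels are hypothetical, and real tools know many more signatures.

```python
def sniff_file_type(header: bytes) -> str:
    """Guess a file's type from its leading bytes."""
    if header.startswith(b"\x7fELF"):
        return "ELF executable or object file"
    if header.startswith(b"#!"):
        return "script with an interpreter line"
    return "unknown"


if __name__ == "__main__":
    print(sniff_file_type(b"#!/bin/sh\necho hello\n"))
    print(sniff_file_type(b"\x7fELF\x02\x01\x01"))
    print(sniff_file_type(b"plain text"))
```

Unlike an extension, the magic number travels with the file contents, so renaming the file does not change what the check reports.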
File Structure
None - sequence of words, bytes

This is the structure supported by UNIX systems
Simple record structure



Lines
Fixed length
Variable length
Complex Structures


Formatted document
Relocatable load file
Can simulate last two with first method by inserting appropriate control characters
Who decides:


Operating system
Program
Open File Locking
Provided by some operating systems and file systems
Mediates access to a file
Mandatory or advisory:


Mandatory – access is denied depending on locks held and requested
Advisory – processes can find status of locks and decide what to do
Access Methods
Sequential Access
read next
write next
reset
no read after last write (rewrite)
Direct Access (available when the file is made up of fixed-length logical records, useful in databases)
read n
write n
position to n
read next
write next
rewrite n
n = relative block number
Directory Structure
A collection of nodes containing information about all files
[Figure: a directory whose entries point to files F1, F2, F3, F4, ..., Fn]
Both the directory structure and the files reside on disk
Backups of these two structures are kept on tapes
A Typical File-system Organization
Partition/volume/minidisk: A chunk of storage that holds a filesystem. Each one contains a directory that records information about the files it holds.
Operations Performed on Directory
A directory is effectively a symbol table that translates file names
into their directory entries.






Search for a file
 Given a name or a pattern of names, we should be able to find all the files that use it.
Create a file
 touch assignment3.c
Delete a file
 rm assignment3.c
List a directory
 ls
Rename a file
 mv assignment3.c odev3.c
Traverse the file system
 cd include
Organize the Directory (Logically) to Obtain
Efficiency – locating a file quickly
Naming – convenient to users


Two users can have same name for different files
The same file can have several different names
Grouping – logical grouping of files by properties (e.g., all Java programs, all games, …)
Single-Level Directory
A single directory for all users
Naming problem:
Who will use the name assignment3.c?
Each student has to use a different name: e123456assignment3.c
Grouping problem
Listing would be very crowded.
Two-Level Directory
Separate directory for each user

Path name

In MS-DOS, C:\userx\test.bat

In VMS, volume:[userx.home]test.bat;1

Can have the same file name for different users

Efficient searching

No grouping capability
Tree-Structured Directories
A directory is simply another file that needs to be treated in a special way.
Tree-Structured Directories (Cont)
Efficient searching
directory entry sizes would be manageable
Grouping Capability
Current directory (working directory)


cd /spell/mail/prog
type list
Tree-Structured Directories (Cont)
Absolute or relative path name
Creating a new file is done in current directory
Delete a file
rm <file-name>
Creating a new subdirectory is done in current directory
mkdir <dir-name>
Example: if in current directory /mail
mkdir count
[Figure: directory mail containing prog, copy, prt, exp, and count]
Deleting “mail” ⇒ deleting the entire subtree rooted at “mail”
Acyclic-Graph Directories
Have shared subdirectories and files
Acyclic-Graph Directories (Cont.)
Two different names (aliasing)
If dict deletes list ⇒ dangling pointer
Solutions:



Backpointers, so we can delete all pointers
Variable size records a problem
Backpointers using a daisy chain organization
Entry-hold-count solution
New directory entry type


Link – another name (pointer) to an existing file
Resolve the link – follow pointer to locate the file
General Graph Directory
General Graph Directory (Cont.)
How do we guarantee no cycles?



Allow only links to files, not subdirectories
Garbage collection
Every time a new link is added, use a cycle detection algorithm to determine whether it is OK
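The cycle check on link creation can be sketched as a reachability test: a new link from directory `parent` to directory `target` closes a cycle exactly when `parent` is already reachable from `target`. The representation of the directory graph as a dictionary of child lists, and the function name, are assumptions made for illustration.

```python
from typing import Dict, List


def would_create_cycle(children: Dict[str, List[str]], parent: str, target: str) -> bool:
    """True if adding the link parent -> target would make the directory graph cyclic."""
    stack = [target]
    visited = set()
    while stack:
        node = stack.pop()
        if node == parent:
            return True          # parent is reachable from target
        if node in visited:
            continue
        visited.add(node)
        stack.extend(children.get(node, []))
    return False


if __name__ == "__main__":
    # Hypothetical tree: / contains a and b; a contains c.
    tree = {"/": ["a", "b"], "a": ["c"], "b": [], "c": []}
    print(would_create_cycle(tree, "c", "a"))   # linking c -> a loops back to a
    print(would_create_cycle(tree, "b", "c"))   # linking b -> c only shares c
```

The search may visit the whole subtree under `target`, which is why running it on every link creation is considered expensive compared with simply forbidding links to directories.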
File System Mounting
A file system must be mounted before it can be accessed
An unmounted file system is mounted at a mount point
File Sharing
Sharing of files on multi-user systems is desirable
Sharing may be done through a protection scheme
On distributed systems, files may be shared across a network
Network File System (NFS) is a common distributed file-sharing method
File Sharing – Multiple Users
User IDs identify users, allowing permissions and protections to be per-user
Group IDs allow users to be in groups, permitting group access rights
Protection
File owner/creator should be able to control:


what can be done
by whom
Types of access






Read
Write
Execute
Append
Delete
List
Access Lists and Groups
Mode of access: read, write, execute
Three classes of users
                     RWX
a) owner access   7  111
b) group access   6  110
c) public access  1  001
Ask manager to create a group (unique name), say G, and add some users to the group.
For a particular file (say game) or subdirectory, define an appropriate access:
chmod 761 game   (owner 7, group 6, public 1)
Attach a group to a file:
chgrp G game
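The octal mode used with chmod packs three RWX triples into nine bits. The sketch below decodes a mode into the familiar rwx string, using 761 from the example above. The function name is hypothetical.

```python
def mode_to_string(mode: int) -> str:
    """Render the owner, group, and public permission bits as an rwx string."""
    text = ""
    for shift in (6, 3, 0):                  # owner, group, public
        triple = (mode >> shift) & 0b111
        text += "r" if triple & 0b100 else "-"
        text += "w" if triple & 0b010 else "-"
        text += "x" if triple & 0b001 else "-"
    return text


if __name__ == "__main__":
    print(mode_to_string(0o761))
```

For 761 this gives full access for the owner, read and write for the group, and execute only for everyone else.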
Windows XP Access-control List Management
A Sample UNIX Directory Listing
File Sharing – Remote File Systems
Uses networking to allow file system access between systems
Manually via programs like FTP
Automatically, seamlessly using distributed file systems
Semi-automatically via the World Wide Web
Client-server model allows clients to mount remote file systems from servers
Server can serve multiple clients
Client and user-on-client identification is insecure or complicated
NFS is the standard UNIX client-server file sharing protocol
CIFS is the standard Windows protocol
Standard operating system file calls are translated into remote calls
Distributed Information Systems (distributed naming services) such as LDAP, DNS, NIS, and Active Directory implement unified access to information needed for remote computing
File Sharing – Failure Modes
Remote file systems add new failure modes, due to network failure and server failure
Recovery from failure can involve state information about the status of each remote request
Stateless protocols such as NFS include all information in each request, allowing easy recovery but less security
File Sharing – Consistency Semantics
Consistency semantics specify how multiple users are to access a shared file simultaneously
Similar to Ch 7 process synchronization algorithms
 Tend to be less complex due to disk I/O and network latency (for remote file systems)
Andrew File System (AFS) implemented complex remote file sharing semantics
Unix file system (UFS) implements:
 Writes to an open file visible immediately to other users of the same open file
 Sharing file pointer to allow multiple users to read and write concurrently
AFS has session semantics
 Writes only visible to sessions starting after the file is closed
MODERN OPERATING SYSTEMS
Third Edition
ANDREW S. TANENBAUM
Chapter 4
File Systems
Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved. 0-13-6006639
File Systems (1)
Essential requirements for long-term information storage:
• It must be possible to store a very large amount of information.
• The information must survive the termination of the process using it.
• Multiple processes must be able to access the information concurrently.
File Systems (2)
Think of a disk as a linear sequence of fixed-size blocks that supports reading and writing of blocks. Questions that quickly arise:
• How do you find information?
• How do you keep one user from reading another’s data?
• How do you know which blocks are free?
File Naming
Figure 4-1. Some typical file extensions.
File Structure
Figure 4-2. Three kinds of files. (a) Byte sequence.
(b) Record sequence. (c) Tree.
File Types
Figure 4-3. (a) An executable file. (b) An archive.
File Attributes
Figure 4-4a. Some possible file attributes.
File Operations
The most common system calls relating to files:
• Create
• Delete
• Open
• Close
• Read
• Write
• Append
• Seek
• Get Attributes
• Set Attributes
• Rename
Example Program Using File System Calls (1)
...
Figure 4-5. A simple program to copy a file.
Example Program Using File System Calls (2)
Figure 4-5. A simple program to copy a file.
Hierarchical Directory Systems (1)
Figure 4-6. A single-level directory system containing four files.
Hierarchical Directory Systems (2)
Figure 4-7. A hierarchical directory system.
Path Names
Figure 4-8. A UNIX directory tree.
Directory Operations
System calls for managing directories:
• Create
• Delete
• Opendir
• Closedir
• Readdir
• Rename
• Link
• Unlink
File System Layout
Figure 4-9. A possible file system layout.
Contiguous Allocation
Figure 4-10. (a) Contiguous allocation of disk space for 7 files.
(b) The state of the disk after files D and F have been removed.
Linked List Allocation
Figure 4-11. Storing a file as a linked list of disk blocks.
Linked List Allocation Using a Table in Memory
Figure 4-12. Linked list allocation using a file allocation table in
main memory.
I-nodes
Figure 4-13. An example i-node.
Implementing Directories (1)
Figure 4-14. (a) A simple directory containing fixed-size entries with
the disk addresses and attributes in the directory entry. (b) A
directory in which each entry just refers to an i-node.
Implementing Directories (2)
Figure 4-15. Two ways of handling long file names in a directory.
(a) In-line. (b) In a heap.
Shared Files (1)
Figure 4-16. File system containing a shared file.
Shared Files (2)
Figure 4-17. (a) Situation prior to linking. (b) After the link is
created. (c) After the original owner removes the file.
Journaling File Systems
Operations required to remove a file in UNIX:
• Remove the file from its directory.
• Release the i-node to the pool of free i-nodes.
• Return all the disk blocks to the pool of free disk blocks.
Virtual File Systems (1)
Figure 4-18. Position of the virtual file system.
Virtual File Systems (2)
Figure 4-19. A simplified view of the data structures and code
used by the VFS and concrete file system to do a read.
Disk Space Management Block Size (1)
Figure 4-20. Percentage of files smaller than a given size
(in bytes).
Disk Space Management Block Size (2)
Figure 4-21. The solid curve (left-hand scale) gives the data rate of
a disk. The dashed curve (right-hand scale) gives the disk
space efficiency. All files are 4 KB.
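The shape of the two curves can be approximated with a short calculation. This is a sketch under stated assumptions: the seek time, rotation time, and track capacity below are illustrative values, not figures taken from the slide. Only the 4 KB file size comes from the caption.

```python
SEEK_MS = 5.0             # assumed average seek time
ROTATION_MS = 8.33        # assumed time for one rotation (about 7200 RPM)
TRACK_BYTES = 1_000_000   # assumed track capacity
FILE_BYTES = 4096         # all files are 4 KB, as in the figure

def data_rate_mb_per_s(block_bytes):
    """Data rate when every block read pays a seek and half a rotation."""
    time_ms = (SEEK_MS + ROTATION_MS / 2
               + (block_bytes / TRACK_BYTES) * ROTATION_MS)
    return (block_bytes / 1e6) / (time_ms / 1e3)

def space_efficiency(block_bytes):
    """Fraction of the allocated space that actually holds file data."""
    blocks = -(-FILE_BYTES // block_bytes)    # ceiling division
    return FILE_BYTES / (blocks * block_bytes)
```

Larger blocks raise the data rate because the fixed seek and rotational delay are amortized over more bytes, but once the block exceeds 4 KB the efficiency halves with each doubling.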
Keeping Track of Free Blocks (1)
Figure 4-22. (a) Storing the free list on a linked list. (b) A bitmap.
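Option (b) can be sketched in a few lines. The class below is a hypothetical illustration, assuming the convention that a 1 bit means the block is free; a real file system would keep the bitmap in disk blocks rather than in a Python bytearray.

```python
class BlockBitmap:
    """Free-space bitmap: bit i is 1 when block i is free (assumed convention)."""

    def __init__(self, n_blocks):
        self.n_blocks = n_blocks
        self.bits = bytearray([0xFF] * ((n_blocks + 7) // 8))  # all free

    def is_free(self, block):
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def allocate(self):
        """Return the lowest-numbered free block and mark it used."""
        for block in range(self.n_blocks):
            if self.is_free(block):
                self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF
                return block
        raise OSError("no free blocks")

    def free(self, block):
        self.bits[block // 8] |= 1 << (block % 8)
```

The bitmap needs one bit per block regardless of how many blocks are free, whereas the linked list needs a full block number per free block.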
Keeping Track of Free Blocks (2)
Figure 4-23. (a) An almost-full block of pointers to free disk blocks
in memory and three blocks of pointers on disk. (b) Result of
freeing a three-block file. (c) An alternative strategy for
handling the three free blocks. The shaded entries represent
pointers to free disk blocks.
Disk Quotas
Figure 4-24. Quotas are kept track of on a per-user basis
in a quota table.
File System Backups (1)
Backups to tape are generally made to handle one of two potential problems:
• Recover from disaster.
• Recover from stupidity.
File System Backups (2)
Figure 4-25. A file system to be dumped. Squares are directories,
circles are files. Shaded items have been modified since last
dump. Each directory and file is labeled by its i-node number.
File System Backups (3)
Figure 4-26. Bitmaps used by the logical dumping algorithm.
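The marking phases of a logical dump can be sketched as follows. This assumes the usual description of the algorithm: first mark every directory and every modified file, then unmark directories that have nothing modified in or under them, then dump the marked directories followed by the marked files. The tree representation and function name are hypothetical.

```python
def mark_for_dump(tree, modified, root):
    """tree maps a directory i-node to its children; files do not appear as keys.

    Returns (directories to dump, files to dump), each sorted by i-node number.
    """
    marked = set()

    # Phase 1: mark every directory and every modified file.
    def phase1(node):
        if node in tree:
            marked.add(node)
            for child in tree[node]:
                phase1(child)
        elif node in modified:
            marked.add(node)

    # Phase 2: unmark directories with nothing modified in or under them.
    def phase2(node):
        if node not in tree:
            return node in marked
        keep = node in modified
        for child in tree[node]:
            if phase2(child):      # visit every child, no short-circuit
                keep = True
        if not keep:
            marked.discard(node)
        return keep

    phase1(root)
    phase2(root)
    # Phases 3 and 4 would dump the directories first, then the files.
    dirs = sorted(n for n in marked if n in tree)
    files = sorted(n for n in marked if n not in tree)
    return dirs, files
```

Unmodified directories on the path to a modified file stay marked, so the dumped file can be restored into a fresh file system with its full path.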
File System Consistency
Figure 4-27. File system states. (a) Consistent. (b) Missing block.
(c) Duplicate block in free list. (d) Duplicate data block.
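The four states can be detected with two counters per block: how many times the block appears in a file and how many times it appears in the free list. The sketch below is a simplified illustration; the function name and input format are hypothetical, and the last case it reports is not one of the four states in the figure.

```python
def check_blocks(n_blocks, files, free_list):
    """Classify inconsistent blocks.

    files maps a file name to its list of block numbers;
    free_list is the list of block numbers recorded as free.
    """
    in_use = [0] * n_blocks
    free = [0] * n_blocks
    for blocks in files.values():
        for b in blocks:
            in_use[b] += 1
    for b in free_list:
        free[b] += 1

    report = {}
    for b in range(n_blocks):
        if in_use[b] == 0 and free[b] == 0:
            report[b] = "missing block"
        elif in_use[b] == 0 and free[b] > 1:
            report[b] = "duplicate block in free list"
        elif in_use[b] > 1:
            report[b] = "duplicate data block"
        elif in_use[b] == 1 and free[b] >= 1:
            report[b] = "in use and on free list"   # extra case, not in the figure
    return report
```

A consistent file system has exactly one 1 across the two counters for every block, so the report is empty.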
Caching (1)
Figure 4-28. The buffer cache data structures.
Caching (2)
• Some blocks, such as i-node blocks, are rarely referenced two times within a short interval.
• Consider a modified LRU scheme, taking two factors into account:
  • Is the block likely to be needed again soon?
  • Is the block essential to the consistency of the file system?
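One way to act on these two questions is sketched below. It is an assumption-laden illustration, not a specific kernel's design: blocks unlikely to be needed again are placed at the eviction end of the LRU list, and modified blocks that are critical to consistency are written through immediately. The class, its parameters, and the write_block callback are hypothetical.

```python
from collections import OrderedDict

class BufferCache:
    """Toy buffer cache with a modified LRU policy (hypothetical design)."""

    def __init__(self, capacity, write_block):
        self.capacity = capacity
        self.write_block = write_block   # callback standing in for a disk write
        self.blocks = OrderedDict()      # front = next eviction victim
        self.dirty = set()

    def put(self, block_no, data, needed_soon=True, critical=False, modified=False):
        if block_no in self.blocks:
            del self.blocks[block_no]
        elif len(self.blocks) >= self.capacity:
            victim, victim_data = self.blocks.popitem(last=False)
            if victim in self.dirty:     # flush a dirty victim before reuse
                self.write_block(victim, victim_data)
                self.dirty.discard(victim)
        self.blocks[block_no] = data     # rear = most recently used
        if not needed_soon:
            # Unlikely to be reused: make this buffer the first to be evicted.
            self.blocks.move_to_end(block_no, last=False)
        if modified:
            if critical:
                self.write_block(block_no, data)   # write through at once
                self.dirty.discard(block_no)
            else:
                self.dirty.add(block_no)           # write back on eviction
```

With this policy a modified i-node block reaches the disk immediately, while an ordinary data block is written only when its buffer is reclaimed.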
Reducing Disk Arm Motion
Figure 4-29. (a) I-nodes placed at the start of the disk.
(b) Disk divided into cylinder groups, each with its own blocks
and i-nodes.
The ISO 9660 File System
Figure 4-30. The ISO 9660 directory entry.
Rock Ridge Extensions
Rock Ridge extension fields:
• PX - POSIX attributes.
• PN - Major and minor device numbers.
• SL - Symbolic link.
• NM - Alternative name.
• CL - Child location.
• PL - Parent location.
• RE - Relocation.
• TF - Time stamps.
Joliet Extensions
Joliet extension fields:
• Long file names.
• Unicode character set.
• Directory nesting deeper than eight levels.
• Directory names with extensions.
The MS-DOS File System (1)
Figure 4-31. The MS-DOS directory entry.
The MS-DOS File System (2)
Figure 4-32. Maximum partition size for different block sizes. The
empty boxes represent forbidden combinations.
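The filled entries of such a table follow from multiplying the number of addressable blocks by the block size. The sketch below assumes that FAT-32 uses 28 of its 32 bits for block addresses and that partitions are additionally capped at 2 TB by 32-bit counts of 512-byte sectors; it does not model which combinations are forbidden.

```python
KB, MB, TB = 2**10, 2**20, 2**40

# Bits available for a block address in each FAT variant (FAT-32: assumed 28).
ADDRESS_BITS = {"FAT-12": 12, "FAT-16": 16, "FAT-32": 28}

# Assumed cap: 2**32 sectors of 512 bytes each.
SECTOR_LIMIT_BYTES = 2**32 * 512

def max_partition_bytes(fat_type, block_bytes):
    """Largest partition addressable with the given FAT type and block size."""
    addressable = (2 ** ADDRESS_BITS[fat_type]) * block_bytes
    return min(addressable, SECTOR_LIMIT_BYTES)
```

For example, FAT-16 with 32 KB blocks reaches 2^16 x 32 KB = 2 GB, which is why larger disks required FAT-32.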
The UNIX V7 File System (1)
Figure 4-33. A UNIX V7 directory entry.
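A V7 directory entry is 16 bytes: a 2-byte i-node number followed by a 14-byte file name. The sketch below packs and unpacks one entry with Python's struct module. The little-endian byte order is an assumption (matching the PDP-11), and the helper names are hypothetical.

```python
import struct

# 2-byte i-node number, then a 14-byte name padded with NUL bytes.
V7_DIRENT = struct.Struct("<H14s")

def pack_entry(inode, name):
    raw = name.encode("ascii")
    if len(raw) > 14:
        raise ValueError("V7 file names are limited to 14 characters")
    return V7_DIRENT.pack(inode, raw)   # struct pads short names with NULs

def unpack_entry(data):
    inode, raw = V7_DIRENT.unpack(data)
    return inode, raw.rstrip(b"\0").decode("ascii")
```

The 2-byte i-node field limits a file system to 65,536 i-nodes.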
The UNIX V7 File System (2)
Figure 4-34. A UNIX i-node.
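The i-node holds a few direct disk addresses plus single, double, and triple indirect blocks. The sketch below maps a file's logical block index to the level that holds its address. The constants are assumptions for illustration: 10 direct addresses, and 256 addresses per indirect block (1 KB blocks with 4-byte addresses).

```python
N_DIRECT = 10            # assumed number of direct addresses in the i-node
PTRS_PER_BLOCK = 256     # assumed: 1 KB blocks, 4-byte disk addresses

def locate(block_index):
    """Return (level, slots): the level used and the slot index at each step."""
    if block_index < N_DIRECT:
        return "direct", [block_index]
    n = block_index - N_DIRECT
    if n < PTRS_PER_BLOCK:
        return "single indirect", [n]
    n -= PTRS_PER_BLOCK
    if n < PTRS_PER_BLOCK ** 2:
        return "double indirect", [n // PTRS_PER_BLOCK, n % PTRS_PER_BLOCK]
    n -= PTRS_PER_BLOCK ** 2
    if n < PTRS_PER_BLOCK ** 3:
        return "triple indirect", [
            n // PTRS_PER_BLOCK ** 2,
            (n // PTRS_PER_BLOCK) % PTRS_PER_BLOCK,
            n % PTRS_PER_BLOCK,
        ]
    raise ValueError("block index beyond the maximum file size")
```

Small files are reached with no extra disk reads beyond the i-node, while each indirection level adds one more block read.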
The UNIX V7 File System (3)
Figure 4-35. The steps in looking up /usr/ast/mbox.
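The lookup walks the path one component at a time: read the current directory, find the component's i-node number, and continue from that i-node. The sketch below models directories as in-memory dictionaries; the i-node numbers and directory contents are illustrative values, and the step of reading disk blocks through each i-node is omitted.

```python
ROOT_INODE = 1

# Directory i-node -> {name: i-node number}; illustrative contents only.
DIRECTORIES = {
    1: {"bin": 4, "dev": 7, "usr": 6},
    6: {"dick": 19, "ast": 26},
    26: {"mbox": 60, "src": 17},
}

def lookup(path):
    """Return the i-node numbers visited while resolving an absolute path."""
    inode = ROOT_INODE
    visited = [inode]
    for component in path.strip("/").split("/"):
        if not component:
            continue                       # path was just "/"
        directory = DIRECTORIES.get(inode)
        if directory is None:
            raise NotADirectoryError(component)
        if component not in directory:
            raise FileNotFoundError(path)
        inode = directory[component]
        visited.append(inode)
    return visited
```

Resolving /usr/ast/mbox therefore touches one directory per component, which is why deep paths cost more disk accesses unless the directories are cached.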