CS471-11/14

CS 471 - Lecture 8
File Systems
Ch. 10,11
George Mason University
Fall 2009
File-System Interface





File Concept
File Operations
Access Methods
Directory Structure
Access control
GMU – CS 571
10.2
Files



A file is a named collection of related information that
is recorded on secondary storage
Several information storage media
(magnetic/optical disks)
The operating system provides a uniform logical
view of information storage
Files

Files
• are mapped onto physical storage devices.
• represent programs (both source and object forms) and data.
• have a certain structure that may be considered as a sequence of bits, bytes, lines, records…, with the meaning defined by the file’s creator.
• have attributes that are recorded by the O.S. (name, size, type, location, protection info, time info, etc.).
• are logically contiguous.
Information about files is kept in the directory structure, which is also maintained on secondary storage.
Basic File Operations





Create
Write
Read
Delete
Others
• reposition within the file, append, rename,
truncate, ...

For write/read operations, the operating system
needs to keep a file position pointer for each
process
• Need to update it dynamically and properly
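The position pointer can be observed from user space on a POSIX system. The sketch below is only an illustration: the helper name position_after_demo and the temporary-file name are made up for this example, and it assumes /tmp is writable. It writes 10 bytes, repositions to the start, reads 4 bytes, and reports where the pointer ends up.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Returns the file offset after writing 10 bytes, rewinding, and reading 4.
   Returns -1 if any step fails. The name and path are illustrative only. */
long position_after_demo(void)
{
    char path[] = "/tmp/posdemoXXXXXX";    /* hypothetical temp-file template */
    char buf[4];
    long pos;
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);                           /* file disappears once fd is closed */

    if (write(fd, "0123456789", 10) != 10) { close(fd); return -1; }
    if (lseek(fd, 0, SEEK_CUR) != 10)      { close(fd); return -1; } /* write advanced the pointer */

    if (lseek(fd, 0, SEEK_SET) != 0)       { close(fd); return -1; } /* explicit reposition */
    if (read(fd, buf, sizeof buf) != 4)    { close(fd); return -1; }

    pos = (long)lseek(fd, 0, SEEK_CUR);     /* read advanced the pointer again */
    close(fd);
    return pos;
}
```

Calling position_after_demo() should report an offset of 4: the write moved the pointer to 10, the reposition reset it to 0, and the read advanced it by the number of bytes read.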
File Operations


To avoid searching the directory entries
repeatedly, many systems require that an open()
system call be issued before that file is first
used actively.
The operating system keeps
• a system-wide open-file table containing information about all open files
• per-process open-file tables containing information about all open files of each process
The open operation takes a file name and
searches the directory, copying the directory
entry into the open-file table. It returns a pointer
to the entry in the open file table.
File Operations


The per-process open table contains info about
• Position pointer (current location within file)
• Access rights
• Accounting
• Pointer to the system-wide open-file table entry
The system-wide open table includes info about
• File location on the disk
• File size
• File open count (the number of processes using this file)
A process that completes its operations on a given
file will issue a close() system call.
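The two tables can be modeled in a few lines of C. This is a toy model, not how any particular kernel lays out its tables: the struct and function names (sys_file, proc_file, sim_open, sim_close), the table size, and the use of an i-node number to stand for the file location are all assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SYS_OPEN 16   /* illustrative table size */

/* System-wide entry: one per open file, shared by every process using it. */
struct sys_file {
    int  in_use;
    int  inode_no;      /* stands in for "file location on the disk" */
    long size;
    int  open_count;    /* number of processes that have the file open */
};

/* Per-process entry: private position pointer plus a pointer to the shared entry. */
struct proc_file {
    long position;
    int  access_rights;
    struct sys_file *sys;
};

static struct sys_file sys_table[MAX_SYS_OPEN];

/* open: reuse the system-wide entry if the file is already open, else claim a free one. */
int sim_open(struct proc_file *pf, int inode_no, long size, int rights)
{
    struct sys_file *slot = NULL;
    for (int i = 0; i < MAX_SYS_OPEN; i++) {
        if (sys_table[i].in_use && sys_table[i].inode_no == inode_no) {
            slot = &sys_table[i];           /* already open: share this entry */
            break;
        }
        if (!sys_table[i].in_use && slot == NULL)
            slot = &sys_table[i];           /* remember the first free entry */
    }
    if (slot == NULL) return -1;            /* table full */
    if (!slot->in_use) {
        slot->in_use = 1;
        slot->inode_no = inode_no;
        slot->size = size;
        slot->open_count = 0;
    }
    slot->open_count++;
    pf->position = 0;
    pf->access_rights = rights;
    pf->sys = slot;
    return 0;
}

/* close: drop one reference; the system-wide entry is freed when the count hits zero. */
void sim_close(struct proc_file *pf)
{
    assert(pf->sys != NULL && pf->sys->open_count > 0);
    if (--pf->sys->open_count == 0)
        pf->sys->in_use = 0;
    pf->sys = NULL;
}
```

Two processes that open the same file end up with separate proc_file entries (and therefore separate position pointers) that point to one shared sys_file entry whose open_count is 2.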
File Operations (Cont.)
[Figure: Process A’s Open-File Table and Process B’s Open-File Table, with entries referring to the System-Wide Open-File Table]
An Example Program Using File System Calls (1/3)
/* File copy program. Error checking and reporting is minimal. */
/* “myfilecopy oldfile newfile” will copy the contents of “oldfile” to
“newfile” */
/* The program will read blocks of 4K from the “oldfile” to a buffer, and
store them to “newfile” sequentially */
#include <sys/types.h> /* include necessary header files */
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char *argv[]); /* ANSI prototype */
#define BUF_SIZE 4096 /* use a buffer size of 4096 bytes */
#define OUTPUT_MODE 0700 /* protection bits for output file */
An Example Program Using File System Calls (2/3)
int main(int argc, char *argv[])
{
    int in_fd, out_fd, rd_count, wt_count;
    char buffer[BUF_SIZE];

    if (argc != 3) exit(1); /* error if argc is not 3 */

    /* Open the input file and create the output file */
    in_fd = open(argv[1], O_RDONLY); /* open the source file */
    if (in_fd < 0) exit(2); /* if it cannot be opened, exit */
    out_fd = creat(argv[2], OUTPUT_MODE); /* create the destination file */
    if (out_fd < 0) exit(3); /* if it cannot be created, exit */
An Example Program Using File System Calls (3/3)
    /* Copy loop */
    while (1) {
        rd_count = read(in_fd, buffer, BUF_SIZE); /* read a block of data */
        if (rd_count <= 0) break; /* if end of file or error, exit loop */
        wt_count = write(out_fd, buffer, rd_count); /* write data */
        if (wt_count <= 0) exit(4); /* wt_count <= 0 is an error */
    }

    /* Close the files */
    close(in_fd);
    close(out_fd);
    if (rd_count == 0) /* no error on last read */
        exit(0);
    else
        exit(5); /* error on last read */
}
File Types


Most operating systems associate a type with a
file
File type can be used to operate on files in
reasonable ways
• ex: Windows – file type (i.e. suffix) used to
determine what program to open a file with
• ex: Unix – info stored in file (‘magic number’) can
be used for differentiation – suffix not always
used
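A magic-number check can be sketched as a comparison of the first bytes of a file against known signatures. The function name guess_type is hypothetical, and only three widely known signatures are tested here (the ELF header bytes 0x7F 'E' 'L' 'F', the "%PDF" prefix, and the "#!" script prefix); a real tool such as file(1) consults a much larger database.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Guess a file type from its leading bytes. Illustrative only:
   three signatures are checked and everything else is "unknown". */
const char *guess_type(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    /* "\x7f" and "ELF" are separate literals so the escape does not swallow the 'E'. */
    if (len >= 4 && memcmp(buf, "\x7f" "ELF", 4) == 0) return "ELF executable";
    if (len >= 4 && memcmp(buf, "%PDF", 4) == 0)       return "PDF document";
    if (len >= 2 && memcmp(buf, "#!", 2) == 0)         return "script";
    return "unknown";
}
```

The caller would read the first few bytes of the file into buf; the file name and its suffix play no part in the decision.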
File Types – Name, Extension
File Structure


None - sequence of words, bytes
Simple record structure
• Lines
• Fixed length
• Variable length
Complex Structures
• Formatted document
• Relocatable load file
Can simulate last two methods with first method by inserting appropriate control characters
Who decides:
• Operating system
• Program
Internal File Structure






Disk systems have a well-defined
block size determined by the size of
a sector.
All disk I/O is performed in units of
one block (physical record).
• Each block is one or more sectors
• A sector can hold 32 – 4096 bytes
Files are made of logical records.
Often, a number of logical records
will be packed into physical records.
Operating System will perform
translation from logical records to
physical records.
Internal fragmentation: because space is allocated in whole blocks, part of the last block of a file is usually wasted.
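The packing arithmetic can be written out as three small helpers. The function names are made up for this sketch, and it assumes that a logical record is never split across two physical blocks.

```c
#include <assert.h>

/* Number of whole logical records that fit in one physical block. */
long records_per_block(long block_size, long record_size)
{
    assert(block_size > 0 && record_size > 0);
    return block_size / record_size;
}

/* Bytes left unused in each block when records may not span blocks. */
long wasted_per_block(long block_size, long record_size)
{
    return block_size - records_per_block(block_size, record_size) * record_size;
}

/* Blocks needed to hold file_size bytes; the last block is usually only partly used. */
long blocks_needed(long file_size, long block_size)
{
    assert(file_size >= 0 && block_size > 0);
    return (file_size + block_size - 1) / block_size;
}
```

With 512-byte blocks and 120-byte records, 4 records fit per block and 32 bytes per block go unused. A 1949-byte file needs 4 blocks (2048 bytes), so 99 bytes of its last block are lost to internal fragmentation.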
File Access Methods

Sequential Access
• Information is processed in order, one record after the
other (tape model)
• Example: editors and compilers
read next
write next
reset (rewind)
File Access Methods




Direct Access
• The file is made up of fixed-length logical records that allow programs to read and write records rapidly in any order
    read n
    write n
  or alternatively:
    position to n
    read next
    write next
  n = relative block number
• A request to read block N is translated into the physical address B*N + start (for block size B)
• ex: database
• Other access methods are often built on top of direct access
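The translation B*N + start is a one-line computation. The function name block_address is an assumption for this sketch, and start is taken to be the byte address at which the file begins.

```c
#include <assert.h>

/* Byte address of relative block n, for block size b and a file that begins at `start`. */
long block_address(long n, long b, long start)
{
    assert(n >= 0 && b > 0 && start >= 0);
    return b * n + start;
}
```

For example, with 512-byte blocks and a file that begins at byte 4096, relative block 3 is found at 512*3 + 4096 = 5632, with no need to read blocks 0 to 2 first.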
Directory Structure


The directory acts as a symbol table that
translates file names into their directory entries.
Operations on a directory
• Search for a file
• Create a file
• Delete a file
• List a directory
• Rename a file
• …
Organize the Directory (Logically) to Obtain


Efficiency – locating a file quickly
Naming – convenient to users
• Two users can have the same name for different files
• The same file can have several different names
Grouping – logical grouping of files by properties (e.g., all Java programs, all games, …)
Single-Level Directory

A single directory for all users
Naming problem
Grouping problem
Two-Level Directory

Separate directory for each user
• Path name
• Can have the same file name for different users
• Efficient searching
• No grouping capability
Tree Directory Structure

Tree-structured directories extend the structure to a tree of arbitrary height
• User-imposed structure
• Relative paths vs. absolute paths
• Directory deletion policy
• Concept of a ‘current directory’
Acyclic-Graph Directories


Allows shared subdirectories and files.
A shared file will “exist” in multiple directories
at once.
Achieving File Sharing


Option 1: Duplicate all information about the
shared file in both directories (Problem?)
Option 2: Create a new directory entry called link
• The link is effectively a pointer to another file or
directory
• When the directory entry of a referred file is a link,
we resolve the link by using the path name
(symbolic link in Unix)
• “ln -s reports/report1.txt myreport”
Achieving File Sharing (Cont.)

Option 3: Each entry in a directory can point to a
little data structure (File Control Block [FCB], or
“i-node”) that keeps information about the file
• The directory entries corresponding to a shared
file will all point to the same file control block
• Non-symbolic or “hard” links in Unix
• “ln reports/report1.txt myreport”
[Figure: the entry myreport in the “root” directory and the entry report1.txt in the “reports” directory both point to the FCB of the file]
Achieving File Sharing (Cont.)

What to do when a shared file is deleted by a user?
• The deletion of a link should not affect the original file
• If the original file is deleted, we may be left with dangling pointers.
Solutions
• Using backpointers, delete also all links. The search
may be expensive.
• Alternatively, leave the links intact until an attempt is
made to use them (Unix symbolic links).
May lead to infrequent but subtle problems.
• In case of non-symbolic (or in Unix, “hard”) links:
Preserve the file until all references are deleted. Keep
the count of the number of the references, delete the
file when the count reaches zero.
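The reference-counting rule for hard links can be sketched with a toy file control block. The struct and function names (fcb, add_link, remove_link) are hypothetical; a real i-node carries much more state, and a real system would release the data blocks where this sketch only sets a flag.

```c
#include <assert.h>

/* Minimal stand-in for a file control block (i-node). */
struct fcb {
    int link_count;   /* number of directory entries that refer to this file */
    int freed;        /* set once the file's blocks would be released */
};

/* Creating a hard link adds one more directory entry for the same FCB. */
void add_link(struct fcb *f)
{
    assert(!f->freed);
    f->link_count++;
}

/* Deleting a directory entry releases the file only when no entries remain. */
void remove_link(struct fcb *f)
{
    assert(f->link_count > 0);
    if (--f->link_count == 0)
        f->freed = 1;
}
```

With two links, the first remove_link leaves the file intact with a count of 1; only the second one frees it, so no directory entry is ever left dangling.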
File Protection

File owner/creator should be able to control:
• what can be done
• by whom

Types of access
• Read
• Write
• Execute
• Append
• Delete
• List
Access Lists and Groups


Mode of access: read, write, execute
Three classes of users
                        RWX
a) owner access    7 =  111
b) group access    6 =  110
c) public access   1 =  001
Ask manager to create a group (unique name), say G, and
add some users to the group.
For a particular file (say game) or subdirectory, define an
appropriate access.
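The 7/6/1 example corresponds to the octal mode 0761. A short helper (the name mode_to_string is made up for this sketch) expands such a mode into the familiar nine-character form used by UNIX directory listings.

```c
#include <assert.h>
#include <string.h>

/* Render a 9-bit mode such as 0761 as "rwxrw---x". `out` must hold 10 chars. */
void mode_to_string(unsigned mode, char out[10])
{
    static const char letters[] = "rwxrwxrwx";
    assert(out != NULL);
    for (int i = 0; i < 9; i++)
        out[i] = (mode & (1u << (8 - i))) ? letters[i] : '-';   /* bit 8 is owner read */
    out[9] = '\0';
}
```

For 0761 the result is rwxrw---x: the owner may read, write, and execute; the group may read and write; everyone else may only execute.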
Windows XP Access-control List Management
A Sample UNIX Directory Listing
File System Implementation




File System Structure
File System Implementation
Allocation Methods
File System Performance
File System Structure



An operating system may allow multiple file
systems.
Once the user interface is determined, the file
system must be implemented to map the logical
file system to the physical secondary-storage
devices.
File control block – storage structure that keeps
information about a given file (Unix “i-nodes”).
• Ownership, size, permissions, access date info,
location of data blocks
Schematic View of Virtual File System
Operating System Concepts – 7th Edition, Jan 1, 2005
11.33
Silberschatz, Galvin and Gagne ©2005
Layered File System




File system is organized into layers
Logical File System Layer manages the file-system structure (through directories and FCBs).
File-Organization Module performs mapping between logical blocks and physical blocks. It also includes free-space manager and block allocation manager.
Basic File System Layer issues generic commands to the appropriate device driver (I/O Control Layer) to read and write physical blocks on the disk
Storage Structure

A disk is a physical
memory storage device
that can be used for:
• a single file system (in its
entirety)
• multiple file systems
• in part for file systems, in
part for other purposes
(e.g. for swap space or
unformatted (raw) disk
space)

These parts are known as
partitions, slices or
minidisks.
Storage Structure (Cont.)


Each partition can be either “raw” (containing
no file system), or “cooked” (with a file system)
Raw disk
• contains a large sequential array of logical blocks,
without any file-system data
• can be used as swap space
• can be used for special (e.g. database)
applications
Storage Structure (Cont.)


Each partition that contains a file system has
a device directory
The device directory keeps information (name,
location, size, type, owner) for files on that partition.
Accessing Disk Sub-system


Disks allow direct
access to stored data
Disk access time has two components
• Random access time (positioning) that includes seek time and rotational latency (5-10 ms)
• Transfer time (at a transfer rate of about 10 MB/s)
Compare to the memory access time of 10-100 nanoseconds
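The two components can be combined into a rough estimate. The function name disk_read_ms is hypothetical, and the sketch assumes 1 MB = 10^6 bytes and a single positioning delay per request.

```c
#include <assert.h>

/* Estimated time in milliseconds to read `bytes` from disk:
   one positioning delay plus the transfer at the given rate. */
double disk_read_ms(double positioning_ms, double rate_mb_per_s, double bytes)
{
    assert(rate_mb_per_s > 0.0 && bytes >= 0.0);
    double transfer_ms = bytes / (rate_mb_per_s * 1e6) * 1e3;   /* MB taken as 10^6 bytes */
    return positioning_ms + transfer_ms;
}
```

With an 8 ms positioning time (inside the 5-10 ms range above) and 10 MB/s, reading one 4096-byte block takes about 8.4 ms, almost all of it positioning. That is tens of thousands of times slower than a 100 ns memory access, which is why caching and careful block placement matter so much.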
Accessing Disk Sub-system


When a process needs I/O, it issues a system call to the OS
• input or output
• from what disk address
• to what memory address
• how many sectors
At any point in time, the disk may have several pending requests that must be scheduled:
• FCFS
• SSTF (shortest seek time first)
• SCAN
• …
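The difference between FCFS and SSTF shows up in the total head movement. The sketch below is a simplified model: requests are plain cylinder numbers, all of them are pending at once, and the function names and the MAX_REQ bound are assumptions for the example.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_REQ 64   /* illustrative upper bound on the queue length */

/* Total head movement (in cylinders) when requests are served in arrival order. */
int fcfs_distance(int head, const int *req, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += abs(req[i] - head);
        head = req[i];
    }
    return total;
}

/* Total head movement when the closest pending request is always served next. */
int sstf_distance(int head, const int *req, int n)
{
    int done[MAX_REQ] = {0};
    int total = 0;
    assert(n >= 0 && n <= MAX_REQ);
    for (int served = 0; served < n; served++) {
        int best = -1;
        for (int i = 0; i < n; i++)
            if (!done[i] && (best < 0 || abs(req[i] - head) < abs(req[best] - head)))
                best = i;                   /* closest request not yet served */
        total += abs(req[best] - head);
        head = req[best];
        done[best] = 1;
    }
    return total;
}
```

For the sample queue 98, 183, 37, 122, 14, 124, 65, 67 with the head starting at cylinder 53, FCFS moves the head 640 cylinders in total, while SSTF needs only 236. SSTF can starve far-away requests, which is one motivation for SCAN.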
Implementation of “Open” and “Read”


Figure (a) refers to opening a file.
Figure (b) refers to reading a file.
Allocation Methods

The allocation method refers to how disk blocks
are allocated for files:
• Contiguous allocation
• Linked allocation
• Indexed allocation
Contiguous Allocation


Each file occupies a set of contiguous blocks on the
disk.
Simple – only starting location (block #) and length
(number of blocks) are required.
Contiguous Allocation



Efficient access to multiple blocks of a file
Both sequential and direct access can be supported.
A major problem is determining how much space is
needed for a new file.

How to let files grow?

Finding space for a new file: First-fit and best-fit …

These algorithms suffer from external fragmentation:
free space is broken into multiple chunks.
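First-fit and best-fit differ only in which free hole they pick. In this sketch the free space is an array of holes; the struct and function names are made up for the illustration, and splitting the chosen hole after allocation is left out.

```c
#include <assert.h>

/* A free region of the disk: `len` contiguous blocks starting at block `start`. */
struct hole { int start; int len; };

/* First-fit: index of the first hole that is large enough, or -1 if none is. */
int first_fit(const struct hole *h, int n, int need)
{
    assert(need > 0);
    for (int i = 0; i < n; i++)
        if (h[i].len >= need)
            return i;
    return -1;
}

/* Best-fit: index of the smallest hole that is still large enough, or -1. */
int best_fit(const struct hole *h, int n, int need)
{
    int best = -1;
    assert(need > 0);
    for (int i = 0; i < n; i++)
        if (h[i].len >= need && (best < 0 || h[i].len < h[best].len))
            best = i;
    return best;
}
```

Given holes of 5, 12, 3, and 6 blocks, a 6-block file goes into the 12-block hole under first-fit and into the 6-block hole under best-fit. A 20-block file cannot be placed at all even though 26 blocks are free in total, which is exactly the external fragmentation problem.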
Extent-Based Systems



Many newer file systems (e.g., Veritas File System) use a modified contiguous allocation scheme
Extent-based file systems allocate disk blocks in extents
An extent is a contiguous set of disk blocks
• Extents are allocated for file allocation
• A file consists of one or more extents.
Linked Allocation

Each file is a linked list of disk blocks: blocks
may be scattered anywhere on the disk.
[Figure: structure of a disk block, showing its pointer field]
Linked Allocation



Each file is a linked list of disk blocks: blocks may be
scattered anywhere on the disk.
Each block contains a pointer to the next block.
Each directory entry has a pointer to the first and last
disk blocks of the file.
Linked Allocation






External fragmentation is eliminated.
The size of a file does not need to be declared at
the time of creation.
However, it can be used effectively only for
sequential access files. Inefficient for direct-access files.
Another disadvantage is the space required for the
pointers.
One solution is to collect blocks into multiples
(clusters) and to allocate the clusters rather than
blocks.
Another problem of linked allocation is reliability:
what will happen if a pointer is lost or damaged?
File-Allocation Table (FAT)




A variation of the linked
allocation method
A section of the disk at
the beginning of each
partition is used as the
File Allocation Table.
The table entries give the
block number of the next
block in the file.
The scheme can result in
a significant number of
disk head seeks, unless
the FAT is cached.
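Following a file through the table can be sketched as a loop over "next block" entries. The table here is a plain int array held in memory, and FAT_EOF and fat_lookup are names chosen for this example; real FAT variants reserve specific entry values for end-of-chain and free blocks.

```c
#include <assert.h>

#define FAT_EOF (-1)   /* illustrative end-of-chain marker */

/* Follow the chain that starts at block `first` and return the disk block
   that holds logical block k of the file, or FAT_EOF if the file is shorter. */
int fat_lookup(const int *fat, int fat_len, int first, int k)
{
    int b = first;
    assert(k >= 0);
    while (k-- > 0 && b != FAT_EOF) {
        assert(b >= 0 && b < fat_len);
        b = fat[b];                 /* table entry gives the next block of the file */
    }
    return b;
}
```

Reaching logical block k costs k table lookups. With the table cached in memory these are cheap, whereas plain linked allocation would need k disk reads to follow the same chain.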
Indexed Allocation



Indexed allocation supports direct access,
without suffering from external fragmentation
or size-declaration problems.
However, wasted space may be a problem.
How large should the index block be?
• To reduce the wasted space, we want to keep the index block small
• If the index block is too small, it will not be able to hold pointers for a large file.
• Linked scheme
• Multilevel scheme
• Combined scheme
Indexed Allocation – Mapping (Cont.)

[Figure: two-level index mapping: outer-index, index table, file]
Combined Scheme (Unix)



Keep the first N pointers of the index block in the
file’s i-node (FCB).
The first 12 of these pointers point to direct blocks
The next three pointers point to indirect blocks
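The reach of the combined scheme can be estimated with a short calculation. The sketch assumes that the three indirect pointers are one single-, one double-, and one triple-indirect block (the classic UNIX layout, which the text above does not spell out), and the function name and the block and pointer sizes in the example are illustrative.

```c
#include <assert.h>

/* Maximum number of data blocks addressable with `direct` direct pointers plus
   one single-, one double-, and one triple-indirect pointer (assumed layout). */
unsigned long long max_file_blocks(unsigned long long direct,
                                   unsigned long long block_size,
                                   unsigned long long ptr_size)
{
    assert(block_size > 0 && ptr_size > 0 && ptr_size <= block_size);
    unsigned long long p = block_size / ptr_size;   /* pointers per index block */
    return direct + p + p * p + p * p * p;
}
```

With 4096-byte blocks and 4-byte pointers, each index block holds 1024 pointers, so a file can span 12 + 1024 + 1024^2 + 1024^3 = 1,074,791,436 blocks, a little over 4 TB. Small files are reached through the direct pointers alone, without touching any index block.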
File System Performance


Disk access is the bottleneck for the file system
performance
Caching
• Most disk controllers have an on-board cache that
can store entire tracks at a time
• Subsequent requests can be served through the
on-board cache

Most systems maintain a separate section of
main memory for a disk cache (block cache, or
buffer cache), where blocks are kept under the
assumption that they will be re-used in the near
future
Caching



A page cache caches
pages rather than disk
blocks using virtual
memory techniques
Memory-mapped I/O
uses a page cache
Routine I/O through the
file system uses the
buffer (disk) cache
Memory-mapped I/O



Memory-mapped I/O uses the same address bus
to address both memory and I/O devices, and
the CPU instructions used to access the
memory are also used for accessing devices.
Port-mapped I/O uses a special class of CPU
instructions specifically for performing I/O.
A device's direct memory access (DMA) is a
memory-to-device communication method, that
bypasses the CPU.
Unified Buffer Cache

A unified buffer cache uses the same page
cache to cache both memory-mapped pages and
ordinary file system I/O
File System Performance (Cont.)


LRU is a reasonable block replacement policy
BUT: if a critical block (such as File Control Block, or
i-node) is read into the cache and modified, but not
re-written to the disk, a crash will leave the file
system in an inconsistent state.

Critical blocks must be written immediately.

Avoiding inconsistency
• Write-through cache: write every modified block to disk as soon as it has been written

UNIX solution
• The system call sync forces all the modified blocks out onto the disk immediately.
• A program, usually called update, is invoked in the background to call sync every 30 seconds.
File System Performance


Block-read-ahead: When reading block k to the
cache in memory, read also block k+1
Reduce disk arm motion through
• Putting blocks that are likely to be accessed in
sequence close to each other
• Disk scheduling algorithms that serve pending
disk access requests in an order that reduces the
delay
Distributed File Sharing


Sharing of files on multi-user systems is
desirable
On distributed systems, files may be shared
across a network
• Manually via programs like FTP
• Automatically, seamlessly using distributed file
systems
• Semi automatically via the world wide web

Network File System (NFS) is a common
distributed file-sharing method
File Sharing – Remote File Systems

Client-server model allows clients to mount remote file
systems from servers
• Server can serve multiple clients
• Client and user-on-client identification is insecure or complicated
• NFS is standard UNIX client-server file sharing protocol
• CIFS is standard Windows protocol
• Standard operating system file calls are translated into
remote calls
Distributed Information Systems (distributed naming
services) such as LDAP, DNS, NIS, Active Directory
implement unified access to information needed for
remote computing
File Sharing – Failure Modes



Remote file systems add new failure modes,
due to network failure, server failure
Recovery from failure can involve state
information about status of each remote
request
Stateless protocols such as NFS include all
information in each request, allowing easy
recovery but less security
File Sharing – Consistency Semantics

Consistency semantics specify how multiple
users are to access a shared file simultaneously
• Similar to process synchronization algorithms
  - Tend to be less complex due to disk I/O and network latency (for remote file systems)
• Andrew File System (AFS) implemented complex remote file sharing semantics
• Unix file system (UFS) implements:
  - Writes to an open file visible immediately to other users of the same open file
  - Sharing file pointer to allow multiple users to read and write concurrently
• AFS has session semantics
  - Writes only visible to sessions starting after the file is closed
The Sun Network File System (NFS)
• An implementation and a specification of a software system for accessing remote files across LANs (or WANs)
• The implementation is part of the Solaris and SunOS operating systems running on Sun workstations using an unreliable datagram protocol (UDP/IP protocol and Ethernet)
NFS (Cont.)
• Interconnected workstations viewed as a set of independent machines with independent file systems, which allows sharing among these file systems in a transparent manner
• A remote directory is mounted over a local file system directory
  - The mounted directory looks like an integral subtree of the local file system, replacing the subtree descending from the local directory
  - Specification of the remote directory for the mount operation is nontransparent; the host name of the remote directory has to be provided
  - Files in the remote directory can then be accessed in a transparent manner
• Subject to access-rights accreditation, potentially any file system (or directory within a file system), can be mounted remotely on top of any local directory

NFS (Cont.)
• NFS is designed to operate in a heterogeneous environment of different machines, operating systems, and network architectures; the NFS specification is independent of these media
• This independence is achieved through the use of RPC primitives built on top of an External Data Representation (XDR) protocol used between two implementation-independent interfaces
• The NFS specification distinguishes between the services provided by a mount mechanism and the actual remote-file-access services
Three Independent File Systems
Mounting in NFS
Mounts - S1:/usr/shared over U:/usr/local/
Cascading mounts - S2:/usr/dir2 over U:/usr/local/dir1
NFS Mount Protocol
• Establishes initial logical connection between server and client
• Mount operation includes name of remote directory to be mounted and name of server machine storing it
  - Mount request is mapped to corresponding RPC and forwarded to mount server running on server machine
  - Export list – specifies local file systems that server exports for mounting, along with names of machines that are permitted to mount them
• Following a mount request that conforms to its export list, the server returns a file handle, a key for further accesses
• File handle – a file-system identifier, and an inode number to identify the mounted directory within the exported file system
• The mount operation changes only the user’s view and does not affect the server side
NFS Protocol
• Provides a set of remote procedure calls for remote file operations. The procedures support the following operations:
  - searching for a file within a directory
  - reading a set of directory entries
  - manipulating links and directories
  - accessing file attributes
  - reading and writing files
• NFS servers are stateless; each request has to provide a full set of arguments (NFS V4 is just becoming available – very different, stateful)
• Modified data must be committed to the server’s disk before results are returned to the client (lose advantages of caching)
• The NFS protocol does not provide concurrency-control mechanisms
Three Major Layers of NFS Architecture
• UNIX file-system interface (based on the open, read, write, and close calls, and file descriptors)
• Virtual File System (VFS) layer – distinguishes local files from remote ones, and local files are further distinguished according to their file-system types
  - The VFS activates file-system-specific operations to handle local requests according to their file-system types
  - Calls the NFS protocol procedures for remote requests
• NFS service layer – bottom layer of the architecture
  - Implements the NFS protocol
Schematic View of NFS Architecture
NFS Path-Name Translation
• Performed by breaking the path into component names and performing a separate NFS lookup call for every pair of component name and directory vnode
• To make lookup faster, a directory name lookup cache on the client’s side holds the vnodes for remote directory names
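Component-by-component translation can be sketched as a loop that issues one lookup per path component. Nothing here is the real NFS client code: translate_path and toy_lookup are hypothetical names, directory handles are plain integers, and toy_lookup simply derives a new handle from its arguments instead of contacting a server.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-in for an NFS lookup RPC: pretends every name exists and
   derives the child's handle from the parent's handle and the name. */
int toy_lookup(int dir, const char *name)
{
    return dir * 31 + (int)strlen(name);
}

/* Split `path` into components and perform one lookup per (directory, name) pair.
   Stores the final handle in *result and returns the number of lookups,
   or -1 if a lookup fails. Uses strtok, so it is not reentrant. */
int translate_path(const char *path, int root,
                   int (*lookup)(int dir, const char *name), int *result)
{
    char copy[256];
    int dir = root, calls = 0;
    assert(path != NULL && result != NULL && strlen(path) < sizeof copy);
    strcpy(copy, path);                       /* strtok modifies its input */
    for (char *name = strtok(copy, "/"); name != NULL; name = strtok(NULL, "/")) {
        dir = lookup(dir, name);              /* one remote call per component */
        calls++;
        if (dir < 0) return -1;
    }
    *result = dir;
    return calls;
}
```

Translating /usr/local/dir1 costs three lookups, one each for usr, local, and dir1. A client-side directory name lookup cache would answer repeated (directory, name) pairs locally and skip the corresponding remote calls.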
NFS Remote Operations
• Nearly one-to-one correspondence between regular UNIX system calls and the NFS protocol RPCs (except opening and closing files)
• NFS adheres to the remote-service paradigm, but employs buffering and caching techniques for the sake of performance
• File-blocks cache – when a file is opened, the kernel checks with the remote server whether to fetch or revalidate the cached attributes
  - Cached file blocks are used only if the corresponding cached attributes are up to date
• File-attribute cache – the attribute cache is updated whenever new attributes arrive from the server
• Clients do not free delayed-write blocks until the server confirms that the data have been written to disk