Week 7 Power Point Slides

File Systems
A collection of directories and files
Many operating systems
support multiple file system
organizations through a virtual
file system (VFS)
• A VFS is an abstraction, which
enables a single system call to
abstract the file system
organization details from the
developer
• The system call provides a middle
layer, which transfers to the
correct low-level object-oriented
interface
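The middle-layer idea can be illustrated with a minimal sketch. It assumes a hypothetical Driver interface and two toy drivers; none of these names come from a real kernel API. One read call is forwarded to whichever driver is mounted at the longest matching path prefix.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: all names here are hypothetical, not a real OS interface.
final class VfsSketch {
    // The operations every concrete file system is assumed to offer.
    interface Driver {
        String read(String path);
    }

    // Two toy file systems standing in for a local disk and a network share.
    static final class LocalDriver implements Driver {
        public String read(String path) { return "local:" + path; }
    }

    static final class RemoteDriver implements Driver {
        public String read(String path) { return "remote:" + path; }
    }

    private final Map<String, Driver> mounts = new HashMap<>();

    void mount(String prefix, Driver driver) {
        mounts.put(prefix, driver);
    }

    // The single entry point: pick the longest mounted prefix and forward the call.
    String read(String path) {
        String best = null;
        for (String prefix : mounts.keySet()) {
            if (path.startsWith(prefix) && (best == null || prefix.length() > best.length())) {
                best = prefix;
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("no file system mounted for " + path);
        }
        return mounts.get(best).read(path.substring(best.length()));
    }

    public static void main(String[] args) {
        VfsSketch vfs = new VfsSketch();
        vfs.mount("/", new LocalDriver());
        vfs.mount("/net/", new RemoteDriver());
        System.out.println(vfs.read("/etc/hosts"));       // prints local:etc/hosts
        System.out.println(vfs.read("/net/share/a.txt")); // prints remote:share/a.txt
    }
}
```

The caller never names a driver; that choice is hidden behind the single read call, which is the point of the abstraction.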
File Record Structure
File: collection of records, Record: collection of fields
• No Structure: A sequence of bytes
• Record structure: Lines of text, Fixed length, variable length
• Complex Record Structures
– Formatted documents with appropriate control characters
– Relocatable load files
– Database table rows
– Combination of binary fields
• Who decides the structure:
– Operating system
– Program
File Control Block (FCB)
OS data structure consisting of information about a file
• Name – human-readable
• Identifier – unique number identifies each file
• Type – most systems support different types
• Location – pointer to file location on device
• Size – current file size
• Protection – access rights and owner
• Time, date, and user identification – data for
protection, security, and usage
• Where is file information maintained? On a
disk resident directory structure
File System
Abstraction of a ‘raw’ partition as
collections of files and directories
• Partition: Contains a file
system on disk, consists of:
– File control blocks (FCB):
Defines a file’s attributes
– Directory/Folder:
A collection of FCBs
– Boot Control Block:
OS load Information
– Partition Control Block:
Information about the partition
File System Operations
A file is an abstract data type with well-defined operations
• Create
• Write and Read
• Reposition within file (Seek)
• Delete or Truncate
• Open – Load the file information from the
directory structure into memory
• Close – update the file information on disk
and release resources
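The operations listed above map onto standard Java calls. The following is a minimal sketch on a temporary file; the class name, method name, and file contents are illustrative assumptions.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class FileOpsSketch {
    // Runs create, open, write, seek, read, truncate, close, and delete on a temp file.
    // Returns the five bytes read after the seek plus the file length after truncation.
    static String demo() {
        try {
            Path p = Files.createTempFile("fileops", ".txt");                     // Create
            try (RandomAccessFile raf = new RandomAccessFile(p.toFile(), "rw")) { // Open
                raf.write("hello world".getBytes(StandardCharsets.US_ASCII));     // Write
                raf.seek(6);                                                      // Reposition (seek)
                byte[] buf = new byte[5];
                raf.readFully(buf);                                               // Read
                raf.setLength(5);                                                 // Truncate to "hello"
                return new String(buf, StandardCharsets.US_ASCII) + "/" + raf.length();
            } finally {
                // try-with-resources has already closed the file at this point.
                Files.deleteIfExists(p);                                          // Delete
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints world/5
    }
}
```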
Open File Information
• File pointer: pointer to last read/write location, per
process that has the file open
• File-open count: Allows removal from the open-file list on the last close
• Pointers: Disk location and a data access cache
• Access rights: per-process access mode information
• Locking Information: mediates access to a file
– Mandatory – access denied based on record locks
– Advisory – processes can inquire lock status
Java File Exclusive Lock
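import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

public class ExclusiveLockDemo {
    // Value for the "shared" argument of lock(): false requests an exclusive lock
    public static final boolean EXCLUSIVE = false;

    public static void main(String[] args) {
        try (RandomAccessFile raf = new RandomAccessFile("file.txt", "rw");
             FileChannel ch = raf.getChannel()) {
            // exclusively lock the first half of the file (position 0, size length/2)
            FileLock exclusive = ch.lock(0, raf.length() / 2, EXCLUSIVE);
            try {
                /* Now modify the data . . . */
            } finally {
                exclusive.release(); // release lock
            }
        } catch (IOException ioe) {
            System.out.println("I didn't like that");
        }
    }
}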
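lock() blocks until the lock is available; it throws
FileLockInterruptionException if the waiting thread is interrupted, or
AsynchronousCloseException if another thread closes the channel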
Java File Shared Lock
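import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

public class SharedLockDemo {
    // Value for the "shared" argument of lock(): true requests a shared lock
    public static final boolean SHARED = true;

    public static void main(String[] args) {
        try (RandomAccessFile raf = new RandomAccessFile("file.txt", "rw");
             FileChannel ch = raf.getChannel()) {
            // Shared lock on the top half; lock() takes (position, size, shared)
            long len = raf.length();
            FileLock shared = ch.lock(len / 2, len - len / 2, SHARED);
            try {
                /* Now read the data . . . */
            } finally {
                shared.release(); // release lock
            }
        } catch (IOException ioe) {
            System.err.println("I didn't like that");
        }
    }
}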
Direct and Sequential Access
• Sequential Access: read, write, append, reset,
rewrite (cannot read previously written records)
• Direct (Random) Access: seek, read, write
Indexed Access
File System Software Structure
• Virtual File System (VFS): wrapper
between applications and different file
systems
• Uniform application view
[Figure: layered approach; layers labeled File Types, Files/Folders, Read/Write]
Directory Structure
Directory: A collection of nodes containing file information
[Figure: a directory whose entries point to the files F1, F2, F3, F4, ..., Fn]
Typical File System Organization
Directory Design
Note: A directory is another abstract data type
• Operations: Search, Create, Delete, List, Rename, Traverse
• Design Criteria
– Efficiency – locating a file quickly
– Naming – convenient to users, aliases, unique fully qualified path names
– Grouping – by extension or properties
– Access control
• Design decisions
– Should sub-directories be removed on a delete operation?
– What kind of path names should be allowed?
– Are absolute and relative paths supported?
Directory Structure
Goals:
(a)Convenient name space
(b)Quick to access and locate
(c)Ability to group related files
Definitions:
Path (absolute, relative)
working directory
Two level: Fails Goal c
Single Level: Fails Goal b and c
Tree Structured
Single and Two Level Directories
• Single level
– Disadvantages: Name conflicts, no sub-folders
• Two level
– Can have the same names for different users
– Efficient searching but no sub-folders
Tree-Structured Directories
• Efficient searching, can group by sub-folders,
Working directory, absolute/relative path names
• Problem to resolve: How should links (aliases) work?
Acyclic-Graph Directories
Cycles can lead to infinite loops
• Problems sharing directories and files
– Aliased names (link)
– Multiple link levels
– Dangling pointers
• Solutions
– A linked list of back pointers
– Lazy detection
– Follow link chains
– Remove data when entry count = 0
General Graph Directory
Issues
Cycle detection algorithms
Garbage collection algorithms
Only allow links to files, not directories
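One common way to detect cycles in a directory graph is a depth-first search that tracks the directories on the current path. The sketch below is one possible approach, not necessarily the algorithm any particular file system uses; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DirGraphSketch {
    // Adjacency list: directory name -> names it links to (children or aliases).
    private final Map<String, List<String>> links = new HashMap<>();

    void link(String from, String to) {
        links.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        links.computeIfAbsent(to, k -> new ArrayList<>());
    }

    // True if following links from 'root' can return to a directory already on the path.
    boolean hasCycle(String root) {
        return dfs(root, new HashSet<>(), new HashSet<>());
    }

    private boolean dfs(String node, Set<String> onPath, Set<String> done) {
        if (onPath.contains(node)) return true;  // back edge: a cycle
        if (done.contains(node)) return false;   // fully explored earlier (shared, not cyclic)
        onPath.add(node);
        for (String next : links.getOrDefault(node, Collections.emptyList())) {
            if (dfs(next, onPath, done)) return true;
        }
        onPath.remove(node);
        done.add(node);
        return false;
    }

    public static void main(String[] args) {
        DirGraphSketch g = new DirGraphSketch();
        g.link("/", "home");
        g.link("home", "alice");
        System.out.println(g.hasCycle("/")); // prints false
        g.link("alice", "home");             // a link back up the tree
        System.out.println(g.hasCycle("/")); // prints true
    }
}
```

A directory reached by two different paths (sharing) is not reported as a cycle; only a link that leads back to an ancestor on the current path is.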
Mount Points
• Definitions
– Mount: Loading a remote
file system for local access
– Mount Point: the path
point where a remote file
system merges with the
local structure
• Top figure: un-mounted
file systems
• Bottom figure: The top
right file system mounted over
the users directory of the file
system of the top left
File Sharing
Files are shared by users locally, and over networks, and grids
• Sharing protection: user and group identifications and access codes
• Client Server Network Models: Network File System (NFS) or CIFS
(Windows Common Internet File System) using remote procedure calls
• Consistency for simultaneous access
• Remote File Transfer: FTP (WinSCP)
• Remote Login: TELNET (PuTTY)
• Issues
– Handling network and server failure.
– Transaction based systems
– Stateless protocol: easy recovery, but less security
– State-based protocols: difficult recovery, better security
– Establishing when updates become visible to other users
Access Control
• File owner/creator controls: what can be done by whom
• Types of access (Read, Write, Execute, Append, Delete, List)
• Mode of access: read, write, execute
• Three classes of users and examples of access rights
a) owner access (u): 7 = 1 1 1 (RWX)
b) group access (g): 6 = 1 1 0 (RW)
c) public access (o): 1 = 0 0 1 (X)
• System administrator creates group names and adds lists of users to it.
• Owner defines access to a particular file (say game) or subdirectory
Command to set access rights to a file:
chmod 761 game (7 = owner (user), 6 = group, 1 = public (other))
Example: chmod u+rwx,g+rw,o+x game
Example: chmod g-x,u-rw game
Example: chmod u=rwx,g=rw,o=x game
Associate file game with group staff: chgrp staff game
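The relation between the octal digits and the R, W, X bits can be checked with a short sketch. The class and method names are illustrative; each digit is decoded as three bits with values 4 (read), 2 (write), and 1 (execute).

```java
final class ModeSketch {
    // Converts a three-digit octal mode such as "761" into a string such as "rwxrw---x".
    static String toSymbolic(String octal) {
        if (!octal.matches("[0-7]{3}")) {
            throw new IllegalArgumentException("expected three octal digits: " + octal);
        }
        StringBuilder sb = new StringBuilder();
        for (char c : octal.toCharArray()) {   // order: owner, group, other
            int bits = c - '0';
            sb.append((bits & 4) != 0 ? 'r' : '-');
            sb.append((bits & 2) != 0 ? 'w' : '-');
            sb.append((bits & 1) != 0 ? 'x' : '-');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toSymbolic("761")); // prints rwxrw---x
    }
}
```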
File System Transient Data in Memory
(a) Opening a file (b) Reading a file
Directory Structure Alternatives
• Simple List: names & disk pointers
– simple to program
– O(n) search time
• Hashed
– O(1) directory search time
– collisions possible
– fixed hash table size
• Other alternatives
– Separate chaining
– Sorted list O(lg n) find; O(n) deletion
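The difference between the simple list and the hashed directory can be sketched as follows. The Entry class and the sample names are assumptions made for illustration; Java's HashMap resolves collisions by chaining within buckets, so it also stands in for the separate-chaining alternative.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DirLookupSketch {
    // A directory entry: file name plus a pointer (here, a start block number).
    static final class Entry {
        final String name;
        final int start;
        Entry(String name, int start) { this.name = name; this.start = start; }
    }

    // Simple list: scan until the name matches, O(n).
    static int linearFind(List<Entry> dir, String name) {
        for (Entry e : dir) {
            if (e.name.equals(name)) return e.start;
        }
        return -1;
    }

    // Hashed directory: expected O(1) lookup.
    static int hashedFind(Map<String, Integer> dir, String name) {
        return dir.getOrDefault(name, -1);
    }

    public static void main(String[] args) {
        List<Entry> list = new ArrayList<>();
        Map<String, Integer> hashed = new HashMap<>();
        String[] names = {"tr", "mail", "list"};
        int[] starts = {14, 19, 28};
        for (int i = 0; i < names.length; i++) {
            list.add(new Entry(names[i], starts[i]));
            hashed.put(names[i], starts[i]);
        }
        System.out.println(linearFind(list, "list"));   // prints 28
        System.out.println(hashedFind(hashed, "list")); // prints 28
    }
}
```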
Directory (simple list structure, contiguous allocation)
File    Start   Length
Count   0       2
tr      14      3
Mail    19      6
list    28      4
f       6       2
Allocating Space for Files
Contiguous allocation
• Each file occupies a set of
contiguous blocks on the disk
• Simple – Only starting block # and
number of blocks are required
• Both random and sequential
access is possible
• External fragmentation (holes)
• Files cannot grow; adjacent space
might be allocated
• Some systems allocate in groups
of blocks (extents or clusters).
Files are linked lists of these
contiguous allocations
Logical to physical translation of record R
Block = start + (R * record size) / block size (integer division)
Offset = (R * record size) % block size
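The translation for contiguous allocation can be worked through in code. This is a minimal sketch; the method name and the sample numbers (start block 14, 100-byte records, 512-byte blocks) are assumptions chosen for illustration.

```java
final class ContiguousSketch {
    // Returns {physical block, offset within that block} for record r (0-based).
    static long[] locate(long start, long r, int recordSize, int blockSize) {
        long byteOffset = r * recordSize;          // position of the record within the file
        return new long[] { start + byteOffset / blockSize, byteOffset % blockSize };
    }

    public static void main(String[] args) {
        // Record 10 starts at byte 1000: one full 512-byte block past the start, 488 bytes in.
        long[] loc = locate(14, 10, 100, 512);
        System.out.println(loc[0] + " " + loc[1]); // prints 15 488
    }
}
```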
Linked Allocation
of File Space
• Files are linked lists of blocks:
blocks may be anywhere
• Simple – The directory only needs
the file's starting block address
• No external fragmentation, but
no random access
• File-allocation table (FAT), used
by MS-DOS and OS/2, keeps the
chain of cluster links in one
table; unused entries mark free
clusters
• Caching reduces disk seeks
Location of record R
Block = located by linked list traversal
Offset = R*record size % block size
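Locating a record under linked allocation means walking the chain. The sketch below models a FAT as an int array in which fat[b] holds the block that follows b; the end-of-chain marker, the names, and the sample chain are illustrative assumptions, not the on-disk FAT format.

```java
import java.util.Arrays;

final class FatSketch {
    static final int END = -1; // assumed end-of-chain marker

    // Returns {physical block, offset within that block} for record r (0-based).
    static long[] locate(int[] fat, int startBlock, long r, int recordSize, int blockSize) {
        long byteOffset = r * recordSize;
        long hops = byteOffset / blockSize;   // logical block index within the file
        int block = startBlock;
        for (long i = 0; i < hops; i++) {     // one table lookup per block skipped
            block = fat[block];
            if (block == END) {
                throw new IllegalArgumentException("record lies past the end of the file");
            }
        }
        return new long[] { block, byteOffset % blockSize };
    }

    public static void main(String[] args) {
        int[] fat = new int[32];
        Arrays.fill(fat, END);
        // Hypothetical file occupying blocks 9 -> 16 -> 1 -> 10 -> 25.
        fat[9] = 16; fat[16] = 1; fat[1] = 10; fat[10] = 25;
        long[] loc = locate(fat, 9, 11, 100, 512);
        System.out.println(loc[0] + " " + loc[1]); // prints 1 76
    }
}
```

The loop makes the cost visible: reaching logical block k takes k lookups, which is why the slide lists no random access for linked allocation and why caching the table helps.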
FAT
Free block count
Indexed Allocation of File Space
• index block contains
block pointers
• Index table must be
maintained and is linked
• Random access possible
• Allows dynamic access
without external
fragmentation
• Index table can be
cached
Location of record R
Block = located by index table lookup
Offset = R*record size % block size
Multi-level Indexed Allocation
of File Space
[Figure labels: Inode; outer-index, index table, file; UNIX (4K bytes per block)]
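A two-level index lookup can be sketched as follows. The 4-byte pointer size and the array-of-arrays model are assumptions made for illustration; a real inode also mixes direct and indirect pointers, which this sketch leaves out.

```java
final class TwoLevelIndexSketch {
    // Largest file reachable through one outer index block.
    static long maxFileBytes(int blockSize, int pointerSize) {
        long pointersPerBlock = blockSize / pointerSize;
        return pointersPerBlock * pointersPerBlock * blockSize;
    }

    // outer[i] models the i-th index table; each table holds pointersPerBlock block numbers.
    static int dataBlock(int[][] outer, int pointersPerBlock, long logicalBlock) {
        int table = (int) (logicalBlock / pointersPerBlock); // which index table
        int slot = (int) (logicalBlock % pointersPerBlock);  // which entry in that table
        return outer[table][slot];
    }

    public static void main(String[] args) {
        // 4K blocks with assumed 4-byte pointers: 1024 * 1024 * 4096 bytes.
        System.out.println(maxFileBytes(4096, 4)); // prints 4294967296
        int[][] outer = { {10, 11, 12, 13}, {20, 21, 22, 23} };
        System.out.println(dataBlock(outer, 4, 5)); // prints 21
    }
}
```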
Management of Free Space
• Bit vector (bit per block; 0=free)
– Extra space needed
– Example:
block size = 4096 bytes = 2^12
disk size = 1 gigabyte = 2^30
space = 2^30 / (2^12 * 2^3) = 2^15 bytes = 32 KB
– Easy to find groups of free blocks
• Linked list (free list)
– Finding contiguous space hard
– No waste of space
• Grouping: separate lists ordered
by contiguous block size
• Counting: A Linked list contains
block #s + a count of adjacent free
blocks
• Issue: Maintaining consistency
between memory structures and
those on disk
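The bit-vector approach can be sketched with java.util.BitSet. In this sketch a set bit means allocated, so a clear bit means free, matching the 0=free convention above; the class and method names are illustrative assumptions.

```java
import java.util.BitSet;

final class FreeSpaceSketch {
    // Bytes needed for a bit vector holding one bit per block.
    static long bitmapBytes(long diskBytes, long blockBytes) {
        return diskBytes / blockBytes / 8;
    }

    // Start of the first run of n consecutive free blocks, or -1 if there is none.
    static int findFreeRun(BitSet allocated, int totalBlocks, int n) {
        int start = allocated.nextClearBit(0);
        while (start + n <= totalBlocks) {
            int nextUsed = allocated.nextSetBit(start); // -1 when nothing further is allocated
            if (nextUsed < 0 || nextUsed - start >= n) {
                return start;
            }
            start = allocated.nextClearBit(nextUsed);   // skip past the allocated stretch
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(bitmapBytes(1L << 30, 4096)); // prints 32768
        BitSet allocated = new BitSet();
        allocated.set(0, 2);  // blocks 0-1 in use
        allocated.set(4, 7);  // blocks 4-6 in use
        System.out.println(findFreeRun(allocated, 16, 2)); // prints 2
        System.out.println(findFreeRun(allocated, 16, 3)); // prints 7
    }
}
```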
Efficiency
• Efficiency dependent on:
– Allocate and access
algorithms
– FCB’s and directory content
– Caching
• Caching
– By Buffer: cache disk blocks in
separate section of memory
– By Page: cache pages using
virtual memory techniques.
(Memory-mapped I/O)
• Algorithm Optimizations
– Use free-behind (release
previously read blocks) and
read-ahead replacement to
optimize sequential access
– Dedicate section of memory as
a virtual disk (RAM disk).
Various Disk-Caching Locations
Unified and Non-unified Buffer Cache
• Page cache: holds pages,
rather than disk blocks
• Buffered Cache: holds
recently used disk blocks
• Unified Buffer Cache:
Same cache for both file I/O
and Memory Mapped Files
• Non Unified Buffer Cache:
Separate page cache for
Memory Mapped Files and
for file I/O. Requires extra
copying
[Figures: unified buffer cache; non-unified buffer cache]
Reliability
• Consistent back up procedures
– System programs perform full or incremental back ups
– Data recovery recovers any lost data from a back up device
• Consistency checking on reboot
– Inconsistent directory/block allocations automatically repaired
• Log structured (or journaling) to minimize seeks
– Write file system operations to a transaction on a circular
buffer (or log).
– Transaction committed after log write operations complete
– A background task processes log transactions
• Asynchronously updates the file system
• Deletes appropriate log records after the update completes
– After a crash, the system finishes any partial operations
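The log, commit, and replay steps can be sketched in memory. This is a simplified model under stated assumptions: a list stands in for the on-disk circular log, a map stands in for the file system blocks, and uncommitted records are simply discarded at replay. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class JournalSketch {
    // One log record: a block write belonging to a transaction, or its commit marker.
    static final class Record {
        final int txId;
        final boolean commit;
        final int block;
        final String data;
        Record(int txId, boolean commit, int block, String data) {
            this.txId = txId; this.commit = commit; this.block = block; this.data = data;
        }
    }

    final List<Record> log = new ArrayList<>();         // stands in for the on-disk log
    final Map<Integer, String> disk = new HashMap<>();  // stands in for file system blocks

    void logWrite(int txId, int block, String data) {
        log.add(new Record(txId, false, block, data));
    }

    void commit(int txId) {
        log.add(new Record(txId, true, -1, null));
    }

    // Background task or crash recovery: apply committed writes in log order, then drop the log.
    void replay() {
        Set<Integer> committed = new HashSet<>();
        for (Record r : log) {
            if (r.commit) committed.add(r.txId);
        }
        for (Record r : log) {
            if (!r.commit && committed.contains(r.txId)) disk.put(r.block, r.data);
        }
        log.clear(); // assumption: writes without a commit marker are discarded
    }

    public static void main(String[] args) {
        JournalSketch j = new JournalSketch();
        j.logWrite(1, 5, "A");
        j.commit(1);
        j.logWrite(2, 6, "B"); // simulated crash before transaction 2 commits
        j.replay();
        System.out.println(j.disk); // prints {5=A}
    }
}
```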
The Sun Network File System (NFS)
Software specification for accessing remote files across LAN or WAN
• Networked system view: independent and heterogeneous
• Sharing of file systems: transparent to users
• Mount operations:
– require specifying the host IP address
– Remote directories are mounted over any local file system directory;
they hide the directories and subdirectories over which they mount
– Cascading mounts: locally mount over other mounted file systems. Users
do not get access to subdirectories remotely mounted over remote
directories
• Implementation:
– Remote Procedure calls (RPC) & External Data Representation (XDR) protocol
– Servers are stateless but maintain client lists for server shutdowns
NFS Mounting
Purpose: Establish connections
Mount operation
• usr/shared mounts over usr/local
• User loses access to local
Cascaded mount operation
• usr/dir2 mounts over usr/local/dir1
• Now dir2 hides dir1
Three independent file systems
Pseudo code
Establish connection with server
Request name of remote directory to mount
Server returns file handle, containing file-system identifier/inode number
User view changes and the remote file system becomes available
Remote procedure calls for file/directory operations available
NFS Protocol
NFS servers
• Uses buffering (server side)
and caching (client side). The
local kernel checks if the local
cache is up to date
• All operations are
synchronous
• Utilizes RPC calls
• 1-1 API with UNIX system
calls (except open, close)
• NO concurrency-control
• Requests are stateless, with a
full set of arguments
NFS Path-Name Translation
• Performed by breaking the full path into path
component names and performing a separate
NFS lookup call for every pair: path component
name and directory virtual node (vnode)
• To make lookup faster, a directory name lookup
cache on the client’s side holds the vnodes for
remote directory names
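Component-by-component translation with a client-side name cache can be sketched as follows. The server is modeled as an in-memory map and handles are plain integers; these are illustrative assumptions, not the NFS wire protocol.

```java
import java.util.HashMap;
import java.util.Map;

final class NfsLookupSketch {
    // Stand-in for the server: (directory handle, component name) -> child handle.
    private final Map<String, Integer> server = new HashMap<>();
    // Client-side directory name lookup cache, keyed the same way.
    private final Map<String, Integer> cache = new HashMap<>();
    int remoteCalls = 0; // counts simulated lookup calls sent to the server

    void export(int dirHandle, String name, int childHandle) {
        server.put(dirHandle + "/" + name, childHandle);
    }

    private int lookup(int dirHandle, String name) {
        String key = dirHandle + "/" + name;
        Integer hit = cache.get(key);
        if (hit != null) return hit;        // served from the cache, no remote call
        remoteCalls++;                      // one lookup per (directory, component) pair
        Integer child = server.get(key);
        if (child == null) throw new IllegalArgumentException("no such entry: " + name);
        cache.put(key, child);
        return child;
    }

    // Breaks the path into components and resolves them one at a time.
    int resolve(int rootHandle, String path) {
        int current = rootHandle;
        for (String part : path.split("/")) {
            if (part.isEmpty()) continue;   // skip the empty piece before a leading slash
            current = lookup(current, part);
        }
        return current;
    }

    public static void main(String[] args) {
        NfsLookupSketch client = new NfsLookupSketch();
        client.export(1, "usr", 2);
        client.export(2, "local", 3);
        client.export(3, "dir1", 4);
        System.out.println(client.resolve(1, "/usr/local/dir1")); // prints 4
        System.out.println(client.remoteCalls);                   // prints 3
        client.resolve(1, "/usr/local");                          // both components cached
        System.out.println(client.remoteCalls);                   // prints 3
    }
}
```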