Outline • Distributed File Systems – continued 5/29/2016 COP5611

advertisement
Outline
• Distributed File Systems – continued
5/29/2016
COP5611
1
Distributed File Systems
• A distributed file system is a resource
management component in a distributed
operating systems
– It implements a common file system shared by
all the computers in the systems
• Two important goals
– Network transparency
– High availability
5/29/2016
COP5611
2
Architecture
5/29/2016
COP5611
3
Architecture – cont.
• Normally for performance reasons distributed
file systems are organized as a client-server
architecture
– File servers store files and perform storage and
retrieval upon client’s requests
– Two most important parts are
• Name server
• Cache manager
5/29/2016
COP5611
4
Architecture – cont.
5/29/2016
COP5611
5
Architecture – cont.
5/29/2016
COP5611
6
Mounting
• Mounting is a way to bind together different
file systems to form a single hierarchical
structured name space
– It is widely used in both local and distributed
UNIX machines
– In distributed file systems, file systems
maintained by remote servers are mounted at the
clients
5/29/2016
COP5611
7
Mounting – cont.
5/29/2016
COP5611
8
Mounting – cont.
5/29/2016
COP5611
9
Mounting – cont.
5/29/2016
COP5611
10
Mounting – cont.
5/29/2016
COP5611
11
Automounting
5/29/2016
COP5611
- cont.
12
Automounting – cont.
5/29/2016
COP5611
13
Caching
• Caching is commonly used in distributed file
systems to reduce delays in accessing the
data
– In file caching, a copy of the data stored at a
remote file server is brought to the client,
reducing access delays due to network latency
– The effectiveness of caching is based on the
temporal locality in programs
– Files can also be cached at the server side
5/29/2016
COP5611
14
Client Caching
5/29/2016
COP5611
15
Client Caching – cont.
5/29/2016
COP5611
16
Cache Consistency
5/29/2016
COP5611
17
Hints
• An alternative approach to caching
– The cached data is treated as hints
– The cached data is not guaranteed to be
completely accurate
• The cache consistency issue is ignored in this
implementation
– This is useful for applications which can recover
from invalid cached data
5/29/2016
COP5611
18
Bulk Data Transfer
• Bulk data transfer is to transfer multiple data
blocks instead of just the block being
referenced by the client
– Temporal locality and the fact that most files are
accessed in their entirety
– Reduce the network communication overhead by
reducing the cost of executing communication
protocols
5/29/2016
COP5611
19
Security
5/29/2016
COP5611
20
Naming in Distributed File Systems
• A name in file systems is a way to reference a
file or a directory
• Name resolution refers to the process of
mapping a name to an object (or in the case
of replication, to multiple objects)
• A name space is a collection of names
5/29/2016
COP5611
21
Naming in a Local File System
5/29/2016
COP5611
22
Naming in a Local File System – cont.
5/29/2016
COP5611
23
Naming in Distributed File Systems – cont.
• Three approaches to naming in distributed
file systems
– The simplest scheme is to concatenate the host
name to the names of files
• Not network transparent
• Not location-independent
– Mounting remote directories to local directories
• Location transparent but not network transparent
– A single global directory
• Limited to a few cooperating computers
5/29/2016
COP5611
24
Naming in Distributed File Systems – cont.
• Context
– Content can be used to partition a file name space
– Here a filename consists of a context and a name
local to the context
– Name resolution involves interpreting the name
within a context, which may invoke other contexts
recursively
5/29/2016
COP5611
25
Naming in Distributed File Systems – cont.
• Name Servers are responsible for name
resolution in distributed file systems
– A name server is a process that maps names
specified by clients to stored objects such as files
and directories
– A single name server vs. multiple name servers
5/29/2016
COP5611
26
Caches on Disk or Memory
• Cache in main memory vs. cache on a local
disk
– Cache in main memory
• Advantages
• Disadvantages
– Cache on a local disk
• Advantages
• Disadvantages
5/29/2016
COP5611
27
Writing Policy
• This is related to the cache consistency
– It decides what to do when a cache block at the
client is modified
– Several different policies
• Write-through
• Delayed writing policy for some time
– Delayed writing policy when the file is closed
5/29/2016
COP5611
28
Cache Consistency
• Schemes to guarantee consistency
– Server-initiated approach
• Servers inform the cache managers whenever the data
in client caches become stale
• Cache managers can retrieve the new data when
needed
– Client-initiated approach
• Cache managers validate data with the server before
returning it to the clients
– Limited caching
5/29/2016
COP5611
29
Availability
• Availability is an important issue in
distributed file systems
– Replication is the primary mechanism for
enhancing the availability of files in distributed
file systems
• Replication
– Unit of replication
– Replica management
5/29/2016
COP5611
30
Scalability
• Scalability deals with the suitability of the
design to support more clients
–
–
–
–
Caching helps reduce the client response time
Server-initiated cache invalidation
Some clients can be used as servers
The structure of the server process also plays a
major role in scalability
5/29/2016
COP5611
31
Semantics
• Semantics of a file system characterize the
effects of accesses on files
– For example, a read operation should return the
data (stored) due to the latest write operation
– Guaranteeing the semantics when employing
caching, is difficult and expensive
5/29/2016
COP5611
32
Case Studies
• There are distributed file systems that have been
developed
–
–
–
–
–
–
–
–
Architecture
Communication
Processes and their organization
Naming
Consistency
Caching and replication
Fault tolerance
Security
5/29/2016
COP5611
33
Case Studies – cont.
• These examples are taken from Tanenbaum
and van Steen’s book
– It does not follow the main textbook because
• The material there is not up-to-date
• It did not provide enough details
• You need to read Sect. 9.5
5/29/2016
COP5611
34
Sun Network File Systems
• Developed by Sun
–
–
–
–
The first version was developed and was kept to Sun
The second version was incorporated in SunOS 2.0
Version 3 was released around 1994
Version 4 is under development to make the NFS a
true wide-area file system across the Internet
5/29/2016
COP5611
35
Sun Network File Systems – cont.
• NFS is not a true file system
– It is a collection of protocols that together
provide clients with a model of a distributed files
system
• In some sense, it is a middle ware (like CORBA)
– NFS protocols are designed in such a way that
different implementations should easily
interoperate
• It can run on a heterogeneous collection of computers
5/29/2016
COP5611
36
NFS Architecture (1)
a)
b)
The remote access model.
The upload/download model
5/29/2016
COP5611
37
NFS Architecture (2)
5/29/2016
COP5611
38
File System Model
Operation
v3
v4
Description
Create
Yes
No
Create a regular file
Create
No
Yes
Create a nonregular file
Link
Yes
Yes
Create a hard link to a file
Symlink
Yes
No
Create a symbolic link to a file
Mkdir
Yes
No
Create a subdirectory in a given directory
Mknod
Yes
No
Create a special file
Rename
Yes
Yes
Change the name of a file
Rmdir
Yes
No
Remove an empty subdirectory from a directory
Open
No
Yes
Open a file
Close
No
Yes
Close a file
Lookup
Yes
Yes
Look up a file by means of a file name
Readdir
Yes
Yes
Read the entries in a directory
Readlink
Yes
Yes
Read the path name stored in a symbolic link
Getattr
Yes
Yes
Read the attribute values for a file
Setattr
Yes
Yes
Set one or more attribute values for a file
Read
Yes
Yes
Read the data contained in a file
Write
Yes
Yes
Write data to a file
5/29/2016
COP5611
39
Communication
a)
b)
Reading data from a file in NFS version 3.
Reading data using a compound procedure in version 4.
5/29/2016
COP5611
40
Processes
• NFS is a traditional client-server system
– In versions 2 and 3, the NFS servers are stateless
• in that they are not required to maintain any client
state
• The main advantage of the stateless approach is
simplicity
– However, some of operations are intrinsically stateful and
they are implemented in a separate lock manager
– Version 4 uses a stateful approach
5/29/2016
COP5611
41
Naming (1)
5/29/2016
COP5611
42
Naming (2)
5/29/2016
COP5611
43
File Attributes (1)
Attribute
Description
TYPE
The type of the file (regular, directory, symbolic link)
SIZE
The length of the file in bytes
CHANGE
Indicator for a client to see if and/or when the file has
changed
FSID
Server-unique identifier of the file's file system
5/29/2016
COP5611
44
File Attributes (2)
Attribute
Description
ACL
an access control list associated with the file
FILEHANDLE
The server-provided file handle of this file
FILEID
A file-system unique identifier for this file
FS_LOCATIONS
Locations in the network where this file system may be found
OWNER
The character-string name of the file's owner
TIME_ACCESS
Time when the file data were last accessed
TIME_MODIFY
Time when the file data were last modified
TIME_CREATE
Time when the file was created
5/29/2016
COP5611
45
Semantics of File Sharing (1)
a)
b)
On a single processor, when a read
follows a write, the value returned by the
read is the value just written.
In a distributed system with caching,
obsolete values may be returned.
5/29/2016
COP5611
46
Semantics of File Sharing (2)
Method
Comment
UNIX semantics
Every operation on a file is instantly visible to all processes
Session semantics
No changes are visible to other processes until the file is closed
Immutable files
No updates are possible; simplifies sharing and replication
Transaction
All changes occur atomically
5/29/2016
COP5611
47
File Locking in NFS (1)
Operation
Description
Lock
Creates a lock for a range of bytes
Lockt
Test whether a conflicting lock has been granted
Locku
Remove a lock from a range of bytes
Renew
Renew the leas on a specified lock
5/29/2016
COP5611
48
File Locking in NFS (2)
Current file denial state
Request
access
NONE
READ
WRITE
BOTH
READ
Succeed
Fail
Succeed
Succeed
WRITE
Succeed
Succeed
Fail
Succeed
BOTH
Succeed
Succeed
Succeed
Fail
(a)
Requested file denial state
Current
access
state
NONE
READ
WRITE
BOTH
READ
Succeed
Fail
Succeed
Succeed
WRITE
Succeed
Succeed
Fail
Succeed
BOTH
Succeed
Succeed
Succeed
Fail
(b)
5/29/2016
COP5611
49
Client Caching (1)
5/29/2016
COP5611
50
Client Caching (2)
5/29/2016
COP5611
51
RPC Failures
5/29/2016
COP5611
52
Security
5/29/2016
COP5611
53
Secure RPCs
5/29/2016
COP5611
54
Access Control
Operation
Description
Read_data
Permission to read the data contained in a file
Write_data
Permission to to modify a file's data
Append_data
Permission to to append data to a file
Execute
Permission to to execute a file
List_directory
Permission to to list the contents of a directory
Add_file
Permission to to add a new file t5o a directory
Add_subdirectory
Permission to to create a subdirectory to a directory
Delete
Permission to to delete a file
Delete_child
Permission to to delete a file or directory within a directory
Read_acl
Permission to to read the ACL
Write_acl
Permission to to write the ACL
Read_attributes
The ability to read the other basic attributes of a file
Write_attributes
Permission to to change the other basic attributes of a file
Read_named_attrs
Permission to to read the named attributes of a file
Write_named_attrs
Permission to to write the named attributes of a file
Write_owner
Permission to to change the owner
Synchronize
Permission to to access a file locally at the server with synchronous reads and writes
5/29/2016
COP5611
55
Sun Network File Systems – cont.
• Setting a Network File System on Unix systems
– NFS software must have installed
– Server side
• Start NFS daemons
• Export directories
– /etc/exports
– Client side
• /etc/fstab
• mount -a
5/29/2016
COP5611
56
Coda Distributed File System
• Developed at CMU
– Integrated now with a number of UNIX systems
– Based on the Andrew File System (AFS)
• Design goals
–
–
–
–
Scalable
Secure
Highly available
High degree of naming and location transparency
• So that the system would appear to its users as a pure
local file system
5/29/2016
COP5611
57
The Coda File System
Type of user
Description
Owner
The owner of a file
Group
The group of users associated with a file
Everyone
Any user of a process
Interactive
Any process accessing the file from an interactive terminal
Network
Any process accessing the file via the network
Dialup
Any process accessing the file through a dialup connection
to the server
Batch
Any process accessing the file as part of a batch job
Anonymous
Anyone accessing the file without authentication
Authenticated
Any authenticated user of a process
Service
Any system-defined service process
5/29/2016
COP5611
58
Overview of Coda (1)
5/29/2016
COP5611
59
Overview of Coda (2)
5/29/2016
COP5611
60
Communication (1)
5/29/2016
COP5611
61
Communication (2)
a)
b)
Sending an invalidation message one at a time.
Sending invalidation messages in parallel.
5/29/2016
COP5611
62
Naming
5/29/2016
COP5611
63
File Identifiers
5/29/2016
COP5611
64
Sharing Files in Coda
5/29/2016
COP5611
65
Transactional Semantics
File-associated data
Read?
Modified?
File identifier
Yes
No
Access rights
Yes
No
Last modification time
Yes
Yes
File length
Yes
Yes
File contents
Yes
Yes
• The metadata read and modified for a store session type in Coda.
5/29/2016
COP5611
66
Client Caching
5/29/2016
COP5611
67
Server Replication
5/29/2016
COP5611
68
Disconnected Operation
5/29/2016
COP5611
69
Secure Channels (1)
• Mutual authentication in RPC2.
5/29/2016
COP5611
70
Secure Channels (2)
• Setting up a secure channel between a (Venus) client and a
Vice server in Coda.
5/29/2016
COP5611
71
Access Control
Operation
Description
Read
Read any file in the directory
Write
Modify any file in the directory
Lookup
Look up the status of any file
Insert
Add a new file to the directory
Delete
Delete an existing file
Administer
Modify the ACL of the directory
• Classification of file and directory operations recognized by
Coda with respect to access control.
5/29/2016
COP5611
72
Plan 9: Resources Unified to Files
5/29/2016
COP5611
73
Communication
File
Description
ctl
Used to write protocol-specific control commands
data
Used to read and write data
listen
Used to accept incoming connection setup requests
local
Provides information on the caller's side of the connection
remote
Provides information on the other side of the connection
status
Provides diagnostic information on the current status of the connection
• Files associated with a single TCP connection in Plan 9.
5/29/2016
COP5611
74
Processes
5/29/2016
COP5611
75
Naming
5/29/2016
COP5611
76
Overview of xFS.
5/29/2016
COP5611
77
Processes (1)
5/29/2016
COP5611
78
Processes (2)
5/29/2016
COP5611
79
Naming
• Main data structures used in xFS.
Data structure
Description
Manager map
Maps file ID to manager
Imap
Maps file ID to log address of file's inode
Inode
Maps block number (i.e., offset) to log address of block
File identifier
Reference used to index into manager map
File directory
Maps a file name to a file identifier
Log addresses
Triplet of stripe group, ID, segment ID, and segment offset
Stripe group map
Maps stripe group ID to list of storage servers
5/29/2016
COP5611
80
Overview of SFS
• The organization of SFS.
5/29/2016
COP5611
81
Naming
/sfs
LOC
HID
Pathname
/sfs/sfs.vu.sc.nl:ag62hty4wior450hdh63u623i4f0kqere/home/steen/mbox
• A self-certifying pathname in SFS.
5/29/2016
COP5611
82
Summary
•
Issue
NFS
Coda
Plan 9
xFS
SFS
Design goals
Access transparency
High availability
Uniformity
Serverless system
Scalable security
Access model
Remote
Up/Download
Remote
Log-based
Remote
Communication
RPC
RPC
Special
Active msgs
RPC
Client process
Thin/Fat
Fat
Thin
Fat
Medium
Server groups
No
Yes
No
Yes
No
Mount granularity
Directory
File system
File system
File system
Directory
Name space
Per client
Global
Per process
Global
Global
File ID scope
File server
Global
Server
Global
File system
Sharing sem.
Session
Transactional
UNIX
UNIX
N/S
Cache consist.
write-back
write-back
write-through
write-back
write-back
Replication
Minimal
ROWA
None
Striping
None
Fault tolerance
Reliable comm.
Replication and
caching
Reliable comm.
Striping
Reliable comm.
Recovery
Client-based
Reintegration
N/S
Checkpoint & write
logs
N/S
Secure channels
Existing
mechanisms
Needham-Schroeder
Needham-Schroeder
No pathnames
Self-cert.
Access control
Many operations
Directory operations
UNIX based
UNIX based
NFS BASED
A comparison between NFS, Coda, Plan 9, xFS. N/S indicates that nothing has been specified.
5/29/2016
COP5611
83
Download