Ke Wang
CS614 – Advanced System
Apr 24, 2001
Key requirements of a distributed system
Scalability from small to large networks
Fast and transparent access to a geographically distributed file system (DFS)
Information protection
Ease of administration
Wide support from a variety of vendors
DFS -- a distributed implementation of a file system, where multiple users share files and storage resources.
Overall storage space managed by a DFS is composed of different, remotely located, smaller storage spaces
There is usually a correspondence between constituent storage spaces and sets of files
Service - a software entity providing a particular type of function to clients
Server - service software running on a single machine
Client - process that can invoke a service using a set of operations that forms its client interface
Retaining most recently accessed disk blocks.
Repeated accesses to a block in cache can be handled without involving the disk.
Advantages
- Reduce delays
- Reduce contention for disk arm
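A minimal sketch of such a block cache, assuming least-recently-used eviction (the slides only say "most recently accessed"). The class name `BlockCache` and the `read_from_disk` callable are hypothetical, introduced for illustration only.

```python
from collections import OrderedDict


class BlockCache:
    """Keeps the most recently accessed disk blocks in memory (LRU eviction)."""

    def __init__(self, capacity, read_from_disk):
        self.capacity = capacity
        self.read_from_disk = read_from_disk  # hypothetical: block number -> bytes
        self.blocks = OrderedDict()           # oldest entry first
        self.disk_reads = 0

    def read(self, block_no):
        if block_no in self.blocks:
            self.blocks.move_to_end(block_no)  # mark as most recently used
            return self.blocks[block_no]
        self.disk_reads += 1                   # cache miss: the disk is involved
        data = self.read_from_disk(block_no)
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)    # evict the least recently used block
        return data


cache = BlockCache(capacity=2, read_from_disk=lambda n: b"block-%d" % n)
for block_no in [1, 2, 1, 1, 3, 1]:
    cache.read(block_no)
print(cache.disk_reads)  # six accesses, but only the misses reach the disk
```

Repeated reads of block 1 are served from memory, which is where the reduced delay and reduced disk-arm contention come from.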
Advantages
Reduce network traffic
Reduce server contention
Problems
Cache-consistency
Stuff to consider
Cache location (disk vs. memory)
Cache Placement (client vs. server)
Cache structure (block vs. file)
Stateful vs. Stateless server
Cache update policies
Consistency
Client-driven vs. Server-driven protocols
Practical Distributed System
NFS: Sun’s Network File System
AFS: Andrew File System (CMU)
Sprite FS: File System for the Sprite OS (UC Berkeley)
Sun’s Network File System(NFS)
Originally released in 1985
Built on top of UDP, an unreliable datagram protocol (TCP is used now)
Client-server model
Andrew File System (AFS)
Developed at CMU since 1983
Client-server model
Key software: Vice and Venus
Goal : high scalability (5,000-10,000 nodes)
VICE is a multi-threaded server process with each thread handling a single client request
VENUS is the client process that runs on each workstation and forms the interface with VICE
User-level processes
One process for one client
Clients cache files
Verify timestamp on every open
-> a lot of interaction with server
-> heavy network traffic
Improving the prototype
Reduce cache validity check
Reduce server processes
Reduce network traffic
Higher scalability!
Sprite File System
Designed for networked workstations with large physical memories
(can be diskless)
Expect memory of 100-500 MB
Goal : high performance
When a process makes a file access, it is presented first to the cache (file traffic). If not satisfied, the request is passed either to a local disk, if the file is stored locally (disk traffic), or to the server where the file is stored (server traffic). Servers also maintain caches to reduce disk traffic.
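The lookup order can be sketched as follows: try the client cache first, then either the local disk or the server. This is only an illustration; the function `read_block`, the dictionaries standing in for the cache, disk and server, and the `counters` dictionary are all assumptions, not Sprite code.

```python
def read_block(block, client_cache, local_disk, server, counters):
    """Serve one block read, counting file, disk and server traffic."""
    counters["file"] += 1            # every access is first presented to the cache
    if block in client_cache:
        return client_cache[block]
    if block in local_disk:          # file stored locally
        counters["disk"] += 1
        data = local_disk[block]
    else:                            # file stored on a server
        counters["server"] += 1
        data = server[block]
    client_cache[block] = data
    return data


counters = {"file": 0, "disk": 0, "server": 0}
client_cache = {}
local_disk = {"a": b"A"}   # hypothetical locally stored block
server = {"b": b"B"}       # hypothetical remotely stored block
for name in ["a", "b", "a", "b"]:
    read_block(name, client_cache, local_disk, server, counters)
print(counters)
```

The second access to each block is absorbed by the client cache, so only file traffic grows.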
Two unusual aspects
Guarantees a completely consistent view
Concurrent write sharing
Sequential write sharing
Cache size varies dynamically
Cache Location
Disk vs. Main Memory
Advantages of disk caches
More Reliable
Cached data are still there during recovery and don’t need to be fetched again
Advantages of main-memory caches:
Permit workstations to be diskless
Faster access
Server caches (used to speed up disk I/O) are always in main memory; using main-memory caches on the clients permits a single caching mechanism for servers and users
Cache Placement
Client vs. Server
Client caches reduce network traffic
Read-only operations on unchanged files do not need to go over the network
Server caches reduce server load
Cache is amortized across all clients (but needs to be bigger to be effective)
In practice, need BOTH!
Block basis
Simple
Sprite FS, NFS
File basis
Reduce interaction with servers
AFS
Cannot access files larger than cache
NFS: client memory (disk), block basis
AFS: client disk, file basis
Sprite FS: client memory, server memory, block basis
Stateful – Servers hold information about the client
Stateless – Servers maintain no state information about clients
Mechanism
Client opens a file
Server fetches information about the file from its disk, stores it in memory, and gives the client a unique connection id for the open file; this id is used for subsequent accesses until the session ends
Advantages:
Fewer disk accesses
Read-ahead possible
RPCs are small, containing only an id
File may be cached entirely on client, invalidated by the server if there is a conflicting write
Disadvantage:
Server loses all its volatile state in crash
Restore state by dialog with clients, or abort operations that were underway when the crash occurred
Server needs to be aware of client failures
Each request must be self-contained
Each request identifies the file and position in the file
No need to establish and terminate a connection by open and close operations
Advantage
A file server crash does not affect clients
Simple
Disadvantage
Impossible to enforce consistency
Each RPC must carry all the state it needs, so requests are longer
AFS and Sprite FS are stateful
Sprite FS servers keep track of which clients have which files open
AFS servers keep track of the contents of clients’ caches
NFS is stateless
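The difference in request shape can be sketched as below. This is an illustration under assumptions, not the NFS, AFS or Sprite wire protocol: `FILES`, `StatelessServer` and `StatefulServer` are hypothetical names.

```python
FILES = {"/notes.txt": b"distributed file systems"}  # hypothetical server-side store


class StatelessServer:
    """Every request is self-contained: it names the file and the position."""

    def read(self, path, offset, count):
        return FILES[path][offset:offset + count]


class StatefulServer:
    """open() creates per-client state; later requests carry only a small id."""

    def __init__(self):
        self.open_files = {}  # id -> [path, current offset]; volatile, lost in a crash
        self.next_id = 0

    def open(self, path):
        self.next_id += 1
        self.open_files[self.next_id] = [path, 0]
        return self.next_id

    def read(self, file_id, count):
        entry = self.open_files[file_id]
        path, offset = entry
        entry[1] = offset + count  # server remembers the position for the client
        return FILES[path][offset:offset + count]

    def close(self, file_id):
        del self.open_files[file_id]


stateless = StatelessServer()
first = stateless.read("/notes.txt", 0, 11)
second = stateless.read("/notes.txt", 12, 4)

stateful = StatefulServer()
fid = stateful.open("/notes.txt")
first_sf = stateful.read(fid, 11)
```

If `stateful.open_files` were wiped by a crash, the next `read(fid, ...)` would fail until the state is rebuilt, whereas the stateless calls would work unchanged after a restart.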
Write-through
Delayed-write
Write-on-close (variation of delayed-write)
Write-through – all writes are propagated to stable storage immediately
Reliable, but poor performance
Delayed-write – modifications are written to the cache and then written through to the server later
Write-on-close – modifications are written back to the server when the file is closed
Reduces intermediate read and write traffic while file is open
Pros for delayed-write/write-on-close
Lots of files have lifetimes of less than 30s
Redundant writes are absorbed
Lots of small writes can be batched into larger writes
Disadvantage:
Poor reliability; unwritten data may be lost when a client crashes
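A small sketch of why write-on-close absorbs redundant writes, compared with write-through. The server is modelled as a plain list that records every write reaching it; the class names and the `run` helper are hypothetical.

```python
class WriteThroughFile:
    """Every write is propagated to the server immediately."""

    def __init__(self, server_log):
        self.server_log = server_log

    def write(self, block_no, data):
        self.server_log.append((block_no, data))

    def close(self):
        pass


class WriteOnCloseFile:
    """Writes stay in the client cache until the file is closed."""

    def __init__(self, server_log):
        self.server_log = server_log
        self.dirty = {}  # block number -> latest data

    def write(self, block_no, data):
        self.dirty[block_no] = data  # a later write to the same block replaces the earlier one

    def close(self):
        for block_no, data in sorted(self.dirty.items()):
            self.server_log.append((block_no, data))
        self.dirty.clear()


def run(file_cls):
    server_log = []
    f = file_cls(server_log)
    for data in (b"v1", b"v2", b"v3"):
        f.write(0, data)  # three writes to the same block
    f.write(1, b"x")
    f.close()
    return server_log


through_log = run(WriteThroughFile)
on_close_log = run(WriteOnCloseFile)
print(len(through_log), len(on_close_log))
```

If the client crashed before `close()`, everything in `dirty` would be lost, which is the reliability cost noted above.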
Key to Andrew’s scalability
Clients cache entire files on local disk
Write-on-close
Server load and network traffic reduced
Contacts server only on open and close
Retain across reboots
Require local disk, large enough
NFS and Sprite: delayed-write
Delay 30 seconds
AFS: write-on-close
Reduce traffic to server dramatically
Good scalability of AFS
Is locally cached copy of data consistent with the master copy?
Is there danger of “stale” data?
Permit concurrent write sharing?
Concurrent Write Share
A file open on multiple clients
At least one client writes
Server detects
Require write back to server
Invalidate open cache
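One way the server-side detection could look, as a sketch only: the server records each open and, once a file is open on more than one client with at least one writer, tells the caller not to cache and lists the other clients whose caches must be written back and invalidated. `SharingDetector` and its return convention are assumptions, not Sprite's actual interface.

```python
class SharingDetector:
    def __init__(self):
        self.opens = {}  # path -> list of (client, mode), mode is "r" or "w"

    def open(self, path, client, mode):
        """Return (cacheable, clients_to_notify) for this open."""
        entries = self.opens.setdefault(path, [])
        entries.append((client, mode))
        clients = sorted({c for c, _ in entries})
        has_writer = any(m == "w" for _, m in entries)
        if len(clients) > 1 and has_writer:
            # Concurrent write sharing: caching is disabled for this file,
            # and the other clients must flush and invalidate their copies.
            return False, [c for c in clients if c != client]
        return True, []


detector = SharingDetector()
r1 = detector.open("/f", "A", "r")
r2 = detector.open("/f", "B", "w")
r3 = detector.open("/g", "A", "w")
```

Here `r1` and `r3` allow caching, while `r2` triggers the concurrent-write-sharing path and names client A.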
Sequential Write Sharing
A file modified, closed, opened by others
Out-of-date blocks
Compare version number with server
Current data in other’s cache
Keep track of last writer
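The version-number comparison can be sketched as follows. This covers only the out-of-date-blocks case; tracking the last writer to recall data still sitting in another client's cache is not modelled. `VersionedServer` and `CachingClient` are hypothetical names.

```python
class VersionedServer:
    def __init__(self):
        self.data = {}
        self.versions = {}
        self.fetches = 0

    def write_back(self, path, data):
        self.data[path] = data
        self.versions[path] = self.versions.get(path, 0) + 1  # new version per modification

    def open(self, path):
        return self.versions.get(path, 0)

    def fetch(self, path):
        self.fetches += 1
        return self.data[path]


class CachingClient:
    def __init__(self, server):
        self.server = server
        self.cache = {}  # path -> (version, data)

    def open_and_read(self, path):
        current = self.server.open(path)  # compare version number with server
        cached = self.cache.get(path)
        if cached is None or cached[0] != current:
            self.cache[path] = (current, self.server.fetch(path))  # discard stale copy
        return self.cache[path][1]


server = VersionedServer()
server.write_back("/f", b"old")
client = CachingClient(server)
a = client.open_and_read("/f")
b = client.open_and_read("/f")   # version unchanged: served from cache
server.write_back("/f", b"new")  # another client modified and closed the file
c = client.open_and_read("/f")   # version changed: refetched
```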
Session semantics in AFS
Writes to an open file invisible to others
Once file closed, changes visible to new opens anywhere
Other file operations visible immediately
Only guarantee sequential consistency
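Session semantics can be illustrated with a toy model in which each open takes a private whole-file copy and close publishes it. The `Session` class and the dictionary `server_files` are assumptions made for this sketch, not AFS code.

```python
server_files = {"/f": b"v1"}  # hypothetical master copies on the server


class Session:
    """One open file: works on a private copy fetched at open time."""

    def __init__(self, path):
        self.path = path
        self.local = server_files.get(path, b"")

    def read(self):
        return self.local

    def write(self, data):
        self.local = data  # invisible to other sessions while the file is open

    def close(self):
        server_files[self.path] = self.local  # visible to new opens from now on


writer = Session("/f")
reader = Session("/f")
writer.write(b"v2")
seen_before_close = reader.read()
writer.close()
seen_by_old_open = reader.read()        # an already open session keeps its copy
seen_by_new_open = Session("/f").read()
```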
Sprite guarantees complete consistency
AFS uses session semantics
NFS does not guarantee consistency
NFS is stateless. All operations involve contacting the server; if server is unreachable, read & write cannot work
Client-driven vs. Server-driven
Client-driven approach
Client initiates validity check
Server checks whether the local data are consistent with the master copy
Server-driven approach
Server records which files each client caches
When the server detects an inconsistency, it must react
Callback (key to scalability)
Cache is valid while the client holds a callback
Server notifies callback holders before a modification
After a reboot, all cached entries are suspect
Reduces cache validation requests to server
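A simplified sketch of the callback idea: the server remembers which clients cache a file and notifies them before a modification, so clients need not revalidate on every open. Reboot handling is not modelled, and `CallbackServer`, `CallbackClient` and their methods are hypothetical names rather than the Vice/Venus interface.

```python
class CallbackServer:
    def __init__(self):
        self.files = {}
        self.callbacks = {}  # path -> set of clients holding a callback

    def fetch(self, path, client):
        self.callbacks.setdefault(path, set()).add(client)  # promise to notify
        return self.files[path]

    def store(self, path, data, writer):
        holders = self.callbacks.get(path, set())
        for client in holders - {writer}:
            client.break_callback(path)   # notify before the modification
        self.callbacks[path] = holders & {writer}
        self.files[path] = data


class CallbackClient:
    def __init__(self, server):
        self.server = server
        self.cache = {}   # path -> data; an entry means the callback is still held
        self.fetches = 0

    def open(self, path):
        if path not in self.cache:        # no callback: go to the server
            self.fetches += 1
            self.cache[path] = self.server.fetch(path, self)
        return self.cache[path]           # callback held: no server contact

    def break_callback(self, path):
        self.cache.pop(path, None)


server = CallbackServer()
server.files["/f"] = b"v1"
a, b = CallbackClient(server), CallbackClient(server)
a.open("/f")
a.open("/f")                      # served locally, no validation request
fetches_before = a.fetches
server.store("/f", b"v2", writer=b)
after = a.open("/f")              # callback was broken, so the file is refetched
```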
Client-driven vs. Server-driven
AFS is server-driven (callback)
Contributes to AFS’s scalability
Whole file caching and session semantics also help
NFS and Sprite are client-driven
Increased load on network and server
Make client cache as large as possible
Virtual memory and file system negotiate
Compare age of oldest page
Two problems
Double caching
Multiblock pages
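The "compare age of oldest page" negotiation can be sketched as below: when a page is needed, whichever module (file cache or virtual memory) holds the older least-recently-used page gives it up. This is a bare illustration; the function `reclaim_page` and the timestamp dictionaries are assumptions, and any bias between the two modules that a real system applies is left out.

```python
def reclaim_page(fs_pages, vm_pages):
    """Remove and return the oldest page overall as (owner, page_id).

    fs_pages and vm_pages map page ids to last-access times.
    """
    fs_oldest = min(fs_pages, key=fs_pages.get, default=None)
    vm_oldest = min(vm_pages, key=vm_pages.get, default=None)
    if fs_oldest is None and vm_oldest is None:
        raise ValueError("no pages to reclaim")
    take_fs = vm_oldest is None or (
        fs_oldest is not None and fs_pages[fs_oldest] <= vm_pages[vm_oldest]
    )
    if take_fs:
        del fs_pages[fs_oldest]
        return ("fs", fs_oldest)
    del vm_pages[vm_oldest]
    return ("vm", vm_oldest)


fs_pages = {"f1": 10, "f2": 50}  # hypothetical last-access times
vm_pages = {"v1": 30}
first_victim = reclaim_page(fs_pages, vm_pages)
second_victim = reclaim_page(fs_pages, vm_pages)
```

Because the victim is chosen across both pools, the file cache grows when virtual memory is idle and shrinks when it is not.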
Estimated improvement is small
Reason
Andrew runs as a user-level process
Sprite is a kernel-level implementation
Performance – running time
Use Andrew benchmark
Sprite system is fastest
Kernel-to-kernel RPC
Delayed write
Kernel implementation (AFS is user-level)
Performance – CPU utilization
Use Andrew benchmark
Andrew system showed greatest scalability
File-based cache
Server-driven
Use of callback
New issues
What if a client becomes disconnected?
Or weakly connected (by modem)?
Violate key property: transparency!
Cache misses may impede progress
Local update invisible remotely
Update conflict
Update vulnerable to loss, damage
Coda file system