NFS & Distributed Systems Issues
Vivek Pai
Dec 12, 2002

The Next Project
- Behavioral spec
- Implementation up to you
- Can assume a max of 32 procs/threads
- Use a simple counter to implement simple counts
- I may release a tool to make testing easier
- But feel free to use ApacheBench, etc.

Behavioral Spec
- The following behavioral spec is important
- If there aren’t enough free processes/threads, the server should spawn one per second
- If there are too many free, one should be killed per second
- This should not depend on any other activity in the system

Caching Mmap
- Always use mmap
- Keep a cache of active & inactive maps
- Total cache size in KB should be limited by a command-line argument
- Can only exceed this limit if all mappings are active

Man Pages You May Like
- mmap, munmap
- man -k pthread
- flock, lockf
- sleep
- signal
- alarm

Being A Good User
- Do not fork wildly
- Try to test on a non-shared system

Imagine The Following
- Everyone has a desktop machine
- Each machine has a user
- Each user has a home directory
- What problems arise?
  • Can’t move between machines
  • Can’t easily share files with others
  • How does this data get backed up?

Was It Always Like This?
- No
- Think mainframes:
  • Big, centralized box
  • All disks attached
  • Programs ran on the box
  • Only terminals/monitors on each desk

How Did We Get Here?
- Mainframe killers advocated little boxes
- Lots of little boxes are a distributed system
- Distributed systems introduce new problems

Why Use Little Boxes?
- Little boxes are cheap
- Little boxes are disposable
- Easier to order a PC than a mainframe
- No need for a maintenance contract
- Economy of scale: design cost amortized over more units

Were Minis Immune?
- Minicomputers were “department”-sized versus “company”-sized
- Most information not shared among everyone
- Administrator per department OK
- Shared resources only within department OK

Why Not Just Shared Disk?
- Centralized storage
  • Easier administration/backup
  • Better use of capacity
  • Easier to build a large filesystem cache
  • Easier to provide AC/power
- Problem: compare bandwidth
  • 10 Mbit/sec Ethernet at the time
  • Switched versus shared irrelevant

New Problem
- Single point of failure
- Means everything depends on this item
- In other cases, duplication helps
- Common failures = reboot
  • But all information (state) lost
  • All clients would have to be told
  • We’d need to keep track of all clients, on stable storage!

Toward Statelessness
- Make the server as dumb as possible
- Shift burdens to the client side
- Client failure only harms that client
- Each operation is self-contained
- Repeating operations is permissible
  • Idempotent: repeating causes no change

Idempotency
- Regular Unix system call: write(fd, buf, size)
  • Writes size bytes at the current position, moves the position forward by size
- Idempotent version: pwrite(fd, buf, size, offset) (see the sketch below)
- Idempotent operations in NFS are hidden from user programs

Distributed Caching
- Local filesystems have caches
- Use caches to offload network traffic
- Same object replicated in many caches
- No problem for reads
- What happens on write/update?
  • Multiple different copies of data?
  • What happens if it’s metadata?

Distributed Write Problem
- Possible approaches:
  • Disallow caching on writes
    - What about emacs?
  • Disallow caching of shared files
    - What happens for really big files?
  • Disallow caching of metadata writes
    - What disk blocks does the OS care about?
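The Idempotency slide above contrasts write() with pwrite(). A minimal sketch of that difference, not from the slides (the filename is made up for illustration): repeating pwrite() leaves the file unchanged because the offset is part of the request, while repeating write() changes the file again because it depends on the descriptor's implicit position, which is exactly the hidden state NFS tries to avoid.

    /* Sketch: write() is stateful, pwrite() is idempotent. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        const char *msg = "hello\n";
        int fd = open("demo.txt", O_RDWR | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }

        /* Stateful: each call advances the implicit file offset. */
        write(fd, msg, strlen(msg));
        write(fd, msg, strlen(msg));      /* file now holds the line twice */

        /* Idempotent: offset is explicit, so a retry rewrites the same bytes. */
        pwrite(fd, msg, strlen(msg), 0);
        pwrite(fd, msg, strlen(msg), 0);  /* no further change to the file */

        close(fd);
        return 0;
    }

This is why an NFS client can simply resend a lost request: replaying an offset-carrying write produces the same file contents as executing it once.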
Sun’s Write Philosophy
- File block write sharing not an issue
- Very few programs do it
- Correctness depends on the program
- Reduce the window of opportunity
  • Flush dirty blocks periodically
  • Flush can be asynchronous

Metadata Operations
- Performed synchronously at the server
- Must be reflected to disk
- Why: stability
- Overhead: disk op + network
- Can we speed up synchronous ops? (see the sketch at the end)

New Statelessness Problems
- Stale file handle problem
  • cd ~vivek/temp1/temp in window A
  • rm -r ~vivek/temp1 in window B
  • “ls” in window A
- Stale inode problem
  • Machine A gets file for read
  • Filesystem reformatted by admin
  • Machine A modifies file, tries to write

What Slows Down Servers
- Network overhead
- Disk DMA in 4KB pieces
- Network processing in 1500-byte packets + manipulation
- Multiple CPUs
- Synchronous operations
- Nonvolatile memory + recovery
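To make the Metadata Operations slide concrete, here is a minimal sketch of what "metadata must be reflected to disk before the server answers" looks like with ordinary POSIX calls (the paths are made up, and the parent directory "dir" is assumed to already exist): after creating a file, the server would fsync() the file and also fsync() the parent directory, because the new directory entry is metadata too.

    /* Sketch: make a create durable, file contents and directory entry alike. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* Create the file and force its data/inode to stable storage. */
        int fd = open("dir/newfile", O_WRONLY | O_CREAT | O_EXCL, 0644);
        if (fd < 0) { perror("open"); return 1; }
        if (fsync(fd) < 0) { perror("fsync file"); return 1; }
        close(fd);

        /* The directory entry is metadata too: fsync the parent directory. */
        int dirfd = open("dir", O_RDONLY | O_DIRECTORY);
        if (dirfd < 0) { perror("open dir"); return 1; }
        if (fsync(dirfd) < 0) { perror("fsync dir"); return 1; }
        close(dirfd);

        return 0;   /* only now could a server safely reply "done" to the client */
    }

The cost visible here, one or more synchronous disk operations per metadata update on top of the network round trip, is exactly the overhead the slide asks whether we can speed up, for example with nonvolatile memory plus recovery.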