Other File Systems: AFS, Napster Recap

• NFS:
  – Server exposes one or more directories
    • Clients access them by mounting the directories
  – Stateless server
    • Has problems with cache consistency and the locking protocol
  – Mounting protocol
    • Automounting
• P2P file systems:
  – PAST, CFS
  – Rely on DHTs for routing

Andrew File System (AFS)

• Named after Andrew Carnegie and Andrew Mellon
  – Transarc Corp. and later IBM took over development of AFS
  – In 2000 IBM made OpenAFS available as open source
• Features:
  – Uniform name space
  – Location-independent file sharing
  – Client-side caching with cache consistency
  – Secure authentication via Kerberos
  – Server-side caching in the form of replicas
  – High availability through automatic switchover to replicas
  – Scalability to span 5,000 workstations

AFS Overview

• Based on the upload/download model
  – Clients download and cache files
  – Server keeps track of the clients that cache each file
  – Clients upload files at the end of a session
• Whole-file caching is the central idea behind AFS
  – Later amended to block operations
  – Simple and effective
• AFS servers are stateful
  – Keep track of the clients that have cached files
  – Recall files that have been modified

AFS Details

• Has dedicated server machines
• Clients have a partitioned name space:
  – Local name space and shared name space
  – A cluster of dedicated servers (Vice) presents the shared name space
  – Clients run the Virtue protocol to communicate with Vice
• Clients and servers are grouped into clusters
  – Clusters are connected through the WAN
• Other issues:
  – Scalability, client mobility, security, protection, heterogeneity

AFS: Shared Name Space

• AFS's storage is arranged in volumes
  – Usually associated with the files of a particular client
• An AFS directory entry maps Vice files/directories to a 96-bit fid (sketched after these AFS slides)
  – Volume number
  – Vnode number: index into the i-node array of a volume
  – Uniquifier: allows reuse of vnode numbers
• Fids are location transparent
  – File movements do not invalidate fids
• Location information is kept in a volume-location database
  – Volumes are migrated to balance available disk space and utilization
  – Volume movement is atomic; the operation is aborted on a server crash

AFS: Operations and Consistency

• AFS caches entire files from servers
  – Clients interact with servers only during open and close
• The OS on the client intercepts file-system calls and passes them to Venus
  – Venus is a client process that caches files from servers
  – Venus contacts Vice only on open and close
    • Does not contact Vice if the file is already in the cache and has not been invalidated
  – Reads and writes bypass Venus
• Consistency works via callbacks (also sketched below):
  – The server updates its state to record which clients are caching a file
  – The server notifies caching clients before allowing another client to modify the file
  – Clients lose their callback when someone writes the file
• Venus caches directories and symbolic links for path translation

AFS Implementation

• The client cache is a local directory on the UNIX file system
  – Venus and the server processes access files directly by UNIX i-node
• Venus has two caches, one for status and one for data
  – Uses LRU to keep them bounded in size
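The 96-bit fid and the volume-location database described above can be illustrated with a small sketch. This is a minimal, assumed model in Python, not the real OpenAFS data structures; the names Fid and VolumeLocationDB are hypothetical.

# Hypothetical sketch of the 96-bit AFS fid and the volume-location lookup.
from dataclasses import dataclass

@dataclass(frozen=True)
class Fid:
    volume: int       # 32-bit volume number
    vnode: int        # 32-bit index into the volume's i-node array
    uniquifier: int   # 32-bit generation number, allows vnode reuse

class VolumeLocationDB:
    """Maps volume numbers to the server currently holding the volume."""
    def __init__(self):
        self._location = {}   # volume number -> server address

    def register(self, volume: int, server: str) -> None:
        self._location[volume] = server

    def locate(self, fid: Fid) -> str:
        # Fids carry no location information, so moving a volume only
        # requires updating this database; cached fids stay valid.
        return self._location[fid.volume]

# Example: volume 7 migrates from serverA to serverB; the fid is unchanged.
vldb = VolumeLocationDB()
f = Fid(volume=7, vnode=42, uniquifier=1)
vldb.register(7, "serverA.example.edu")
vldb.register(7, "serverB.example.edu")   # migration updates only the VLDB
assert vldb.locate(f) == "serverB.example.edu"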
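Similarly, the whole-file caching and callback behavior of Venus and Vice can be sketched as a toy in-process model. This is only an assumed illustration of the open/close and callback-break flow; the Server and Client classes here are hypothetical, and real AFS uses RPC, Kerberos authentication, and an on-disk cache.

# Toy sketch of AFS-style whole-file caching with callbacks.
class Server:
    def __init__(self):
        self.files = {}        # path -> bytes
        self.callbacks = {}    # path -> set of clients holding a callback

    def fetch(self, client, path):
        # Record that this client caches the file (the server is stateful).
        self.callbacks.setdefault(path, set()).add(client)
        return self.files.get(path, b"")

    def store(self, client, path, data):
        self.files[path] = data
        # Break callbacks: tell every other caching client that its copy
        # is no longer valid.
        for other in self.callbacks.get(path, set()) - {client}:
            other.break_callback(path)
        self.callbacks[path] = {client}

class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}        # path -> bytes (valid whole-file copies)

    def open(self, path):
        # Contact the server only if there is no valid cached copy;
        # reads and writes then stay local until close.
        if path not in self.cache:
            self.cache[path] = self.server.fetch(self, path)
        return self.cache[path]

    def close(self, path, data):
        self.cache[path] = data
        self.server.store(self, path, data)   # upload on close

    def break_callback(self, path):
        self.cache.pop(path, None)            # next open must refetch

# Example: c2's cached copy is invalidated when c1 closes a modified file.
srv = Server()
c1, c2 = Client(srv), Client(srv)
c2.open("/afs/notes.txt")
c1.close("/afs/notes.txt", b"new contents")
assert "/afs/notes.txt" not in c2.cache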
Napster

• Flat file system: a single-level FS with no hierarchy
  – Multiple files can have the same name
• All storage is done at the edges:
  – Hosts export a set of files stored locally
  – Each host registers with a centralized directory
    • Keepalive messages are used to check for connectivity
  – The centralized directory is notified of the file names exported by the host
• File lookup: the client sends a request to the central directory (a client-side sketch appears at the end of these notes)
  – The directory server sends 100 files matching the request to the client
  – The client pings each host, computes the RTT, and displays the results
  – The client transfers the file from the closest host
• File transfers are peer-to-peer; the central directory takes no part in the transfer

Napster Architecture

[Figure: hosts H1, H2, and H3 reach napster.com across the network and a firewall; an IP sprayer/redirector load-balances requests across Napster Directory Servers 1, 2, and 3]

Napster Protocol

[Figure sequence: H1 tells a directory server "I have 'metallica / enter sandman'"; H3 asks "who has metallica?" and is answered "check H1, H2"; H3 pings H1 and H2, then transfers the file directly from the chosen peer]

Napster Discussion

• Issues:
  – Centralized file-location directory
  – Load balancing
  – Reliance on keepalive messages
  – Scalability is an issue!
• Success: the ability to create and foster an online community
  – Built-in ethics
  – Built-in faults
  – Communication medium
• Had around 640,000 users in November 2000!

Other P2P File Systems

• Napster has a central database!
  – Removing it makes regulating file transfers harder
• Freenet, Gnutella, Kazaa, ... are all decentralized
• Freenet: anonymous, files encrypted
  – So it is not known which files are stored locally or which file is being searched for
• Kazaa: allows parallel downloads
• Torrents for faster downloads
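The Napster lookup described earlier (query the central directory, ping the returned hosts, transfer from the closest peer) can be sketched roughly as below. This is an assumed, simplified model: the DIRECTORY dict and the ping stand-in are hypothetical, and the real client spoke a proprietary TCP protocol to napster.com.

# Sketch of the Napster client lookup path: directory query, ping, choose.
import random
import time
from typing import Optional

# Hypothetical central directory: song title -> hosts exporting it.
DIRECTORY = {
    "metallica / enter sandman": ["H1", "H2"],
}

def ping(host: str) -> float:
    """Stand-in RTT measurement; a real client would time a network probe."""
    start = time.monotonic()
    time.sleep(random.uniform(0.001, 0.01))   # simulated network delay
    return time.monotonic() - start

def lookup_and_choose(query: str) -> Optional[str]:
    # 1. Ask the central directory who has the file (up to 100 matches).
    hosts = DIRECTORY.get(query, [])[:100]
    if not hosts:
        return None
    # 2. Ping each host and record the measured RTTs.
    rtts = {host: ping(host) for host in hosts}
    # 3. Transfer peer-to-peer from the closest host; the directory is not
    #    involved in the transfer itself.
    return min(rtts, key=rtts.get)

print("download from:", lookup_and_choose("metallica / enter sandman"))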