Chapter 4 Naming

Why Naming?
Names are needed to
  o Identify entities
  o Share resources
  o Refer to locations, etc.
It is necessary to resolve names
  o What does a human-friendly name correspond to?
Need a naming system
In distributed systems, the naming system itself is often distributed

Outline of Chapter
General naming issues
  o Human-friendly names
Naming and mobility
  o Such as mobile telephony
Unreferenced objects
  o How to remove unused names

Names, Identifiers, Addresses
A name is a string of bits that refers to an entity
An entity is “practically anything”
  o Entities can be operated on
To operate on an entity, must know an access point
Access point is just another entity!
Name of an access point is an address

Names, Identifiers, Addresses
Example
  o Telephone is an access point
  o Telephone number is address
Distributed systems example
  o Server is an access point
  o IP address/port number is address
Mobility creates special problems…

Names, Identifiers, Addresses
Address is a special kind of name
  o For an access point
Access point associated with an entity
Why not use address as entity name?
  o Entities may change access points
  o For example, IP address vs MAC address
  o Entity may offer more than one access point
  o For example, Web service with multiple servers
An entity name that is independent of its address is location independent

Names, Identifiers, Addresses
An identifier is a special type of name
  o Identifier refers to at most one entity
  o Entity referred to by at most one identifier
  o Identifier is never reused
That is, identifiers are unambiguous
  o “Mark Stamp” is not an identifier
  o Telephone number is not an identifier
Examples of identifiers?

Names, Identifiers, Addresses
Human-friendly names?
  o File names
  o www.google.com
  o Other?
Human-unfriendly names?
  o Memory locations
  o Other?
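
The three identifier properties listed above (at most one entity per identifier, at most one identifier per entity, never reused) can be illustrated with a small sketch. The class name IdentifierRegistry and its methods are hypothetical, chosen only for this example; a simple counter stands in for whatever identifier scheme a real system would use.

```python
import itertools


class IdentifierRegistry:
    """Hypothetical registry that hands out identifiers with the three
    properties from the slides. Not taken from any real naming system."""

    def __init__(self):
        self._counter = itertools.count(1)  # monotonically increasing, so no reuse
        self._entity_of = {}                # identifier -> entity
        self._id_of = {}                    # entity -> identifier

    def register(self, entity):
        # An entity is referred to by at most one identifier:
        # registering it again returns the identifier it already has.
        if entity in self._id_of:
            return self._id_of[entity]
        ident = next(self._counter)
        self._entity_of[ident] = entity
        self._id_of[entity] = ident
        return ident

    def retire(self, entity):
        # The identifier is dropped but never handed out again,
        # because the counter only moves forward.
        ident = self._id_of.pop(entity)
        del self._entity_of[ident]

    def lookup(self, ident):
        # An identifier refers to at most one entity (or to none once retired).
        return self._entity_of.get(ident)
```

A telephone number fails the last property because numbers are reassigned; in this sketch a retired identifier simply stops resolving.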

Name Spaces
Organization of names in a distributed system is a name space
  o Represented as a labeled directed graph
  o Usually restricted to directed acyclic graphs
Leaf node is a named entity
  o Has no outgoing edge
  o Leaf node stores address and/or state
Directory node has a table
  o Can have many outgoing edges
Root node has no incoming edges
  o Usually only one

Name Spaces
Naming graph with single root node n0
Complete path is a path name
  o Begins with n0, then it’s an absolute path
  o Otherwise, a relative path

Name Spaces
Names organized in a name space
  o Implies that a name is defined relative to a directory node
Global name denotes same entity no matter where the name is used
Local name depends on where it’s used
  o In other words, a local name is a relative name whose directory is known

Name Spaces
Example: UNIX file system
  o A single root node
  o File directory is a directory node
  o File is a leaf node

Name Resolution
Given path name, must be able to access the specified node
  o This process is name resolution
How does this work?
  o Short answer: traverse the directed graph
  o Long answer: see the book
But, must know where to start
  o For example, how to resolve 7127552339 ?
  o First, must know it’s a phone number
  o I.e., must know root node of appropriate graph

Linking and Mounting
An alias is another name for something
Can have multiple paths to same node, or…
A symbolic link in naming graph (as above)

Linking and Mounting
Combine different name spaces
  o Need directory node in other name space
Node storing node identifier of foreign name space is mount point
  o Directory node in foreign name space is mounting point (usually root node)
Important in distributed systems!
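
The "short answer" for name resolution (traverse the directed graph from a known starting node) can be sketched as follows. This is a minimal illustration, not the book's algorithm: the classes Leaf, Directory and SymLink, the function resolve, and the assumption that symbolic links always hold absolute path names are all choices made for this example.

```python
class Leaf:
    """Leaf node: no outgoing edges, stores an address."""
    def __init__(self, address):
        self.address = address


class Directory:
    """Directory node: a table mapping edge labels to child nodes."""
    def __init__(self):
        self.table = {}


class SymLink:
    """Symbolic link: stores an absolute path name (assumption of this sketch)."""
    def __init__(self, target_path):
        self.target_path = target_path


def resolve(root, path, start=None, max_links=8):
    """Resolve a path name by walking the naming graph.

    A path starting with '/' is absolute and begins at root; any other
    path is relative and begins at the directory node 'start'.
    """
    node = root if path.startswith("/") or start is None else start
    for label in (part for part in path.split("/") if part):
        if not isinstance(node, Directory):
            raise LookupError(f"cannot look up {label!r} in a non-directory node")
        node = node.table[label]  # follow the labeled edge (KeyError if absent)
        if isinstance(node, SymLink):
            if max_links == 0:
                raise RuntimeError("too many symbolic links")
            node = resolve(root, node.target_path, max_links=max_links - 1)
    return node


# Build a tiny graph: n0 -> home -> steen -> mbox, plus an alias n0 -> keys.
n0, home, steen = Directory(), Directory(), Directory()
mbox = Leaf("mailbox-address")
n0.table["home"] = home
home.table["steen"] = steen
steen.table["mbox"] = mbox
n0.table["keys"] = SymLink("/home/steen/mbox")
```

With this graph, resolve(n0, "/home/steen/mbox"), resolve(n0, "mbox", start=steen) and resolve(n0, "/keys") all reach the same leaf node. Mounting a foreign name space would replace the table lookup at the mount point with a request to another server, which this sketch leaves out.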

Linking and Mounting
In distributed system…
To mount foreign name space, need
  o Name of access protocol
  o Name of server
  o Name of mounting point
Name resolution required

Linking and Mounting
Consider Network File System (NFS)
  o Distributed file system
  o Discussed in detail in chapter 10
Access a file by NFS URL
  o For example, nfs://flits.cs.vu.nl//home/steen
  o File (directory) /home/steen
  o On server flits.cs.vu.nl
  o Accessed using NFS protocol
  o Access protocol, server, mounting point?

Linking and Mounting
Consider (machine A) /remote/vu/mbox
On A, nfs://flits.cs.vu.nl//home/steen
Then to machine B…

Linking and Mounting
DEC Global Name Service (GNS)
Insert a new root node
Existing names change
Can avoid changing names
  o See book

Name Space Implementation
Naming service
  o To add, remove, lookup names
  o Implemented on name server(s)
  o In distributed system, naming service itself may be distributed
Name space is heart of naming service
  o Name space distribution (organization)
  o Name resolution

Name Space Distribution
Usually organized hierarchically
Assume one root
Three logical layers
  o Global layer: root and nearby (very stable)
  o Administrational layer: directory nodes in one organization (relatively stable)
  o Managerial layer: typically, hosts in one network (not stable)
For example, DNS

Name Space Distribution
DNS name space in three layers
Subtle performance issues
  o Different requirements at each layer

Name Space Distribution
Item                             Global     Administrational  Managerial
Geographical scale of network    Worldwide  Organization      Department
Total number of nodes            Few        Many              Vast numbers
Responsiveness to lookups        Seconds    Milliseconds      Immediate
Update propagation               Lazy       Immediate         Immediate
Number of replicas               Many       None or few       None
Is client-side caching applied?  Yes        Yes               Sometimes
Comparison of name servers

Name Resolution
Two approaches…
Iterative name resolution
  o Server sends result back to client
  o More work for client
Recursive name resolution
  o Server contacts next name server
  o More work for servers

Iterative Name Resolution
Caching limited to client

Recursive Name Resolution
Caching is more effective (next slide)
More efficient communication (slide after next)

Name Resolution
Server for node  Should resolve   Looks up  Passes to child  Receives and caches            Returns to requester
cs               <ftp>            #<ftp>    --               --                             #<ftp>
vu               <cs,ftp>         #<cs>     <ftp>            #<ftp>                         #<cs>, #<cs,ftp>
nl               <vu,cs,ftp>      #<vu>     <cs,ftp>         #<cs>, #<cs,ftp>               #<vu>, #<vu,cs>, #<vu,cs,ftp>
root             <nl,vu,cs,ftp>   #<nl>     <vu,cs,ftp>      #<vu>, #<vu,cs>, #<vu,cs,ftp>  #<nl>, #<nl,vu>, #<nl,vu,cs>, #<nl,vu,cs,ftp>
Recursive name resolution of <nl, vu, cs, ftp>
Name servers cache intermediate results

Name Resolution
Recursive versus iterative name resolution (figure labels: Netherlands, San Jose)
  o Comparing communication costs

Examples
DNS
  o Traditional naming service
  o Hierarchical, rooted tree
  o Like a white pages service for Internet
  o You should be familiar with this; read it!
X.500 directory service
  o Find an entity that fits a description
  o Like a yellow pages service

DNS Name Space
Hierarchical, rooted tree
Each node has 1 incoming edge (except the root)
  o Incoming edge used as name of node
A subtree is a domain
Path name is a domain name
  o Can be relative or absolute
Node contains resource records

DNS Name Space
Type of record  Associated entity  Description
SOA             Zone               Holds information on the represented zone
A               Host               Contains an IP address of the host this node represents
MX              Domain             Refers to a mail server to handle mail addressed to this node
SRV             Domain             Refers to a server handling a specific service
NS              Zone               Refers to a name server that implements the represented zone
CNAME           Node               Symbolic link with the primary name of the represented node
PTR             Host               Contains the canonical name of a host
HINFO           Host               Holds information on the host this node represents
TXT             Any kind           Contains any entity-specific information considered useful
Most important types of resource records

DNS Implementation
DNS includes
  o Global layer
  o Administrational layer
Managerial layer not formally in DNS
Read the details

DNS Implementation
Excerpt from DNS database for the zone cs.vu.nl

DNS Implementation
Name           Record type  Record value
cs.vu.nl       NS           solo.cs.vu.nl
solo.cs.vu.nl  A            130.37.21.1
Description for vu.nl domain
  o This domain contains cs.vu.nl domain

X.500 Name Space
Each record consists of
  o (attribute, value) pairs
  o Attribute can have multiple values
Directory Information Base (DIB)
  o All entries in X.500 directory service
Each attribute is a Relative Distinguished Name (RDN)
  o Complete record is globally unique
  o So it can be looked up

X.500 Name Space
Attribute           Abbr.  Value
Country             C      NL
Locality            L      Amsterdam
Organization        O      Vrije Universiteit
OrganizationalUnit  OU     Math. & Comp. Sc.
CommonName          CN     Main server
Mail_Servers        --     130.37.24.6, 192.31.231, 192.31.231.66
FTP_Server          --     130.37.21.11
WWW_Server          --     130.37.21.11
Example of X.500 directory entry
  o Unique name: Country, Organization, OrganizationalUnit
  o /C=NL/O=Vrije Universiteit/OU=Math. & Comp. Sc.
  o Analogous to DNS: nl.vu.cs

X.500 Name Space
Globally unique names form hierarchy
  o Directory Information Tree (DIT)
  o The naming graph in X.500
Node can act as directory
  o More than one child
  o See next slide

X.500 Name Space
Part of directory information tree
“N” acts as directory…
…and as a node

X.500 Name Space
Attribute           Value                  Attribute           Value
Country             NL                     Country             NL
Locality            Amsterdam              Locality            Amsterdam
Organization        Vrije Universiteit     Organization        Vrije Universiteit
OrganizationalUnit  Math. & Comp. Sc.      OrganizationalUnit  Math. & Comp. Sc.
CommonName          Main server            CommonName          Main server
Host_Name           star                   Host_Name           zephyr
Host_Address        192.31.231.42          Host_Address        192.31.231.66
Two entries with Host_Name as RDN

X.500 Implementation
Like DNS
  o But more lookup operations to search DIB
For example, can search for all “main servers” at Vrije Universiteit
  o See example in book
But, need to access many leaf nodes
This could be expensive!
  o Leaf nodes might be distributed

X.500 in the Real World
X.500
  o Uses Directory Access Protocol (DAP)
  o Runs over OSI
  o Therefore, it is “heavyweight”
What if you like X.500…
…but need to use it in the real world?
Need something lightweight…

LDAP
Lightweight Directory Access Protocol
Application level protocol
Implemented on top of TCP
Lookup, update, passed as strings
  o No separate encoding required
LDAP is de facto standard
We used LDAP at my startup company

Mobility
What is different in mobile case?
  o Names change frequently
Why is this an issue?
Consider DNS
  o Global layer and administrational
    layers assume names change infrequently
  o So replication and caching are used
For mobile, something else is needed…
  o But what?

Mobility
Consider DNS
  o ftp.cs.vu.nl
  o Local cache probably has cs.vu.nl
  o One request to find desired address
Now suppose ftp server moves
  o If it stays in cs.vu.nl, only local changes
  o What if it moves to ftp.cs.unisa.edu.au ?

Mobility
Suppose ftp.cs.vu.nl moves to ftp.cs.unisa.edu.au
What to do?
Forget about cs.vu.nl
  o Users won’t be happy
Record new address under cs.vu.nl
  o If it then moves locally at the new site, the update at cs.vu.nl is not local
Turn cs.vu.nl into a symbolic link
  o In effect, 2 lookups
  o But it gets no worse if it moves again
Either way, name can never change

Mobility
Better idea
  o Give up on DNS-like approach
Add an intermediate step
  o Assign non-human-friendly identifier
Then
  o Name service converts human-friendly name into identifier
  o Location service converts identifier to current address

Naming versus Locating
a) DNS-like mapping between name and address
b) Two-level mapping using identifiers

Mobility
But this begs the question
  o How to build location service?
Simple solutions
  o Broadcasting and multicasting
  o Forwarding pointers
Complicated (?)
solutions
  o Home-based approaches
  o Hierarchical approaches

Broadcasting and Multicasting
Suppose mobility restricted to LAN
  o Broadcasting is efficient on LAN
  o Use ARP to locate entity
  o Does not scale well
Can do similar thing at network layer
  o Use multicasting
  o Mobile computer gets dynamic IP address and joins multicast group
  o Multicast group acts as location service

Forwarding Pointers
Another simple approach: forwarding pointers
When moving from A to B, leave a pointer at A to new location B
  o Naming service still points to A
Simple, yes, but…
  o Chain might get long
  o Every link in chain must be maintained

Forwarding Pointers
Forwarding pointers as (proxy, skeleton)

Forwarding Pointers
Redirecting a forwarding pointer
  o To shortcut a chain
  o Subsequent communication is faster
Some skeletons may be left unreferenced

Home-Based Approaches
Home location: always knows current location of the mobile entity
  o Can be used with forwarding pointers
  o Or with Mobile IP
Mobile IP on next slide…

Home-Based Approaches
Mobile IP
Suppose host A is mobile
  o Host A has a fixed IP address
  o Host A has a home agent
  o Home agent of A is at A’s fixed IP address
  o Host A requests temp address at new location
  o Care-of address is current location of A
  o Home agent knows A’s care-of address

Home-Based Approaches
Mobile IP

Hierarchical Approaches
Network divided into domains
  o Called them layers in DNS
Top level domain spans network
Lowest level is leaf domain
Directory node for each domain

Hierarchical Approaches
Hierarchy of location service domains
Each domain has associated directory node

Hierarchical Approaches
Let dir(D) be directory for domain D
Location record in dir(D) for each entity currently in D
Suppose entity E is in D
Then there is a chain of pointers to E through the higher-level directories of
D
E might have more than one address
  o Due to replication

Hierarchical Approaches
Entity with two addresses

Hierarchical Approaches
Looking up location of E
  o Efficient way to find current location of E

Hierarchical Approaches
To create a replica of E in domain D…
a) Find first node that knows about E
b) Create chain of forwarding pointers to new node

Pointer Caches
Caching: good if data seldom changes
If mobile, addresses change
But if move is within a domain…
  o Then pointers at higher nodes do not change
  o It might make sense to cache such info
How to find the right domain?
  o Travel regularly between LA and SJ (next slide)
When to invalidate cache entry?
  o Travel to NY or moved there (slide after next)

Pointer Caches
Caching reference to directory node of lowest-level domain

Pointer Caches
Cache entry needs to be invalidated because…
  o Entry returns a nonlocal address
  o A local address is available

Scalability
The biggest issue with hierarchical approach is scalability
Root has to know about everybody!
  o Storage may be an issue
  o Lookup is likely bottleneck
Possible solutions (read the book)
  o Partitioning and/or uniform placement of subnodes

Scalability Issues
Uniform placement of subnodes???

Unreferenced Entities
What if entity is no longer referenced?
  o A distributed garbage collection problem
Recall that for remote objects
  o State is remote
  o Client-side proxy
  o Server-side skeleton
Assume that an object can be accessed only if a remote reference exists
If no reference exists, then garbage

Unreferenced Objects
Other type of garbage

Partial Solutions
Reference counting
  o Keep a running count of the number
    of references
  o When count reaches 0, remove object
Reference listing
  o Skeleton maintains explicit list of proxies that know about it
Tracing
  o Above methods do not deal with loops, etc.
  o Tracing: follow all paths from root

Reference Counting
Keep track each time reference is added or deleted
Easy in non-distributed systems
But unreliable communication causes problems

Reference Counting
If communication is unreliable
  o How to maintain accurate reference count?

Reference Counting
a) Incrementing the counter too late (example of a race condition)
b) Is this a solution to the problem ???

Advanced Reference Counting
Weighted reference counting
  o Object has a fixed total weight
  o And variable partial weight
  o Only decrement operation allowed
Weight is split up (partial weights) when new references are created
Delete: decrement by partial weight
  o Skeleton’s total weight is decremented

Advanced Reference Counting
a) Initial assignment of weights
b) Weight assignment for a new reference

Advanced Reference Counting
c) Weight assignment when copying reference

Advanced Reference Counting
Problems?
Requires reliable communication
  o Ditto for reference counting
Does not deal with loops
  o Ditto ditto
Only a limited number of references
  o We can finesse this…

Advanced Reference Counting
When partial weight has reached 1
  o Use indirection to add more weight

Advanced Reference Counting
Don’t like indirection? Fine!
Generation reference counting (no weights)
  o Skeleton maintains array G
  o G[i] is number of copies at generation i
  o When deleting, send msg to skeleton:
    proxy’s generation number, k, and number
    of copies made, n
  o Skeleton decrements its G[k] by 1
  o Skeleton increments its G[k+1] by n
  o When all G[i] are 0, the object is deleted

Advanced Reference Counting
Generation reference counting
  o Still requires reliable communication
  o Can add references w/o contacting skeleton

Tracing
Reference counting does not help with loops and such
One approach is mark and sweep
  o Follow all paths from root
  o Mark all places reached
  o Then sweep through everything and…
  o …remove everything not marked
This can be done in distributed systems
  o But requires “stop the world” synchronization

Tracing
More practical for distributed systems is
  o Tracing in groups
Group is a collection of processes
  o For scalability
Algorithm
  o Mark skeletons
  o Propagate marks from skeletons to proxies
  o Propagate marks from proxies to skeletons
  o Iterate previous 2 steps on larger groups
  o Garbage reclamation

Tracing in Groups
Initial marking of skeletons

Tracing in Groups
Final marking

Summary
Naming is a serious issue!
Types of names
  o Address
  o Identifier
  o Human-friendly
Naming/naming graph
Human-friendly is not mobile-friendly

Summary
Mobility
  o Broadcasting/multicasting
  o Forwarding pointers
  o Home location
  o Hierarchical search tree
Unreferenced objects
  o Reference counting
  o Tracing
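
As an illustration of the generation reference counting rules from the Advanced Reference Counting slides (skeleton decrements G[k] by 1 and increments G[k+1] by n on each delete message), here is a minimal single-process sketch. The class and method names (Skeleton, Proxy, new_proxy, on_delete) are hypothetical, delete messages are modeled as direct method calls, and reliable delivery is assumed, as the slides require.

```python
from collections import defaultdict


class Skeleton:
    """Server-side skeleton keeping the generation array G (hypothetical sketch)."""

    def __init__(self):
        self.G = defaultdict(int)  # G[i]: outstanding copies at generation i
        self.collected = False

    def new_proxy(self):
        # A reference created directly by the skeleton is generation 0.
        self.G[0] += 1
        return Proxy(self, generation=0)

    def on_delete(self, k, n):
        # Delete message from a proxy of generation k that made n copies.
        self.G[k] -= 1
        self.G[k + 1] += n
        # Entries may be temporarily negative if a copy is deleted
        # before the proxy it was copied from.
        if all(count == 0 for count in self.G.values()):
            self.collected = True  # no references remain: object is garbage


class Proxy:
    """Client-side proxy holding its generation and a local copy counter."""

    def __init__(self, skeleton, generation):
        self.skeleton = skeleton
        self.generation = generation
        self.copies = 0

    def copy(self):
        # Copying a reference does not contact the skeleton.
        self.copies += 1
        return Proxy(self.skeleton, self.generation + 1)

    def delete(self):
        # Only deletion sends a message to the skeleton.
        self.skeleton.on_delete(self.generation, self.copies)
```

For example, if p0 = skeleton.new_proxy(), p1 = p0.copy() and p2 = p1.copy(), then deleting p1, p0 and p2 in that order leaves every G[i] at 0 only after the last delete, at which point the object is marked as collected.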