Naming
Chapter 4
Names, Addresses, and Identifiers
• Name: a string (of bits or characters) that refers to an entity (e.g. a process, file, device, …).
• Access point: each entity has an access point that allows communication with that entity.
• Address: an access point is itself an entity and therefore has a name, the so-called address.
• Access point/entity: an n-to-n relationship
  a) a person (i.e. entity) may have several telephone sets (i.e. access points);
  b) different entities may share a single access point.
• Reason for separating names and addresses: flexibility
  – e.g. after code migration the address of a server changes, but not its name, so no invalidation of references is needed!
  – e.g. if an entity has several access points (e.g. a horizontally organized web server), a single name may be used (for the different addresses!).
  – in general: the access-point-to-entity mapping is independent of the names used.
• Identifier: a name that uniquely identifies an entity:
  1. At most one entity per ID.
  2. At most one ID per entity.
  3. IDs are never reused.
  The nice feature of IDs is that a test of identity becomes a test of ID equality!
• Human-friendly names: reasonably meaningful character strings.
Name Spaces (1)
Figure: a general naming graph with a single root node. Two (path) names may refer to a single entity (a hard-link alias).
Name space: reflects the structure of the naming scheme (e.g. graph, tree, …)
Name resolution: the process of mapping a name to its corresponding entity. Knowing how and where to start name resolution (e.g. beginning at the root) is known as the closure mechanism.
Examples: resolving n1:<steen, keys> would return the content of n5;
resolving n0:<home> would return the (directory) table stored in n1.
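As a rough illustration of these examples, here is a minimal Python sketch of a naming graph as a dictionary of directory nodes. Only n0, n1, n5 and the labels home, steen, keys come from the example above; the intermediate node n2 and the stored content are assumptions for illustration.

directories = {
    "n0": {"home": "n1", "keys": "n5"},   # root node; "keys" is the hard-link alias to n5
    "n1": {"steen": "n2"},
    "n2": {"keys": "n5"},                 # intermediate node name assumed
}
leaves = {"n5": "contents of the keys file"}   # leaf nodes store the entity itself

def resolve(start, path):
    """Resolve a path name label by label, starting at the given node (closure)."""
    node = start
    for label in path:
        node = directories[node][label]          # follow one edge per label
    return leaves.get(node, directories.get(node))   # entity content or directory table

print(resolve("n1", ["steen", "keys"]))   # -> content of n5
print(resolve("n0", ["home"]))            # -> the directory table stored in n1
print(resolve("n0", ["keys"]))            # -> content of n5 again (hard-link alias)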
Name Spaces (2)
Figure: the general organization of the UNIX file system implementation on a logical disk of contiguous disk blocks (boot block holding the OS initial routine; superblock recording free blocks, free inodes, …; inodes holding the address of the file on disk, access rights, last modification time, …; followed by the data blocks).
– Inodes (index nodes) are numbered from 0 (for the root) to some maximum.
– Directory nodes are implemented like file nodes.
Linking and Mounting (1)
Figure: the concept of a symbolic link explained in a naming graph (resolution of the name stored in the symbolic-link node continues from n0).
– Alias: another name for the same entity (e.g. /home/steen/keys = /keys).
– Alias implementations:
  • Hard links: allow multiple incoming edges for a node (see slide Name Spaces (1)).
  • Symbolic links: keep a tree structure; the referencing node stores extra information, namely the name to be resolved instead (see above).
Linking and Mounting (2)
Figure: mounting remote name spaces through a specific access protocol; the mount point in the local name space refers to the mounting point in the foreign name space.
Resolution of /remote/vu/mbox:
1. Local resolution up to the node /remote/vu.
2. Use of the NFS protocol to contact the server flits.cs.vu.nl in order to access the foreign directory /home/steen.
Linking and Mounting (3)
Organization of the DEC Global Name Service
Name Space Distribution (1)
Figure: an example partitioning of the DNS name space, including Internet-accessible files, into three layers (global, administrational, managerial).
Caching: for performance and availability.
Name Space Distribution (2)
Item                            | Global     | Administrational | Managerial
Geographical scale of network   | Worldwide  | Organization     | Department
Total number of nodes           | Few        | Many             | Vast numbers
Responsiveness to lookups       | Seconds    | Milliseconds     | Immediate
Update propagation              | Lazy       | Immediate        | Immediate
Number of replicas              | Many       | None or few      | None
Is client-side caching applied? | Yes        | Yes              | Sometimes

A comparison between name servers for implementing nodes from a large-scale name space partitioned into a global layer, an administrational layer, and a managerial layer.
Implementation of Name Resolution (1)
The principle of iterative name resolution.
+: lower performance demands on the name servers
-: effective caching is possible only at the client
-: may induce a high communication overhead
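A minimal Python sketch of the client side of iterative resolution. The server tables below are invented stand-ins for real name servers; only the label sequence nl, vu, cs, ftp comes from the slides, and the slides' #<...> notation is reused as the "address".

# Simulated name-server tables (invented stand-ins for real servers).
SERVERS = {
    "root":        {"nl": "ns.nl"},
    "ns.nl":       {"vu": "ns.vu.nl"},
    "ns.vu.nl":    {"cs": "ns.cs.vu.nl"},
    "ns.cs.vu.nl": {"ftp": "#<nl,vu,cs,ftp>"},   # final entry: the entity's address
}

def lookup_at(server, label):
    """One request/reply pair with a single name server: resolve one label."""
    return SERVERS[server][label]

def resolve_iterative(path):
    """Iterative resolution: the client itself contacts one server after another."""
    server = "root"                           # closure mechanism: start at the root
    for label in path[:-1]:
        server = lookup_at(server, label)     # each reply names the next server to ask
    return lookup_at(server, path[-1])        # the last label yields the address

print(resolve_iterative(["nl", "vu", "cs", "ftp"]))   # -> #<nl,vu,cs,ftp>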
Implementation of Name Resolution (2)
The principle of recursive name resolution.
-: high performance demands on the name servers
+: more effective caching possible (in different places)
+: high availability due to cached information (e.g. if #<vu> is down, #<vu,cs> can still be used)
+: may reduce communication costs
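For contrast, a sketch of recursive resolution reusing the simulated SERVERS table from the iterative sketch above. Here each server forwards the remaining path itself and caches the intermediate results it receives; the cache layout is an assumption for illustration.

CACHE = {}   # (server, remaining path) -> resolved address

def resolve_recursive(server, path):
    key = (server, tuple(path))
    if key in CACHE:
        return CACHE[key]                     # answered from this server's cache
    next_hop = SERVERS[server][path[0]]       # resolve the first label locally
    if len(path) == 1:
        result = next_hop                     # nothing left: this is the address
    else:
        result = resolve_recursive(next_hop, path[1:])   # the child server does the rest
    CACHE[key] = result                       # keep the intermediate result for later lookups
    return result

print(resolve_recursive("root", ["nl", "vu", "cs", "ftp"]))   # -> #<nl,vu,cs,ftp>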
Implementation of Name Resolution (3)
Server for node | Should resolve  | Looks up | Passes to child | Receives and caches           | Returns to requester
cs              | <ftp>           | #<ftp>   | --              | --                            | #<ftp>
vu              | <cs,ftp>        | #<cs>    | <ftp>           | #<ftp>                        | #<cs>, #<cs,ftp>
nl              | <vu,cs,ftp>     | #<vu>    | <cs,ftp>        | #<cs>, #<cs,ftp>              | #<vu>, #<vu,cs>, #<vu,cs,ftp>
root            | <nl,vu,cs,ftp>  | #<nl>    | <vu,cs,ftp>     | #<vu>, #<vu,cs>, #<vu,cs,ftp> | #<nl>, #<nl,vu>, #<nl,vu,cs>, #<nl,vu,cs,ftp>

Recursive name resolution of <nl, vu, cs, ftp>. Name servers cache intermediate results for subsequent lookups. (Assumption: name servers hand back to the caller more than one result.)
Implementation of Name Resolution (4)
Figure: the comparison between recursive and iterative name resolution with respect to communication costs (e.g. a client in Europe resolving a name whose name servers are in America).
The DNS Name Space
Type of record [associated entity]: description

SOA (start of authority) [Zone]:
  Holds information on the represented zone, e.g. the administrator's e-mail address and the host name for this zone.
A (address) [Host]:
  Contains an IP address of the host this node represents; if the host has multiple addresses, there are multiple A records.
MX (mail exchange) [Domain]:
  Refers (as a symbolic link) to a mail server that handles mail addressed to this node; e.g. the domain (i.e. subtree) cs.uleth.ca may list machineX.cs.uleth.ca (multiple MX records are allowed).
SRV (server) [Domain]:
  Refers to a server handling a specific service, e.g. http.tcp.cs.vu.nl for a Web server.
NS (name server) [Zone]:
  Refers to a name server that implements the represented zone (the node represents a zone).
CNAME (canonical name) [Node]:
  Symbolic link holding the primary name of the represented node (canonical name = the primary name of an entity/host).
PTR (pointer) [Host]:
  Contains the canonical name of a host, e.g. to allow inverse mapping (IP to name); the node 11.24.37.10.in-addr.arpa would store the canonical name of the host with that IP address.
HINFO (host info) [Host]:
  Holds information on the host this node represents (e.g. OS, architecture, …).
TXT (text) [Any kind]:
  Contains any entity-specific information considered useful.

The most important types of resource records forming the contents of nodes in the DNS name space.
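A small sketch modelling a few resource records as plain Python objects and looking them up by type. The NS and A values match the delegation example shown later (DNS Implementation (2)); the remaining record values are illustrative assumptions, not the real cs.vu.nl zone contents.

from dataclasses import dataclass

@dataclass
class ResourceRecord:
    name: str     # node in the DNS name space
    rtype: str    # SOA, A, MX, NS, CNAME, PTR, HINFO, TXT, SRV, ...
    value: str    # associated value (address, referred host name, ...)

# Illustrative zone contents (assumed values).
zone = [
    ResourceRecord("cs.vu.nl",        "NS",    "solo.cs.vu.nl"),
    ResourceRecord("cs.vu.nl",        "MX",    "1 zephyr.cs.vu.nl"),
    ResourceRecord("solo.cs.vu.nl",   "A",     "130.37.21.1"),
    ResourceRecord("www.cs.vu.nl",    "CNAME", "soling.cs.vu.nl"),
    ResourceRecord("ftp.cs.vu.nl",    "CNAME", "soling.cs.vu.nl"),
    ResourceRecord("soling.cs.vu.nl", "A",     "130.37.20.20"),   # invented address
]

def lookup(name, rtype):
    """Return all record values of a given type stored at a node."""
    return [r.value for r in zone if r.name == name and r.rtype == rtype]

print(lookup("cs.vu.nl", "NS"))         # -> ['solo.cs.vu.nl']
print(lookup("www.cs.vu.nl", "CNAME"))  # -> ['soling.cs.vu.nl']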
DNS Implementation (1)
Figure: an excerpt from the DNS database for the zone cs.vu.nl (hosts in the cs.vu.nl domain include star, zephyr, ftp, www, soling, vucs and laser). Notable entries:
– 3 name servers for the zone
– 3 mail servers (note the priorities!)
– this name server has 2 addresses (reliability!)
– a backup for this mail server
– symbolic links (CNAME records) to the same host, the host for FTP and the Web
– a laser printer
– an IP-to-canonical-name mapping (for inverse lookups)
DNS Implementation (2)
Figure: part of the vu.nl domain (with subdomains cs and ee); the cs.vu.nl domain is delegated to its own zone.

Name          | Record type | Record value
cs.vu.nl      | NS          | solo.cs.vu.nl
solo.cs.vu.nl | A           | 130.37.21.1

Part of the description for the vu.nl domain which contains the cs.vu.nl domain.
Naming versus Locating Entities
Problem: What to do if an entity is moved?
Examples:
a) Move within the same domain: ftp.cs.vu.nl → ftp.research.cs.vu.nl
   Solution: local update of the DNS database (efficient).
b) Move to a different domain: ftp.cs.vu.nl → ftp.informatik.unibw-muenchen.de
   Two solutions: (1) update the address in the local DNS database
     → updates become slower when the entity moves again (no longer local);
   (2) use symbolic links
     → lookups become slower.
⇒ Both solutions are unsatisfactory, especially for mobile entities, which change their locations often!
Figure:
a) Direct, single-level mapping between names and addresses: the naming service (NS) maps names to addresses.
b) Two-level mapping using identities: the naming service (NS) maps names to IDs, and a location service (LS) maps IDs to addresses; this promotes mobility.
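A minimal sketch of the two-level mapping in b); the identifier and the addresses below are invented for illustration.

# Naming service (NS): name -> stable identifier; location service (LS): identifier -> current address.
naming_service   = {"ftp.cs.vu.nl": "id-4711"}     # rarely changes
location_service = {"id-4711": "130.37.24.11"}     # updated whenever the entity moves

def lookup(name):
    entity_id = naming_service[name]        # step 1: name -> identifier
    return location_service[entity_id]      # step 2: identifier -> current address

location_service["id-4711"] = "137.193.60.42"   # entity moved: only the LS entry changes
print(lookup("ftp.cs.vu.nl"))                    # the name (and the ID) stay valid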
Forwarding Pointers (1)
Examples of location services in LANs:
a) Address Resolution Protocol (ARP): a host broadcasts an IP address (i.e. an ID) and the owning host returns its data-link-layer address (e.g. its Ethernet address).
b) Multicast to locate laptops: a laptop is assigned a dynamic IP address and is a member of a multicast group. A host locates the laptop by multicasting a message containing the laptop's ID (e.g. its computer name) and receiving its current IP address in reply.
Forwarding pointers to (mobile) distributed objects: a chain of pointers (i.e. (proxy, skeleton) pairs).
Figure: the principle of forwarding pointers using (proxy, skeleton) pairs, leading from the old object location to the new object location.
Forwarding Pointers (2)
Figure: redirecting a forwarding pointer by storing a shortcut in a proxy; the first request follows the chain of skeletons, subsequent requests use the shortcut.
For efficiency, only the first request goes through the (current) chain.
After the shortcut is installed, the intermediate skeleton is no longer referred to! (garbage collection needed)
Problem: broken chains!
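A minimal sketch of a forwarding-pointer chain of (proxy, skeleton) pairs with the shortcut optimization described above; the class layout and names are assumptions for illustration.

class Skeleton:
    def __init__(self, target):
        self.target = target                 # either the actual object or a forwarding proxy

    def invoke(self, request):
        if isinstance(self.target, Proxy):               # forwarding pointer: pass it on
            result, location = self.target.invoke(request)
        else:                                            # the object lives here
            result, location = self.target(request), self
        return result, location                          # location of the current skeleton

class Proxy:
    def __init__(self, skeleton):
        self.skeleton = skeleton

    def invoke(self, request):
        result, location = self.skeleton.invoke(request)
        self.skeleton = location             # store the shortcut: skip the chain next time
        return result, location

# The object moved twice: the old skeletons forward to the new location.
obj = lambda req: f"handled {req}"
s_new = Skeleton(obj)
s_mid = Skeleton(Proxy(s_new))
s_old = Skeleton(Proxy(s_mid))
client = Proxy(s_old)

print(client.invoke("req-1")[0])   # the first request travels the whole chain
print(client.invoke("req-2")[0])   # subsequent requests go directly to s_new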
Home-Based Approaches
Figure: the principle of Mobile IP (a client that wants to communicate with mobile host A first contacts A's home agent).
– The mobile host (A) has a fixed IP address (an ID) and registers its current, dynamically assigned IP address (its care-of address) with a home agent running at its home location.
– Problems: a) the home agent may be far away from the client, even while the mobile host is in the client's proximity!
            b) the host may move and then stay at the new location for a long time; in that case it would be better to move the home agent as well.
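A minimal sketch of the home-based approach: the mobile host registers its current address with a home agent, and clients always ask the home agent first. Host names and addresses are invented for illustration.

class HomeAgent:
    def __init__(self):
        self.current_address = {}               # fixed ID (home name) -> care-of address

    def register(self, host_id, care_of_address):
        self.current_address[host_id] = care_of_address

    def locate(self, host_id):
        return self.current_address[host_id]

home_agent = HomeAgent()
home_agent.register("hostA", "145.108.7.15")    # A moved: it registers its new location
print(home_agent.locate("hostA"))               # a client learns A's current address

# Drawback: even if the client sits right next to A, every lookup first travels
# to the (possibly distant) home agent.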
Hierarchical Approaches (1)
Figure: hierarchical organization of a location service into domains, each having a directory node; an alternative (hierarchical) solution to the pure home-agent (HA) approach for locating a mobile host.
Main idea: exploit locality. E.g. in mobile telephony, the phone is first looked up in the local network; only then is a request sent to the home agent.
Hierarchical Approaches (2)
Figure: an example of storing information on an entity that has two addresses in different leaf domains (used e.g. for replicated entities). M is the node for the smallest sub-domain containing the two replicas; at M, two pointers are needed (one per replica).
Hierarchical Approaches (3)
Figure: looking up a location in a hierarchically organized location service.
– Lookups work bottom-up (locality!): the request climbs the tree until it reaches a node that knows the entity, and then follows the downward pointers to the leaf domain.
Hierarchical Approaches (4)
Figure: inserting a newly created entity E (in this case a replica) into the location service.
a) An insert request is forwarded upwards to the first node that already knows about entity E.
b) A chain of forwarding pointers down to the leaf node is then created.
The chain of forwarding pointers may also be created bottom-up (while traversing the nodes), for efficiency and availability.
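A minimal sketch of such a hierarchical location service: each domain has a directory node that stores, per entity, pointers to the child sub-domains that know the entity (or the address itself, at a leaf). The class layout and the tiny example tree are assumptions for illustration.

class DirectoryNode:
    def __init__(self, parent=None):
        self.parent = parent
        self.records = {}     # entity id -> set of child nodes that know the entity
        self.address = {}     # entity id -> address (leaf nodes only)

    def insert(self, entity, address=None, child=None):
        """Insert bottom-up: create the chain of pointers towards the root."""
        if address is not None:
            self.address[entity] = address
        if child is not None:
            self.records.setdefault(entity, set()).add(child)
        if self.parent is not None:
            self.parent.insert(entity, child=self)

    def lookup(self, entity):
        """Climb up until the entity is known, then follow pointers down to a leaf."""
        if entity in self.address:                 # leaf that stores the address
            return self.address[entity]
        if entity in self.records:                 # known here: descend
            child = next(iter(self.records[entity]))
            return child.lookup(entity)
        if self.parent is not None:                # unknown: escalate upwards
            return self.parent.lookup(entity)
        raise KeyError(entity)

# Tiny example tree: a root with two sub-domains, each with one leaf domain.
root = DirectoryNode()
west, east = DirectoryNode(root), DirectoryNode(root)
leaf_w, leaf_e = DirectoryNode(west), DirectoryNode(east)

leaf_w.insert("E", address="addr-of-E-in-west")   # entity E resides in the west leaf
print(leaf_e.lookup("E"))                         # lookup from the east leaf: up to the root, then down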
Pointer Caches (1)
Caching a reference to a directory node of the lowest-level domain in
which an entity will reside most of the time.
– It does not make sense to cache entity addresses themselves (they change regularly).
– Instead, the address of the directory node of the sub-domain in which the entity is assumed to reside is cached.
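A small sketch of such a pointer cache, reusing the DirectoryNode class and the example tree from the hierarchical sketch above (the client-side class is an assumption for illustration).

class CachingClient:
    def __init__(self, local_node):
        self.local_node = local_node
        self.cached_dir = {}                 # entity id -> cached directory node

    def remember(self, entity, directory_node):
        self.cached_dir[entity] = directory_node

    def lookup(self, entity):
        start = self.cached_dir.get(entity, self.local_node)
        return start.lookup(entity)          # a stale cache entry still works: the
                                             # lookup simply climbs further up the tree

client = CachingClient(leaf_e)
client.remember("E", leaf_w)                 # cache the sub-domain E usually resides in
print(client.lookup("E"))                    # resolved directly within the cached domain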
Pointer Caches (2)
Figure: a cache entry that needs to be invalidated because it returns a non-local address, while a local (replica) address has become available.
– For efficiency, an insertion (e.g. when a replica is created) triggers invalidation of such cache entries.
– Scalability remains a challenging problem: e.g. the root node must store references to every entity in the network (a bottleneck).
  Possible remedies: use of parallel machines and/or distributed servers implementing the root (and other high-level nodes).
The Problem of Unreferenced Objects
An example of a graph representing objects containing
references to each other.
Reference Counting (1)
Figure: the problem of maintaining a proper reference count in the presence of unreliable communication (immediately after installation, proxy p sends an increment (+1) message to the skeleton; if the acknowledgement is lost, the message is retransmitted and could be counted twice).
– Detection of duplicates is therefore necessary (and easy, e.g. by using message identifiers).
– The same problem arises when deleting a remote reference (i.e. when decrementing).
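A minimal sketch of skeleton-side reference counting over unreliable communication: increment and decrement messages carry message identifiers, so retransmitted duplicates are detected and simply acknowledged again. Class and message names are assumptions for illustration.

class CountingSkeleton:
    def __init__(self):
        self.refcount = 0
        self.seen = set()                     # message ids already processed

    def adjust(self, message_id, delta):
        if message_id in self.seen:
            return "ACK (duplicate ignored)"  # retransmission: do not count again
        self.seen.add(message_id)
        self.refcount += delta
        return "ACK (object removable)" if self.refcount == 0 else "ACK"

s = CountingSkeleton()
print(s.adjust("p1-inc-1", +1))   # proxy p1 installed: +1
print(s.adjust("p1-inc-1", +1))   # the same +1 retransmitted: ignored
print(s.adjust("p1-dec-1", -1))   # p1 removes its reference: the count reaches 0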
Reference Counting (2)
Figure:
a) Copying a reference to another process while incrementing the counter too late.
b) A solution.
– An early delete is not allowed: the proxy will deny it, because no ACK has been received yet.
– After the ACK has arrived, the delete is allowed.
Advanced Reference Counting (1)
Figure:
a) The initial assignment of weights in weighted reference counting.
b) Weight assignment when creating a new reference.
c) Weight assignment when copying a reference: P2 does not need to contact the server (no +1 messages); only when deleting a reference is a decrement sent to the server.
Advanced Reference Counting (2)
Figure: creating an indirection when the partial weight of a reference has reached 1.
Problem with weighted reference counting: the maximum total number of references is fixed a priori!
Solution: indirection (like forwarding pointers, see above).
Other methods: reference lists instead of reference counters; the skeleton holds a list of all proxies using it (used in Java RMI).
+: more robust against duplication (delete/insert are idempotent).
-: not scalable (all proxies appear in the skeleton's list!); solution: proxies re-register with the skeleton after some time.
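A rough sketch of weighted reference counting as described above, with an assumed initial weight of 128. The collectability test (total weight equals the object's remaining partial weight) is one way to model "no remote references left".

class WeightedObject:
    def __init__(self, total=128):
        self.total_weight = total        # fixed a priori: limits the number of references
        self.partial_weight = total      # weight still held by the object itself

    def create_reference(self):
        """New proxy: the object hands over half of its own partial weight."""
        if self.partial_weight == 1:
            raise RuntimeError("weight exhausted: an indirection object is needed")
        handed_over = self.partial_weight // 2
        self.partial_weight -= handed_over
        return Reference(self, handed_over)

    def delete_weight(self, weight):
        """A reference was deleted: its weight comes back as a decrement message."""
        self.total_weight -= weight
        if self.total_weight == self.partial_weight:
            print("no remote references left: object can be collected")

class Reference:
    def __init__(self, obj, weight):
        self.obj, self.weight = obj, weight

    def copy(self):
        """Proxy-to-proxy copy: split the weight locally, no message to the object."""
        if self.weight == 1:
            raise RuntimeError("weight exhausted: an indirection object is needed")
        half = self.weight // 2
        self.weight -= half
        return Reference(self.obj, half)

    def delete(self):
        self.obj.delete_weight(self.weight)

obj = WeightedObject()
r1 = obj.create_reference()   # r1 gets weight 64, the object keeps 64
r2 = r1.copy()                # r1: 32, r2: 32, no +1 message to the object
r1.delete()                   # total weight: 128 - 32 = 96
r2.delete()                   # total weight: 64 == object's partial weight -> collectable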
DGC Algorithm
1. Initial marking:
   Mark skeletons that are accessible from outside the group "hard"; mark the rest "soft". Mark all proxies "none".
2. Intra-process mark propagation:
   If a proxy is reachable from a local skeleton that is marked "hard", or from the outside, mark it "hard".
   If a proxy is reachable from a local skeleton that is marked "soft", mark the proxy "soft" iff it has not been marked "hard" before.
3. Inter-process mark propagation:
   Any skeleton that is still marked "soft" is marked "hard" if it is reachable from a proxy that is marked "hard".
4. Stabilization:
   Repeat steps 2 and 3 until no more marks can be propagated.
5. Garbage collection:
   Remove the "unreferenced objects": proxies marked "none" or "soft", and skeletons marked "soft" together with their corresponding objects.
DGC: Distributed Garbage Collection
Legend (for the DGC example figure):
– Process (on a machine)
– Group of processes (on different machines)
– Proxy (initially unmarked)
– Skeleton (initially unmarked)
– Reference from another group (or from root entities)
– Reference from a proxy to a remote skeleton
– Reference from a skeleton to a local proxy
– Proxy marked "hard", "soft" or "none"
– Skeleton marked "hard" or "soft"
Figure: the DGC algorithm applied to an example group of processes, step by step (step 2, step 3, repetitions of steps 2 and 3 until no marks can be propagated, i.e. step 4 is reached; finally, in step 5, the unreferenced objects are removed).
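A minimal Python sketch of the five DGC steps above, run over a tiny hand-built object graph. The data structures (explicit proxy/skeleton objects with reference lists) are assumptions for illustration.

class GCProxy:
    def __init__(self):
        self.mark = "none"
        self.remote_skeleton = None       # reference from this proxy to a remote skeleton

class GCSkeleton:
    def __init__(self, external=False):
        self.external = external          # referenced from outside the group / root entities?
        self.mark = None
        self.local_proxies = []           # references from this skeleton to local proxies

def dgc(skeletons, proxies):
    # Step 1: initial marking.
    for s in skeletons:
        s.mark = "hard" if s.external else "soft"
    for p in proxies:
        p.mark = "none"
    changed = True
    while changed:                        # Step 4: stabilize by repeating steps 2 and 3.
        changed = False
        # Step 2: intra-process propagation (skeleton -> local proxies).
        for s in skeletons:
            for p in s.local_proxies:
                if s.mark == "hard" and p.mark != "hard":
                    p.mark, changed = "hard", True
                elif s.mark == "soft" and p.mark == "none":
                    p.mark, changed = "soft", True
        # Step 3: inter-process propagation (hard proxy -> remote skeleton).
        for p in proxies:
            if p.mark == "hard" and p.remote_skeleton and p.remote_skeleton.mark == "soft":
                p.remote_skeleton.mark, changed = "hard", True
    # Step 5: whatever is still marked "none" or "soft" is unreferenced.
    return ([p for p in proxies if p.mark in ("none", "soft")]
            + [s for s in skeletons if s.mark == "soft"])

# Tiny example: s1 is reachable from outside and refers (via p1) to s2 in another
# process; s3 and p2 are unreachable and therefore garbage.
s1, s2, s3 = GCSkeleton(external=True), GCSkeleton(), GCSkeleton()
p1, p2 = GCProxy(), GCProxy()
s1.local_proxies, p1.remote_skeleton = [p1], s2
s3.local_proxies = [p2]
print(len(dgc([s1, s2, s3], [p1, p2])))   # -> 2 (p2 and s3 are collected)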