Addressing: Flat Identifiers

advertisement
Flat Identifiers
Jennifer Rexford
Advanced Computer Networks
http://www.cs.princeton.edu/courses/archive/fall08/cos561/
Tuesdays/Thursdays 1:30pm-2:50pm
Outline
• Distributed Hash Tables (DHTs)
– Mapping key to value
• Flat names
– Semantic Free Referencing
– DHT as replacement for DNS
• Flat addresses
– Routing On Flat Labels
– DHT as an aid in routing
Distributed Hash Tables
http://pdos.csail.mit.edu/chord/
Hash Table
• Name-value pairs (or key-value pairs)
– E.g,. “Jen” and jrex@cs.princeton.edu
– E.g., “www.cnn.com/foo.html” and the Web page
– E.g., “BritneyHitMe.mp3” and “12.78.183.2”
• Hash table
– Data structure that associates keys with values
lookup(key)
key
value
value
Distributed Hash Table
• Hash table spread over many nodes
– Distributed over a wide area
• Main design goals
– Decentralization: no central coordinator
– Scalability: efficient even with large # of nodes
– Fault tolerance: tolerate nodes joining/leaving
• Two key design decisions
– How do we map names on to nodes?
– How do we route a request to that node?
Hash Functions
• Hashing
– Transform the key into a number
– And use the number to index an array
• Example hash function
– Hash(x) = x mod 101, mapping to 0, 1, …, 100
• Challenges
– What if there are more than 101 nodes? Fewer?
– Which nodes correspond to each hash value?
– What if nodes come and go over time?
Consistent Hashing
• Large, sparse identifier space (e.g., 128 bits)
– Hash a set of keys x uniformly to large id space
– Hash nodes to the id space as well
2128-1 0 1
Id space
represented
as a ring.
Hash(name)  object_id
Hash(IP_address)  node_id
Where to Store (Key, Value) Pair?
• Mapping keys in a load-balanced way
– Store the key at one or more nodes
– Nodes with identifiers “close” to the key
– Where distance is measured in the id space
• Advantages
– Even distribution
– Few changes as
nodes come and go…
Hash(name)  object_id
Hash(IP_address)  node_id
Nodes Coming and Going
• Small changes when nodes come and go
– Only affects mapping of keys mapped to the
node that comes or goes
Hash(name)  object_id
Hash(IP_address)  node_id
Joins and Leaves of Nodes
• Maintain a circularly linked list around the ring
– Every node has a predecessor and successor
pred
node
succ
Joins and Leaves of Nodes
• When an existing node leaves
– Node copies its <key, value> pairs to its
predecessor
– Predecessor points to node’s successor in the ring
• When a node joins
– Node does a lookup on its own id
– And learns the node responsible for that id
– This node becomes the new node’s successor
– And the node can learn that node’s predecessor
(which will become the new node’s predecessor)
How to Find the Nearest Node?
• Need to find the closest node
– To determine who should store (key, value) pair
– To direct a future lookup(key) query to the node
• Strawman solution: walk through linked list
– Circular linked list of nodes in the ring
– O(n) lookup time when n nodes in the ring
• Alternative solution:
– Jump further around ring
– “Finger” table of additional overlay links
Links in the Overlay Topology
• Trade-off # of hops vs. # of neighbors
– E.g., log(n) for both, where n is number of nodes
– E.g., overlay links 1/2, 1/4 1/8, … around the ring
– Each hop traverses at least half of the remaining
distance
1/2
1/4
1/8
Semantic-Free Referencing
(DHT as a DNS Replacement)
http://nms.lcs.mit.edu/projects/sfr/
Motivation for Flat Identifiers
Current
<A HREF=
http://isp.com/dog.jpg
>my friend’s dog</A>
Proposed
<A HREF=
http://f0120123112/
>my friend’s dog</A>
• Stable references
– Shouldn’t have to change when object moves
• Object replication
– Store object at many different locations
• Avoid fighting over names
– Avoid cyber squatting, typo squatting, …
Separate References and User-level Handles
User Handles
(AOL Keywords,
New Services, etc.)
Humanunfriendly
References
Object
Location
• Let people fight over handles
– Do not fight over references
– Allow multiple handle-to-reference services
• Flat identifiers
– Do not embed object or location semantics
– Are intentionally human-unfriendly
Semantic-Free Referencing
<A HREF=
http://f012c1d/
>Spot</A>
o-record
(10.1.2.3, 80,
/pics/dog.gif)
Managed
DHT-based
Infrastructure
orec
API
• orec = get(tag);
• put(tag, orec);
Anyone can put() or get()
10.1.2.3
/pics/dog.gif
Web Server
Resilient Linking
• Tag: abstracts object reachability information
• Object granularity: files, directories, hosts, …
10.1.2.3
<A HREF=
http://f012012/pub.pdf
>here is a paper</A>
/docs/
(10.1.2.3,80,
/docs/)(20.2.4.6,80,
SFR
/~user/pubs/)
20.2.4.6
/~user/pubs/
Flexible Object Replication
o-record
0xf012012
SFR
(IP1, port1, path1),
(IP2, port2, path2),
(IP3, port3, path3),
. . .
• Grass-roots replication
– People replicate each other’s content
– Does not require control over Web servers
(Doesn’t address massive replication)
Reference Management
• Requirements
– No collisions, even under network partition
– References must be human-unfriendly
– Only authorized updates to o-records
• Approach: randomness and self-certification
– tag = hash(pubkey, salt)
– o-record has pubkey, salt, signature
– Anyone can check if tag and o-record match
Reducing Latency
• Look-ups must be fast
• Solution: extensive caching
– Clients and DHT nodes cache o-records
– DHT nodes cache each other’s locations
Routing On Flat Labels
(DHT to Help in Routing)
How Flat Can You Get?
• Flat names
– DHT as a replacement for DNS
• Stable references, simple replication, avoid fighting
– Still route based on hierarchical addresses
• For scalability of the global routing system
• Flat addresses
– Avoid translating name to an address
– Route directly on flat labels
– Questions
• Is it useful?
• Can it scale?
Topology-Based Addressing
• Disadvantages: complicates
– Access control
– Topology changes
– Multi-homing
– Mobility
• Advantage
– Scalability
– Scalability
– Scalability
–…
Area 1
Area 2
1.2
1.1
K
B
2.1
2.2
Q
V
Area 4
Area 3
4.1
4.2
F
3.2
A
3.3
X
S
3.1
J
Routing on Abstract Graph: Know Your Neighbors
Virtual
topology
A
AX F
J
K
F
Q
S
V
X
K
V
J
S
F
J
K
K
Q
1. Write down
sorted list of IDs
2. Build paths
between
neighbors in list
V
S
X
J
Q
A
F
Network
topology
Routing on Abstract Graph: Forwarding Packets
Virtual
topology
A
X
F
K
V
J
S
K
Q
Q
F
J
Send(K,F)
K
V
S
X
J
Q
A
F
Network
topology
Routing on Abstract Graph: Stretch Problem
Virtual
topology
Resulting path length:
10 hops
A
X
F
V
V
J
S
X
J
K
K
Q
J
Q
V
S
X
Send(J,V)
F
A
A
Shortest path length:
3 hops
F
Network
topology
Routing on Abstract Graph: Short-cutting
Virtual
topology
Resulting path length:
4 hops
A
X
F
V
V
J
S
X
J
F
A
K
K
Q
J
V
S
X
Send(J,V)
Q
A
Shortest path length:
3 hops
F
Network
topology
Identifiers
• Identity tied to public/private key pair
– Everyone can know the public key
– Only authorized parties know the private key
• Self-certifying identifier: hash of public key
• Host associates with a hosting router
– Proves it knows private key, to prevent spoofing
– Router joins the ring on the host’s behalf
• Anycast
– Multiple nodes have the same identifier
Basic Mechanisms behind ROFL
• Goal #1: Scale to Internet topologies
– Mechanism: DHT-style routing, maintain sourceroutes to successors (fingers)
– Provides: Scalable network routing without
aggregation
• Goal #2: Support for BGP policies
– Mechanism: Intelligently choose successors
(fingers) to conform to ISP relationships
– Provides: Support for policies, operational model
of BGP
How ROFL Works
4. intermediate
routers may cache
pointers
2. hosting routers
participate in ROFL on
behalf of hosts
0x3F6C0
Pointer cache:
0x3B57E
ISP
0x3BAC8
ISP
0x3B57E
Pointer list:
Successor
0x3F6C0 list:
0x3F6C0
0x3BAC8
0x3BAC8
0x3F6C0
5. external pointers
provide reachability
across domains
0x3B57E
(joining
host)
3. hosting routers maintain
pointers with source-routes
to attached hosts’
successors/fingers
0xFA291
1. hosts are assigned
topology-independent
“flat” identifiers
Internet Policies Today
hierarchy #1
hierarchy #2
hierarchy #3
provider routes must not
be exported to peers
prefer customer
over peer routes
peer link
Source
Destination
• Economic relationships: peer, provider/customer
• Isolation: routing contained within hierarchy
Isolation in ROFL
External
Successor
Joining
Source
host
Internal
Successor
External
Destination
Successor
 Traffic between two hosts traverses no higher than their
lowest common provider in the AS hierarchy
Discussion
• How flat should the world be?
– Flat names vs. flat addresses?
• What should be given a name?
– Objects?
– Hosts?
– Networks?
• What separation to have?
– Human-readable names
– Machine-readable references
– Network location
Download