A Generic Resolution Architecture to Support Arbitrary Naming Schemes: A Focus on Mobility
Eunsang Cho, Jaeyoung Choi, Ted “Taekyoung” Kwon, Yanghee Choi
MMLAB, Seoul National University, Korea
AsiaFI 2011 Summer School (Aug. 9th, 2011)

Outline
• Background
• Idea
• Resolution
• Hierarchical Indirection Table (HIT)
• Numerical Results
• Conclusion

Motivation (1/2)
• Why is a novel resolution architecture required?
  – New services may need to name different kinds of entities.
    • Physical: an RFID tag, named by an electronic product code (EPC)
    • Digital: a file, named by a digital object identifier (DOI)
  – An entity's location can change.
  – We expect that future Internet applications and services will require diverse naming schemes for physical and digital entities that may move dynamically.

Motivation (2/2)
• Why is a generic resolution architecture required?
  – Since we cannot know what kinds of naming schemes will be used in the future, we propose a generic resolution infrastructure.
  – Specifically, we focus on how to locate mobile entities.
    • A digital file moving from one computer to another
    • A contact point of a moving user with a portable device
    • Taxis that are being tracked

Prior Works
• Domain name system (DNS)
  – Designed mainly to locate quasi-static, TCP/IP connectivity-configurable hosts
• Prior mobility solutions (e.g. Mobile IP)
  – Most solutions provide mobility transparently to legacy hosts using an agent.
  – The agent relays all the traffic from/to the mobile host, which places a traffic burden on the agent.

Idea (1/2)
• Our standpoint is to make “an entity” aware of its mobility.
  – More precisely, a principal (or a purveyor) of the entity is notified of the locator change, and it informs a name-to-locator resolution system of the change.
• The central notion of our architecture is to split the name of an entity from the locator of the host that holds the entity.
  – Applications and their developers do not need to know about mobility at all.
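
The name/locator split on the Idea (1/2) slide can be illustrated with a toy lookup table. This is only a sketch under stated assumptions: the class `ResolutionSystem`, its methods, and the example name and host strings are hypothetical and are not part of the proposed architecture, which replaces this single table with the distributed structure described in the following slides.

```python
class ResolutionSystem:
    """Toy name-to-locator table (hypothetical; a stand-in for the real resolution system)."""

    def __init__(self):
        self._table = {}

    def update(self, name, locator):
        # Called by the entity's principal whenever the locator changes.
        self._table[name] = locator

    def resolve(self, name):
        # Applications look up by name only; they never track movement themselves.
        return self._table[name]


resolver = ResolutionSystem()
resolver.update("doi:10.1000/example", "host-a.example")  # initial placement (made-up values)
resolver.update("doi:10.1000/example", "host-b.example")  # the file moved to another host
current = resolver.resolve("doi:10.1000/example")
print(current)
```

The application-side call (`resolve`) is unchanged before and after the move, which is the sense in which applications do not need to know about mobility.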

Idea (2/2)
• How can the current locator of a mobile entity be resolved from its name?
  – We propose a novel structure, the hierarchical indirection table (HIT), to keep track of the current locators of mobile entities.
  – We assume that relatively static entities are handled by a caching mechanism similar to that of the DNS, and focus instead on mobile entities.

Resolution (1/2)
• Route-by-name
  – A router must hold a routing entry to forward packets by the entity's name.
    • If the names of entities are not aggregatable, routers need as many routing entries as there are entities.
    • Since we aim to support arbitrary naming schemes in the future Internet, names may not be aggregatable, and hence the route-by-name option has a severe scalability problem.

Resolution (2/2)
• Lookup-by-name
  – Names and their locators are split, and routers only need to forward packets by locators.
    • Typically, locators are aggregatable to a certain degree.
    • This approach requires an additional resolution infrastructure.
    • If the name-to-locator mapping is almost static (as in the DNS), caching significantly reduces the resolution overhead.
    • However, if the locators change due to the mobility of entities, resolution becomes a challenging issue, which is the main topic of this research.

Mapping Structures
• DNS
  – Flexible tree structure; short round-trips
  – Load balancing issue
• Distributed hash table (DHT)
  – Chord: an overlay ring structure with equal responsibility among nodes
  – Little or no load balancing issue
• ViAggre
  – Designed to reduce the routing entries in the default-free zone (DFZ)
  – The entire IP address space is divided among a set of routers, each of which is responsible for one partition of the address space.

Proposed Resolution
• Considerations & assumptions
  – Diverse naming schemes
    • Hierarchical, flat, or attribute-value pairs
    • We use a hash function that maps a name into a fixed-length value.
      We call the hashed value an identifier.
  – Lookup-by-name approach
    • Routers can forward packets by their locators without scalability issues.
    • Names are used only at the application layer, and are converted to a hashed value at the transport layer.

Identifier-to-Locator (I2L) Steps
• Two types of I2Ls: AS-level (AI2L) and router-level (RI2L)
[Figure: I2L resolution steps across AS a and AS b. Step 1: Mapping Query; Step 2: Using AS-level I2L (AI2L); Step 3: Router-level I2L (RI2L) server; Step 4: Locator; Step 5: Packet. Labels: Entity A, Entity B, User/Host, Mobile Entity, Router, Border Router, Point of Attachment.]

Hierarchical Indirection Table (HIT)
• How can we construct the AI2L and RI2L structures?
  – As the RI2L structure is expected to have various structuring options, we focus on how to construct the AI2L from globally distributed servers.
• The design objectives of the AI2L structure
  – To reduce the response time of mapping queries while minimizing additional routing memory requirements
  – To expedite the query response, routers forward mapping query packets just like ordinary packets, without any overlay or additional processing overhead.

Router in HIT
• Two router functions
  – Resolving: relaying mapping query packets to next hop routers by looking at identifiers
  – Forwarding: relaying ordinary packets to next hop routers by looking at locators
• Thus, each router keeps two kinds of tables in memory
  – Forwarding table
  – Resolving table

Topology-aware Routing & Indirection
• Routing scalability in the current Internet has already become a serious issue.
  – Due to site multi-homing and traffic engineering
• Since our design has the extra memory requirement of the resolving table, we seek to reduce the resolving table size.
• To this end, we combine topology-aware routing and indirection among routers as follows.

HIT (1/3)
• There are T trees, and each root is responsible for an identifier space of size $2^L / T$, where L is the bit length of the hash value.
  – The root nodes have no actual mapping entries.
  – They merely “resolve” incoming mapping query packets to the corresponding next hop nodes in their trees.
  – That is, servers are end systems that hold the mapping entries between identifiers and their locators, while nodes in a virtual tree are routers that relay mapping query packets to the corresponding servers.

HIT (2/3)
• Leaf nodes in a tree correspond to the routers to which servers are connected.
• Suppose L and T are 40 and 16 for the sake of exposition.
  – The 1st root node: 0x0000000000 to 0x0FFFFFFFFF
  – The 2nd root node: 0x1000000000 to 0x1FFFFFFFFF
  – The last root node: 0xF000000000 to 0xFFFFFFFFFF
  – These root nodes advertise the prefixes of their assigned spaces throughout the Internet.
  – In this example, as in CIDR, the first, second, and last root nodes announce the prefixes (i.e. the first four bits) 0b0000, 0b0001, and 0b1111, respectively.
• Thus, each router needs T entries in the resolving table to relay a mapping query packet to its corresponding root node.

HIT (3/3)
• The number of child nodes of each node in a tree is W.
  – As there are W first level nodes for each root node, each first level node is in charge of an identifier space of size $2^L / (T \cdot W)$.
• Suppose W is 4 in the above tree example.
  – Then the four child (first level) nodes of the first root node advertise 0b000000, 0b000001, 0b000010, and 0b000011 for K hops, respectively.
  – Likewise, the four child nodes of the second root node advertise 0b000100, 0b000101, 0b000110, and 0b000111 for K hops, respectively.

Resolving Table Size & Lookup Delay (1/2)
• We impose a constraint that a parent node and its child node are located within K (router-level) hops of each other.
  – The design rationale behind this spatial constraint is to reduce the resolving table size as well as the lookup delay.
  – Thus, each first level node is within K hops of the root.
  – Likewise, each second level node is reached within K hops from its parent (first level) node.
• The depth D of each tree is the minimum value that satisfies $T \cdot W^D \ge S$.
• The expected hop count between routers in a network of N routers is given by $\log N$.

Resolving Table Size & Lookup Delay (2/2)
• Thus, ignoring the delay between the soliciting host/server and its access router, the average query response time in terms of hop count is the sum of the following:
  – The leg between the soliciting host and the corresponding root ($\log N$)
  – The worst-case hop count from the root to the server that has the requested mapping entry ($D \cdot K$)
  – The returning leg from the server to the soliciting node ($\log N$)
[Figure: Soliciting Host to Corresponding Root ($\log N$); Corresponding Root to Server with Mapping Entry ($D \cdot K$); Server with Mapping Entry back to Soliciting Host ($\log N$).]
• Even though we assume K is decided a priori, the nodes may flexibly set K depending on the topological distribution of nodes.
• If we assume that individual trees do not overlap one another, each router needs to keep up to $T + D \cdot W$ entries in its resolving table.

Load of Mapping Query
• The major concern of the HIT is the load of mapping query packets on roots.
  – In the worst case, each root handles approximately 1/T of all mapping query packets.
  – However, the actual traffic load can be substantially mitigated by placing its child (and descendant) nodes as “the first line of defense.”
  – The figure illustrates an extreme case in which the root node receives no incoming mapping query packets, since the packets encounter the root's child nodes (i.e. first level nodes) and are relayed to their descendant nodes.
  – Current high-end routers can forward about a billion packets per second.
  – We believe the combination of a sufficiently high T, placing descendant nodes closer to backbone links, and the ever-increasing line speed of routers can solve the workload scalability issue of the HIT.
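
The depth, table-size, and hop-count expressions from the two Resolving Table Size & Lookup Delay slides can be evaluated numerically. The following is a minimal Python sketch under stated assumptions: S is taken to be the number of mapping servers, the logarithm is taken base 2 (the slides do not state a base), N = 1024 is an arbitrary illustrative value, and the function names are hypothetical. The values T = 256, W = 8, K = 4, and S = 10^5 are the ones used in the deck's numerical comparison.

```python
import math


def tree_depth(S, T, W):
    """Smallest D such that T * W**D >= S."""
    D = 0
    while T * W ** D < S:
        D += 1
    return D


def resolving_table_size(T, W, D):
    """Upper bound on resolving-table entries per router: T + D * W."""
    return T + D * W


def lookup_hops(N, D, K):
    """Host-to-root leg + worst-case root-to-server leg + return leg.

    Assumption: log N is base 2; the slides leave the base unspecified.
    """
    return math.log2(N) + D * K + math.log2(N)


T, W, K, S = 256, 8, 4, 10 ** 5   # values from the deck's numerical comparison
N = 1024                          # illustrative assumption, not from the deck

D = tree_depth(S, T, W)
print("depth D:", D)
print("resolving table entries:", resolving_table_size(T, W, D))
print("worst-case lookup hops:", lookup_hops(N, D, K))
```

With these parameters the depth stays small because T already covers a large share of S, which is why the D * K term adds only a bounded number of hops on top of the two log N legs.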

Numerical Results
• We asymptotically compare the HIT with other mapping structures in terms of:
  – number of resolution entries
  – lookup cost
  – workload on a mapping server
• For the HIT, we set K, W, and T to 4, 8, and 256, respectively.
  – As for T, we consider that the DNS had more than two hundred machines serving as root servers in 2010.
  – However, we believe this number should be raised as the number of mobile entities increases.

Number of Resolution Entries
• The number of resolution entries represents how many routing entries need to be maintained in a resolution node to provide AI2L mapping services.
[Figure: number of resolution entries (y-axis: log scale). Annotations: "Averaged # of child nodes of .com and .net in 2010"; "Linearly increased" ($S$); "Logarithmically increased", Chord: $\log S$; HIT: $T + W \cdot D$.]

Lookup Cost
• The lookup cost in terms of hop count as the number of routers N increases, where S is set to $10^5$.
[Figure: lookup cost (x-axis: log scale). Annotations: "Mapping query goes back and forth $\log S$ times" ($\log S \cdot \log N + \log N$); "e.g. www.snu.ac.kr" ($2 \log N \cdot 5$); "Leaf node is not far from root" ($D \cdot K + 2 \log N$); "Shortest path" ($2 \log N$).]

Worst-case Workload
• The worst-case workload on a node for each generated mapping query as S increases.
[Figure: worst-case workload (y-axis: log scale). Annotations: "High traffic load on root" ($1/T$); "Flat structures", Chord: $(\log S)/S$, ViAggre: $1/S$.]

Conclusion
• Diverse services can be well accommodated by making the future Internet capable of supporting their naming schemes naturally.
• To this end, we suggest a generic resolution architecture in which a name at the application layer is converted into its identifier at the transport/network layer by a hash function.
  – Since applications may seek to name mobile entities (either physical or digital), we propose a hierarchical indirection table (HIT) to provide the mapping from the identifier of an entity to the locator of the entity.
  – The central idea of the HIT is to construct virtual trees among routers to relay mapping query packets toward the corresponding mapping servers.
  – By imposing a spatial constraint that a parent node and its child node must be within a preconfigured hop count of each other, the HIT can minimize the number of resolution entries while reducing the lookup delay.
  – We believe the traffic load on the root nodes of the trees can be mitigated by the combination of constructing a sufficient number of trees, placing descendant nodes closer to backbone links than ancestor nodes, and the ever-increasing line speed of routers.

THANK YOU

Discussions (1/2)
• Some critical concerns about the HIT can be raised, e.g.
  – Who will deploy and maintain so many servers?
  – Once servers are deployed, how can we satisfy the spatial constraint, i.e. that parent and child nodes should be within K hops of each other?
• We argue that there are possible solutions.
  – For instance, the top Internet naming agency (i.e. ICANN) or the regional Internet registries (RIRs) may deploy and maintain mapping servers in numerous Internet service providers (ISPs).
  – ISPs can provide Internet connectivity for the servers in exchange for IP addresses and other number resources assigned by ICANN and the RIRs.
  – The maintenance fee can be levied on the principals that register the identifiers of the names.
  – We believe it is also feasible to allow private companies (e.g. ISPs) to operate mapping servers with the same business model.
  – If ISPs deploy servers, they can easily satisfy the spatial constraint.
  – Given the proliferation of content delivery networks (CDNs), leveraging the widely distributed infrastructures of CDN companies can be an attractive option.

Discussions (2/2)
• Our resolution architecture takes a clean-slate approach, since it requires a novel router architecture and the advertisement of (identifier space) prefixes of resolution nodes.
• However, we can think of an overlay version of the HIT, like a DHT, that does not need to modify legacy routers.
  – In the HIT overlay, a soliciting host (or its resolver) needs to know the IP addresses of the T root nodes.
  – We believe that this is not a huge burden, since the root nodes will be highly stable and available.
  – A mapping query packet is first sent to its corresponding root, and then relayed from the root to the corresponding mapping server along the virtual tree path by tunneling.
  – That is, for each tree link, the destination locator is set to the IP address of the next resolution node.
  – Therefore, we can deploy the HIT without any changes to routers.
  – However, the tunneling overhead at the root nodes will be a critical issue, since tunneling a packet takes longer than merely relaying a packet to the next hop router as in the original HIT.
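
The overlay path of a mapping query (first to the root that owns the identifier, then down the virtual tree) can be sketched as a sequence of identifier prefixes. This is a minimal Python sketch under stated assumptions: the deck does not name a hash function, so truncated SHA-256 is used purely for illustration; T and W are assumed to be powers of two; the function names are hypothetical; and the values L = 40, T = 16, W = 4 are the example values from the HIT slides. In an overlay deployment each prefix would map to the IP address of the resolution node that owns it; that table is omitted here.

```python
import hashlib

L_BITS = 40  # identifier length in bits (example value from the HIT slides)
T = 16       # number of trees; assumed to be a power of two
W = 4        # children per node; assumed to be a power of two


def identifier(name):
    """Hash a name to an L_BITS-bit identifier.

    Assumption: SHA-256 truncated to its top L_BITS bits; the deck does not
    specify which hash function is used.
    """
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (256 - L_BITS)


def tree_path(ident, depth):
    """Prefixes owned by the root and by each descendant down to `depth`."""
    root_bits = T.bit_length() - 1    # log2(T) bits select the root
    child_bits = W.bit_length() - 1   # log2(W) more bits per tree level
    path = []
    for level in range(depth + 1):
        nbits = root_bits + level * child_bits
        prefix = ident >> (L_BITS - nbits)
        path.append(format(prefix, "0{}b".format(nbits)))
    return path


# An identifier in the second root's range (0x1000000000 to 0x1FFFFFFFFF).
print(tree_path(0x1400000000, 1))
# A hashed (made-up) name, resolved two levels below its root.
print(tree_path(identifier("taxi-42"), 2))
```

Each step in the returned list corresponds to one tunnel hop in the overlay, which is where the per-hop tunneling overhead noted above would be incurred.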