A UNIFIED APPROACH TO LOOP-FREE ROUTING USING DISTANCE VECTORS OR LINK STATES*

J.J. Garcia-Luna-Aceves
Network Information Systems Center
SRI International
333 Ravenswood Avenue
Menlo Park, California 94025
garcia@sri.com

Abstract

We present a unified approach for the dynamic computation of shortest paths in a computer network using either distance vectors or link states. We describe a distributed algorithm that provides loop-free paths at every instant and extends or improves algorithms introduced previously by Chandy and Misra, Jaffe and Moss, Merlin and Segall, and the author. Our approach treats the problem of distributed shortest-path routing as one of diffusing computations, which was first proposed by Dijkstra and Scholten. We verify the loop-freedom of the new algorithm, and also demonstrate that it converges to the correct routing entries a finite time after an arbitrary sequence of topological changes. We analyze the complexity of the new algorithm when distance vectors and link states are used, and show that using distance vectors is better insofar as routing overhead is concerned.

1 Introduction

All shortest-path routing algorithms used today in computer networks can be classified as distance-vector or link-state algorithms. In a distance-vector algorithm, a node knows the length of the shortest path from each neighbor node to every network destination, and uses this information to compute the shortest path and next node in the path to each destination. A node sends update messages to its neighbors, who in turn process the messages and send messages of their own, if needed. Each update message contains a vector of one or more entries, each of which specifies the distance to a given destination, plus some other information (e.g., the successor in the path [CEGR-75] or the entire path [SHIN-87]). In contrast, in a link-state algorithm (also called topology-broadcast algorithm [JAFF-86]) a node must know the entire network topology, or at least receive that information, to compute the shortest path to each network destination. Each node broadcasts update messages, each containing the state of each of the node's adjacent links, to every other node in the network.

Over the past few years, there has been a great deal of debate as to which type of routing algorithm is better suited for large networks and networks with dynamic topologies and varying traffic. As Schwartz points out [SCHW-86], deciding between one or the other is highly dependent on the specific network in which the algorithm will operate.

The primary concern with the distance-vector algorithms that have been used in the past is the formation of routing-table loops and counting to infinity [JAFF-82] [GARC-89]. A routing-table loop is a path specified in the nodes' routing tables at a particular point in time that visits the same node more than once before reaching the intended destination. A node counts to infinity when it increments its distance to a destination until it reaches a predefined maximum distance value. There have been a number of attempts to solve the counting-to-infinity and routing-table looping problems of distance-vector algorithms by increasing the amount of information exchanged among nodes [CEGR-75] [GARC-87] [NAYL-75] [SHIN-87] [STER-80]. However, as discussed by the author elsewhere [GARC-89], none of these approaches solves these problems satisfactorily.

On the other hand, link-state algorithms are free of counting-to-infinity. However, they need to maintain an up-to-date version of the entire network topology at every node, which may constitute excessive storage and communication overhead in a highly dynamic network.¹ It is also interesting to note that the link-state algorithms proposed or implemented to date do not eliminate the creation of temporary routing-table loops, which can be created whenever two or more nodes recompute shortest paths using inconsistent views of the network topology.

Whether link states or distance vectors are used, it is clear that the existence of routing-table loops, even when these are temporary, is a detriment to the overall network performance. In this paper, we present a unified approach to the solution of the counting-to-infinity and routing-table looping problems in distributed routing algorithms, and describe mechanisms for routing-table loop freedom that are independent of whether distance vectors or link states are used, the metric used to measure internodal distances, or the cost function used to compute shortest paths. We want to emphasize that the loop freedom we are interested in achieving is with respect to the entries specified in the nodal routing tables. Throughout this paper, we refer to routing-table loops simply as loops, and refer to a routing algorithm that is free of routing-table loops as a loop-free routing algorithm.

We treat the distributed shortest-path routing problem as one of diffusing computations, a concept that was first proposed by Dijkstra and Scholten [DIJK-80]. Our results generalize or improve on previous results on loop-free routing algorithms by the author [GARC-88a] and others [CHAN-82] [MERL-79] [JAFF-82]. These results can help network designers decide between using link states or distance vectors without having to worry about the existence of long-lasting or temporary loops. Furthermore, our results on loop freedom can serve as the basis for the development of new, effective internet routing and reachability algorithms.

In Section 2, we describe distributed loop-free routing as a problem of diffusing computations and summarize existing results. In Section 3, we present the new algorithm for the coordinated propagation of information among nodes, and describe how distance vectors and link states can be used. In Section 4, we present examples of the operation of the new algorithm. In Section 5, we demonstrate that our algorithm provides loop-free paths at every instant and that it converges to correct routing-table values in a finite time after an arbitrary sequence of topological changes. In Section 6, we compare the use of distance vectors versus link states, and demonstrate that distance vectors are better, insofar as overall routing overhead is concerned. In Section 7, we summarize performance improvements on the basic algorithm of Section 3. We conclude in Section 8 with a discussion of further design and implementation issues raised by our new approach.

* This work was supported by SRI International IR&D funds, by the U.S. Army Research Office under Contract DAAL03-88-K-0054, and by Rome Air Development Center and the Defense Advanced Research Projects Agency under Contract F30602-69-C-0015.

¹ The interested reader may want to consult Seeger and Khanna [SEEG-86] for a detailed description of the routing overhead problem associated with a particular link-state algorithm in the defense data network (DDN), which is a large network.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. © 1989 ACM 089791-332-9/89/0009/0212 $1.50

2 Loop-Free Routing as a Problem of Diffusing Computations: Previous Work

The fact that temporary loops do occur in existing link-state algorithms shows that looping in distributed routing algorithms is independent of the amount of topology information maintained in the nodal tables and results from allowing nodes to carry out updates in their local tables without sufficient coordination among them. A number of algorithms have been proposed in the past to accomplish loop-free shortest-path routing through internodal coordination.
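To make the notion of a routing-table loop from the Introduction concrete, the following minimal sketch walks the successor entries that a set of routing tables holds for one destination and reports whether the walk reaches the destination, stops at a null successor, or revisits a node. The function name `follow_successors` and the dictionary representation of the routing tables are assumptions made for illustration only; they are not part of the paper.

```python
from typing import Dict, Hashable, List, Optional, Tuple

Node = Hashable


def follow_successors(
    successor: Dict[Node, Optional[Node]], start: Node, dest: Node
) -> Tuple[List[Node], str]:
    """Walk the successor entries toward `dest`, starting at `start`.

    `successor[x]` is the next node that x's routing table lists for
    `dest` (None when x considers `dest` unreachable). This dictionary
    is a hypothetical stand-in for the nodal routing tables.

    Returns the nodes visited and one of "reached", "unreachable", "loop".
    """
    path = [start]
    visited = {start}
    node = start
    while node != dest:
        nxt = successor.get(node)
        if nxt is None:
            return path, "unreachable"
        path.append(nxt)
        if nxt in visited:
            # The same node appears twice before the destination: a loop.
            return path, "loop"
        visited.add(nxt)
        node = nxt
    return path, "reached"


if __name__ == "__main__":
    consistent = {"a": "b", "b": "c", "c": "e"}
    print(follow_successors(consistent, "a", "e"))
    inconsistent = {"a": "b", "b": "a"}
    print(follow_successors(inconsistent, "a", "e"))
```

A loop-free routing algorithm in the sense used here is one for which such a walk never returns "loop", at any instant and for any destination.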
The algorithm by Merlin and Segall [MERL-79] is based on the propagation of information with feedback along a tree rooted at a network destination and spanning all network nodes. A major limitation of this algorithm is that the destination node (or root of the tree), or the node that first detects a change in distance to that destination, must start each generation of the propagation of updates, where each generation reaches an additional hop of the tree. This creates substantial communication overhead [SCHW-86].

An alternative approach to the propagation of information with feedback used by Merlin and Segall is to view distributed loop-free routing as a problem of diffusing computations. Dijkstra and Scholten [DIJK-80] introduced the concept of diffusing computations to check the termination of a distributed computation, such that the process starting a computation is informed when it is completed and such that there are no false terminations. The diffusing computation grows by sending queries and shrinks by receiving replies among the nodes participating in the diffusing computation. The use of diffusing computations assumes the existence of a link-level protocol that assures that a node is always aware of who its neighbor nodes are, all messages exchanged between two nodes are received correctly and in the proper sequence within a finite time, and each node processes input events (messages, notifications regarding the failure or acquisition of neighbors, or notifications regarding the cost of adjacent links) one at a time, in first-in/first-out (FIFO) order, and within a finite time.

To apply this concept to routing, we first observe that, regardless of whether distance vectors or link states are used, each nodal routing table must contain, as a minimum, the length and successor (i.e., next node) of the shortest path to each destination. We model a network as an undirected connected graph in which each link has two lengths (or costs) associated with it, one for each direction, and any link of the network exists in both directions at any one time. We assume that each node (active or inactive) has a unique node identifier. We denote the set of nodes adjacent to node i as $N_i$ and call each node in that set a neighbor of node i. Link costs are considered to vary in time and are assumed to be always positive. The distance between two nodes is measured as the sum of the link costs in the path of least cost (called shortest path) between them. For the moment, we can assume that each node maintains a topology table with sufficient information to compute the values of the node's routing table locally.

Accordingly, for each destination j, all the nodal routing tables in a connected network G define a graph whose nodes are the same nodes of G, and which has a directed edge from node i to node k if and only if node k is the successor of node i toward destination j, denoted $s_j^i$. Obviously, loop freedom is guaranteed at all times in G if this graph is always a directed acyclic graph, which we call the successor directed acyclic graph (SDAG) of G for destination j and denote by $S_j(G)$. In steady state, when all routing tables are correct, $S_j(G)$ must be a tree. The directed path from node i to node k in $S_j(G)$ at time t is denoted by $P_{ik}(t)$. We say that node i is upstream of node k in $S_j(G)$ if the directed chain $P_{ij}$ from node i to node j includes node k. Similarly, we say that node k, in this case, is downstream of node i.

A loop can occur in G only when $S_j(G)$ is modified for some $j \in G$ after a change in the cost or status of a link. Furthermore, multiple such changes can occur. For each link-cost or link-status change that affects $S_j(G)$, we define an SDAG computation, which consists of obtaining the new SDAG formed by the routing-table entries for node j at each node in the network.

Using Dijkstra and Scholten's algorithm (DSA), a single SDAG computation can be diffused to (i.e., propagated among and executed by) nodes along the SDAG using queries that contain the news about the topological change; the diffused SDAG computation then shrinks along the same tree with the reception of replies to those queries, which specify that the nodes sending the replies have modified their routing tables according to the change(s) reported in queries.

Initially, all nodes are passive, that is, not engaged in a diffused SDAG computation. When a node detects a change in the status or cost of an adjacent link, a diffused SDAG computation commences with the node going into active state by sending a query to all its neighbors upstream in the SDAG. After receiving and processing a query from its successor toward the destination, a passive node forwards the query to its neighbors upstream in the SDAG and goes into active state. A node in active state cannot change the successor to the destination listed in its routing table (i.e., it cannot change the SDAG) until it receives all the replies to its query; when this happens, the node goes back to passive state and computes a new successor and distance. Accordingly, all the neighbors of an active node must receive the new distance from that node to the destination and reply with their own new distances before the node can recompute its distance and successor to the destination. This means that a node can never choose a neighbor upstream in the SDAG when it must change its successor to the destination. Because this adaptation of DSA supports only one diffused SDAG computation at any time, it operates correctly only for the case in which a single change occurs in a network that has a stable topology.

Chandy and Misra proposed a distance-vector algorithm based on an adaptation of DSA very similar to the one summarized above [CHAN-82]. We point out that this adaptation works correctly for a single computation only.

Independently of the work by Chandy and Misra or Dijkstra and Scholten, Jaffe and Moss [JAFF-82] showed that no routing loops can occur in a distributed implementation of the Bellman-Ford routing algorithm [FORD-62] after a link addition or a link-cost decrease. Using this result, they proposed a loop-free distance-vector algorithm that supports multiple, concurrent, diffused SDAG computations for the same network destination, and which requires no internodal coordination unless link costs increase or resources fail in the network. In the absence of link-cost increases or resource failures, the Jaffe-Moss algorithm behaves like the Bellman-Ford routing algorithm, that is, no diffused SDAG computation is used. However, when a node detects an increase in the cost of a link, it starts a diffused SDAG computation for each destination for which the node's distance was affected by the change. The new SDAG is computed using an algorithm similar to the adaptation of DSA described above. Multiple diffused SDAG computations are supported concurrently by means of bit vectors, maintained at each node, which specify, for each neighbor and for each destination, how many queries need to be answered by the neighbor node and which one of such queries was originated by the node maintaining the bit vector.

Finally, the author presented a loop-free distance-vector algorithm similar to the Jaffe-Moss algorithm [GARC-88a]. According to this algorithm, when a node needs to update its routing table for a given destination j, after it processes an update message from a neighbor or detects a change in the cost or availability of a link, the node tries to obtain a new feasible successor with the shortest distance to node j. From the standpoint of node i, a feasible successor toward node j is a neighbor node that has reported a distance to the destination that is at most as large as the distance reported by node i's current successor before node i decided to update its routing table. When feasible successors are found, the algorithm behaves as the Bellman-Ford algorithm; otherwise, if a node cannot find a feasible successor with the shortest distance to node j, the node starts a diffused SDAG computation for node j and sends a query to all its neighbors. The node cannot change its successor to node j until it has been told by all its neighbors that they have processed its query and have either obtained alternative feasible successors to node j, or determined that they cannot reach node j. Jaffe and Moss's bit vectors are used to support multiple diffused SDAG computations concurrently. The key advantage of using diffusing computations over the approach used by Merlin and Segall is that the destination node does not have complete control on the propagation of information, which accounts for a substantial reduction of communication and processing overhead.

3 Distributed Update Algorithm (DUAL)

From the previous section, we observe that all previous algorithms for loop-free routing use distance vectors. In this paper, we present the distributed update algorithm (DUAL), which generalizes the algorithms based on diffusing computations outlined in the previous section by separating the mechanism used for diffusing the computation of an SDAG from the contents of the nodal routing tables and topology tables.

The same basic algorithm can be implemented in a number of ways, depending on the type of network and operational conditions in which the algorithm has to be used (e.g., packet-radio networks, land-based networks with telephone lines). In this paper, we describe DUAL at a level of detail that is independent of implementation considerations, but is sufficient to prove that DUAL is loop free and correct. We assume that either of two well-known algorithms is used at each node to compute shortest paths once topology information is gathered and updated using DUAL: Bellman and Ford's distance-vector algorithm [FORD-62] or Dijkstra's link-state algorithm [DIJK-59].

3.1 Messages

There are three types of messages in DUAL: updates, queries, and replies. Each message contains a type tag that specifies any one of the three types of message and a message identifier that uniquely identifies the message. In addition, updates and queries contain an update list of one or more entries, each of whose contents is dependent on whether link states or distance vectors are used. A reply is sent in response to all the update-list entries in a query. In this work, we assume that updates, queries, and replies are exchanged on an event-driven basis.

When distance vectors are used, an update-list entry contains, as a minimum, the identifier of a destination and the current shortest distance to it. In contrast, when link states are used, an update-list entry consists of the identifier of a link and its cost.

3.2 Information at Each Node

Each node in the network maintains a routing table, a link-state table, a query-origin table, a reply-status table, and a topology table.

The routing table of node i is a column vector of $|N|$ row entries, where N is the set of active nodes in the network. At a given time t, the entry in node i's routing table for destination node j consists, as a minimum, of the identifier of node j, the length of the current distance from node i to node j (denoted by $D_j^i(t)$), the successor in the path currently chosen by node i toward node j (denoted by $s_j^i(t)$), and the length of the distance from node $s_j^i$ to node j before node i becomes active (the value written $D_{jF}^i(t')$ in Definition 2, with $F = s_j^i(t')$). The successor is null when node i determines that node j is unreachable.

The link-state table of node i is a vector of $|N_i|$ entries. Each entry corresponds to the cost of a link adjacent to the node maintaining the table. The cost of the link from node i to neighbor node k, as maintained by node i at time t, is denoted by $l_k^i(t)$. The cost of an inactive link is set to infinity (i.e., an arbitrarily large value).

The query-origin table of node i is a vector of $|N|$ entries. At time t, each such entry specifies the identifier of a node j and a query-origin flag, denoted by $o_j^i(t)$.

The reply-status table of node i is an $|N| \times |N_i|$ matrix. Entry (j, k), $k \in N_i$, consists of a reply-status flag, denoted by $r_{jk}^i(t)$, which specifies whether or not node i is waiting at time t for a neighbor's reply to a query.

When distance vectors are used, each entry of node i's topology table at time t consists of the identifier of a destination j and the length of the current distance from neighbor node k to node j, for each of node i's neighbors (i.e., for all $k \in N_i$), denoted by $D_{jk}^i(t)$. In contrast, when link states are used, the topology table consists of a link-cost matrix that contains an entry for each pair of network nodes with the cost or weight of a link; the cost of a link that does not exist is assumed to be infinity (i.e., arbitrarily large). We note that node i can obtain $D_{jk}^i(t)$ for any $k \in N_i$ directly from the link-cost matrix.

3.3 Operation

3.3.1 Definitions

The operation in DUAL is based on the following definitions.

Definition 1 Node i is active with respect to destination node j when it is waiting for replies from its neighbors referring to node j, and is passive with respect to node j otherwise. If node i is active with respect to node j, it cannot modify $s_j^i$ in its routing table; furthermore, node i's distance must be set to $D_j^i = D_{js}^i + l_s^i$, where $s = s_j^i$.

Definition 2 FC (Feasibility Condition): Consider a passive node i that has as its successor to node j node F at time t' (i.e., $s_j^i(t') = F$) and $D_j^i(t') < \infty$. Consider also that, at time $t > t'$, node i detects a change in $D_{jF}^i(t) + l_F^i(t)$, its distance to node j through node F. Node i can choose as its new successor to node j any neighbor $q \in N_i$ for which

$D_{jq}^i(t) + l_q^i(t) = D_{min}(t) = \mathrm{Min}\{ D_{jx}^i(t) + l_x^i(t) \mid x \in N_i \} < \infty$ and $D_{jq}^i(t) \le D_{jF}^i(t') < \infty$.

Definition 3 A feasible successor at time t for node i toward node j is a neighbor node $q \in N_i$ that satisfies FC.

Definition 4 The shortest distance for node i toward node j is equal to $\mathrm{Min}\{ D_{jx}^i(t) + l_x^i(t) \mid x \in N_i \}$, where node x must be a feasible successor. This entry is set to infinity (i.e., an arbitrarily large value) when node i determines that node j is unreachable.

Definition 5 A node x is said to be the predecessor of another node toward node j if the latter is the successor for node x toward the same destination.

Definition 6 Node i is busy if it is active with respect to at least one destination, and is idle otherwise.

In the following description of DUAL, we always refer to a node's state (active or passive) with respect to destination j, and refer to node j, distance to node j, SDAG for node j, and successor toward node j simply by destination, distance, SDAG, and successor, respectively. Furthermore, we make reference to a single entry of the routing table.

3.3.2 States of DUAL

As shown in Figure 1, DUAL has two main states with respect to each destination, just as DSA does: active and passive. The main difference between DUAL and DSA is that the transition from the passive to the active state is not immediate after the detection of a change in the cost or status of a link. Rather, the transition is dictated by whether or not a node has a feasible successor. In addition, DUAL supports multiple diffused SDAG computations, ensuring that each computation is atomic and is processed in the proper sequence.

Similarly, DUAL differs from the Jaffe-Moss algorithm in that the transition from passive to active state in the latter is dictated by whether or not a node detects an increase in its distance to the destination, regardless of how its own distance compares with its neighbors' distances.

In Figure 1, we indicate the information received at a node, the information sent by the node, and the updating of the reply and query-origin tables, according to the state in which the node is and the information it receives. However, we do not specify how routing, link-status, or distance tables are updated.

3.3.3 Multiple Destinations

When distance vectors are used, each update entry of a query or update refers to a specific destination, and a node receiving a query with an update entry for destination j knows that the sending node is active with respect to node j. Accordingly, a diffusing SDAG computation for destination j is completely independent from any other destination's SDAG computation.

We also note that a node detecting a change in the status or cost of a link may have to send both an update and a query when its routing table is updated; the node sends the update with its shortest distances to destinations for which it has feasible successors, and sends a query regarding destinations for which the node has no feasible successors. This query contains the distances through the current "frozen" successor.

On the other hand, when link states are used, the information exchanged in updates and queries refers to link costs only. Therefore, when a node receives a query with an update entry for link (x, y), it cannot determine the destinations for which the sending node is active. Because of this, an idle node that becomes busy (see Definition 6) cannot participate in any other SDAG computation until it becomes idle again. That is, a passive node that becomes active for at least one destination and sends out a query must act as if it was active for all destinations when it receives subsequent queries and updates, until it becomes passive again for all destinations. When the node receives the replies to its query and becomes idle, it must recompute all its routing-table entries.
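The feasibility condition of Definitions 2 to 4 can be sketched as a small function. This is only an illustration under stated assumptions: the function name `feasible_successors`, the dictionary arguments, and the numeric values in the usage example are hypothetical and chosen for this sketch, not taken from the paper. The arguments correspond to the distances reported by the neighbors, the adjacent link costs, and the distance that the old successor F had reported at time t'.

```python
import math
from typing import Dict, List, Tuple


def feasible_successors(
    neighbor_dist: Dict[str, float],
    link_cost: Dict[str, float],
    old_successor_dist: float,
) -> Tuple[float, List[str]]:
    """Return (D_min, neighbors satisfying FC) for one destination.

    neighbor_dist[x]    -- distance from neighbor x to the destination,
                           as recorded in the topology table
    link_cost[x]        -- cost of the link to neighbor x
    old_successor_dist  -- distance the old successor F reported at t'
    All three names are assumptions made for this sketch.
    """
    totals = {
        x: neighbor_dist.get(x, math.inf) + cost
        for x, cost in link_cost.items()
    }
    d_min = min(totals.values(), default=math.inf)
    if math.isinf(d_min) or math.isinf(old_successor_dist):
        return d_min, []  # FC requires both quantities to be finite
    feasible = sorted(
        x
        for x, total in totals.items()
        if total == d_min
        and neighbor_dist.get(x, math.inf) <= old_successor_dist
    )
    return d_min, feasible


if __name__ == "__main__":
    # Illustrative values only: a node whose old successor reported
    # distance 1 and whose neighbor d also reports distance 1.
    print(feasible_successors(
        {"a": 3, "d": 1, "f": math.inf},
        {"a": 1, "d": 1, "f": 1},
        old_successor_dist=1,
    ))
```

When the returned list is empty, the node has no feasible successor and, according to the description that follows in the paper, must become active rather than pick the neighbor offering the smallest total distance.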
3.3.4 Event Handling in Passive Once active, node i stays that way until it receives the replies from all its neighbors. When node i is active and receives a reply from node k, it sets $, = 0; therefore, node i becomes passive whenr&=OforallkENi. Sand update if needed. \ / I Raceiva link than or update and fe&ible successor not found. setd &I - 1 for all neighbors. Become passive again (o,’ - 2). no feasible su~~tssor exists. / Send update if needed. send reply. 3.3.6 Receive last reply from k and be so”rc8 of query of quety pending. (0,” Link (i.k) fails, k is successor, ri islastl. ------- I Set 0; - 3, send query. set rr’ - 1 for all neighbors. \ Link (i.k) fails or receive reply from k. more replies pending. K is not 1 1 \\I I I Send re~lv to succ8ssor. SUCCBSSO,. (not from suaessor) set r; 5 0 Send reply. set r; =o.oii=2 to Passive State Finally, if 0: = 2, then node i attempts to find a neighbor that satisfies FC using Dr (i.e., the distance through its old successor, used when node i became active in the first place). If a feasible successor is found, then node i sets 0: = 0, recomputes its shortest path and successor, updates its routing-table, and Receive query frc Receive update. Active If 0: = 3, node i follows the same steps as for the previous case; however, before node i modifies its successor, it sends a reply to that neighbor. Link (i.k) fails, k is s”ccBssof. more replies pending. from If 0: = 1, then node i sets o$ = 0, recomputes its shortest path and successor and updates its routing-table. If distance vectors are used, node i sends out an update if any change was made to l?j. If link states are used, node i sends an update with the contents of its link-update list, unless it is empty. ---r I -.-..- .ly relay (Cl)’ - 3). 1, Transition There are three different courses of action that node i can take when it receives the last reply to its query and becomes passive. 1a set,‘-0. ti not Receive query and feyibfe swxessor *XM.% / \ P 1. 
Receive query and feasible successor found. In the case of distance vectors, because an update or query contains routing information that must be modified at every hop, each needs to traverse a single hop only.2 The situation is very diffeiTent when link states are used, however. In this case, an update or query contains topology information that must be the same in every node’s topology table; accordingly, the same update must be propagated to all network nodes. However, because no new topology information should be sent or processed by an active node, an active node stores all the changes in link costs detected or received in messages in a temporary list called the link-update list; this information is processed when the node becomes passive again. Furthermore, additional mechanisms must be used with link states to ensure that the same update does not traverse the same nodes or links an unnecessary number of times; we assume the existence of such mechanisms in this paper.3 Become passive qain (O/-2), fessible 8uccessoT exl818. Send update if needed. State If node i is active and receives a query from a neighbor k other than its successor, it updates its topology table and remains active. In addition, it sends a reply to node k. No other action is needed. On the other hand, if the query comes from its successor, then node i must be the origin of the previous query, for which node i is still waiting for replies; accordingly, node i sets oj = 2 after updating its topology table and remains active. When node i is passive and cannot find a feasible successor after processing an input event, then it becomes active (see Figure 1) by setting r$ = I for zdl k f Ni, and sends out a query to all its neighbors with the value of Dj = Djb + If, where s = sj. If node i becomes active because of an update, a link failure, or a change in the cost of a link, then it sets oj = 1. Alternatively, if it becomes active because of a query from a neighbor, node i sets 0: = 3. 
Send update if needed. in Active If an active node i receives an update or detects a change in the cost of a link, then node i simply updates its topology table (and its link-status table if needed) and remains active. State Become passive again. (0; D 3 or 1) Handling Updates are distributed differently across the network depending on whether distance vectors or link states are used, However, whether link states or distance vectors are used, once an active node sends a query to its neighbors, it cannot send any update or query until it receives sJl the replies to the query in progress. When node i is passive and receives an update, a query, or a change in the cost of a link, it updates its topoIogy table and, in the case of a link change, its link-status table. Using the new information in its topology and link-status table, node i attempts to find a feasible successor (as defined in Definition 3). If a feasible successor is found, node i updates its routing table, remains in the passive state and, if needed, sends out an update to ail its neighbors with the new topology or routing information (see Figure 1). If the input event was a query from a neighbor, node i sends a reply to that neighbor. Node i is free to process other input events (updates, queries, or changes in link costs or status) immediately after sending the update. Receive link change or update and feasible sucmssor exists. Event lir lk-cost change or ‘The reader is referred to a number of previous publications on the subject [JAFF-821 IJAFF-861 [GARC-88a] [GARC-891. 3The reader is referred to previous publications for a detailed description of these mechanisms [JAFF-861 [MCQU-SO]. Figure 1: Active and Passive States in DUAL 216 sends a reply to its old successor if the link with that neighbor still exists. Node i sends out an update if any change was made to Dj and distance vectors are used. On the other hand, if no feasible successor is found, node i sets o$ = 3 and T$. 
= 1 for all Ic E N; and sends another query, which is the query that came from its successor after node i had become active. When distance vectors are used, the new query specifies Dj = DfF + 1;; alternatively, if link states are used, node i’s query contains the contents of its link-update list. (2) (1) a d b f (0) (3) C e a (2) (1) Figure 2: Network Used in Examples X3.7 Link Failures and Additions Special measures must be taken for the processing of link failures and additions, to ensure that a node i does not wait forever for a reply from a neighbor and uses no upstream neighbors as successors. - i When node i is passive and detects that link (i, Ic) has failed, it updates its link-status and topology tables with 16 = 00; if distance vectors are used, it must also set Dj, = co. After that, node i carries out the same steps used for the reception of an update in the passive state. a When node i is active and detects the failure of link (i, Ic) and Ic # s$, it updates its link-status and topology table accordingly, and sets T& = 0, to account for the fact that node i should not expect a reply from node Ic. No other action is needed, unless node i becomes passive after setting r$ = 0. e b C d - - i 0 a 1 b 1 C 2 d 3 e ';b 1 f 2 f - - 64 On the other hand, when node i is active and detects the failure of link (i, k) and Ic = sj, node i updates its link-status and topology table accordingly, and assumes that its successor replied to its query (i.e., node i sets T& = 0). In addition, node i must make sure that, when it receives all the replies of the query for which it is active, it will not choose a neighbor upstream by mistake. To achieve this, node i remembers the distance from its successor to the destination when node i became active in the first place (Of), and sets 03 = 2. No other action is needed, unless node i becomes passive after setting T$, = 0. 
When node i is active and detects the establishment of link (i, k), it updates its link-status and topology tables as needed. Node i sets $r^i_{mk} = 0$ and $D^i_{mk} = \infty$ for all $m \in N$, to account for the fact that node k does not owe any reply to node i and has not reported any distances since the new link was established. After that, node i sends an update to node k that contains node i's entire routing or topology table, depending on whether distance vectors or link states are used.

Figure 3: DUAL Tables at Node a

4 Examples

In this section, we present examples of the way in which DUAL operates after resource failures or link-cost increases. Figure 2 depicts a small network with six nodes. Links are assumed to have unit cost. An arrowhead in the directed link from node x to node y in the figure indicates that node y is the successor toward node e (i.e., $s^x_e = y$). The label assigned to node x indicates the value of $D^x_e$.

Figure 3(a) illustrates the routing table and topology table maintained at node a when distance vectors are used. Figure 3(b) illustrates the same tables when link states are used.

In the two examples presented in this section, we consider the case in which the link (f, e) fails. We note that there are two queries originated by this failure, one by node f and one by node e. However, to compare the behavior of DUAL when distance vectors and link states are used, we focus our attention on the query originated by node f, and quantify the number of steps and messages that it takes for all the nodes to recompute the successor tree for node e.

Figure 4 shows the changes in the routing-table entry for node e at each node when distance vectors are used. An update is indicated by an arrow adjacent to the link where it is transmitted, followed by a "U". Replies and queries are similarly identified with an "R" and a "Q". We assume that a reply contains an update list with updated routing-table entries.

After node f detects the failure of link (f, e), it determines that it has no alternate feasible successor to node e (i.e., $D^f_{ec} > D^f_{ee}$ and $D^f_{ed} > D^f_{ee}$). Accordingly, it becomes active by setting $r^f_{ec} = r^f_{ed} = 1$ and $r^f_{ee} = 0$ (the last flag is set to 0 to account for the link failure). Node f also sets $o^f_e = 0$ to account for the fact that it does not have to send a reply to node e when it becomes passive again. When node f becomes active, it sends a query to its neighbors (nodes d and c). We note that the query originated by node f needs only to contain an entry for node e, because the failure of link (e, f) does not affect the distance from node f to the rest of the other nodes.

Node d simply sends a reply to node f as soon as it processes its neighbor's query, because its feasible successor (node e) is not affected. Node c also sends a reply to node f, because it is able to find a new feasible successor to node e ($D^c_{ed} \le D^c_{ef}$). Accordingly, nodes c and d remain passive. When node f receives the two needed replies to its query, it knows that it does not have to send a reply to its old successor (node e), because $o^f_e = 0$. It then changes its successor to node d, and sends an update to nodes c and d. The algorithm terminates when these two nodes update their topology tables using node f's update.

We want to point out that, because of the connectivity among nodes e, f, and d, the query originated by node e would likely refer only to node f, and would not affect the routing table of node d.

Figure 4: Example of DUAL Operation with Distance Vectors

Figure 5 shows the same example when link states are used. In Figure 5, the queries and updates specify an infinite cost for link (e, f).
We assume the use of an update forwarding mechanism similar to the one used in the new ARPANET routing algorithm [MCQU-80] [SCHW-86], whereby an update needs to flow on each link only once in each direction, and a node never forwards an update to the node from which it came.4 Notice that the only difference between the two cases is that the updates originated by node f and node e propagate to all network nodes. We point out that the query originated by node e would cause the same number of updates as the query originated by node f.

Figure 5: Example of DUAL Operation with Link States

4 Although such a mechanism provides substantial savings in communication overhead, it adds substantial complexity to the implementation of the algorithm [PERL-83] [JAFF-86].

5 Correctness of DUAL

Throughout this section, we refer to a node's state (active or passive) with respect to destination node j, and refer to node j, distance to node j, SDAG for node j, successor toward node j, and path to node j, simply by destination, distance, SDAG, successor, and path, respectively. Furthermore, we denote by $P^x_j(t)$ node x's preferred path to destination j at time t, and by $P^y_{jx}(t)$ the path from node x to destination j known by node y at time t.

5.1 A Sufficient Condition for Loop Freedom

Dijkstra and Scholten's diffusing computations technique can be applied to achieve loop freedom in distributed routing algorithms. However, it introduces additional communication overhead, because, when a node needs to modify its routing-table entry for a given destination, a query must be propagated to all nodes upstream of that neighbor in the destination's SDAG. Hence, the question arises as to whether it is possible for nodes to modify the SDAG of a given destination without any internodal coordination, and still maintain loop freedom at every instant.

Consider a network G* with an arbitrary topology, and assume that an arbitrary link-state or distance-vector algorithm is used, such that (1) there is no internodal coordination for the updating of routing tables and (2) each update is processed independently of others. The following theorem demonstrates that FC*, as defined below, is sufficient to ensure loop freedom at every instant in G*.

Definition 7 FC* (Feasibility Condition with no Internodal Coordination): Consider a node i that has as its successor to node j node F at time t' (i.e., $s^i_j(t') = F$) and $D^i_j(t') < \infty$. Consider also that, at time t > t', node i detects a change in $D^i_{jF}(t') + l^i_F(t')$. Node i can choose as its new successor to node j any neighbor $q \in N_i$ for which

$$D^i_{jq}(t) + l^i_q(t) = \mathrm{Min}\{ D^i_{jx}(t) + l^i_x(t) \mid x \in N_i \text{ and } D^i_{jx}(t) \le D^i_{jF}(t') \} < \infty .$$

Proposition 1 If $S_j(G^*)$ is loop-free, no loop can be created by increasing any of its nodes' distances while maintaining the same successors. □

The reason for this is that $S_j(G^*)$ does not change.

Proposition 2 If a loop $L_j(t)$ is formed in $S_j(G^*)$ for the first time at time t, then some node $i \in L_j(t)$ must choose an upstream node as its successor at time t. □

This is evident from the fact that $S_j(G^*)$ is directed acyclic before $L_j(t)$ is created.

Theorem 1 Using FC* when nodes choose their successors is sufficient to ensure that $S_j(G^*)$ is loop free at every instant. □

Proof: Note that, because we can assume any distance-vector or link-state algorithm, we can assume that each node in G* knows an entire path from any other node to the destination. However, this information may be out of date; this forms the basis of our proof, which is by contradiction [GARC-88a].

Assume that, before time t, $S_j(G^*)$ is loop free at every instant. Furthermore, assume that a loop $L_j(t)$ is formed in $S_j(G^*)$ at time t. From Propositions 1 and 2, we know that at least one node must change its successor at time t and choose an upstream neighbor for a loop to be formed. Therefore, we only need to show that FC* is sufficient to ensure loop freedom when at least one node in G* changes its successor at time t.

Let node i be the node that creates $L_j(t)$ when it makes node a its new successor at time t, when it detects an increase in $D^i_j = D^i_{js} + l^i_s$, where $s = s^i_j \neq a$. FC* requires that

$$D^a_j(t') = D^i_{ja}(t) \le D^i_{jb}(t'') = D^b_j(t'' - \epsilon_b).$$

In this expression, t' < t is the time of the last update from node $a = s^i_j(t)$ that is processed by node i prior to time t. Similarly, $t'' - \epsilon_b$ is the time of the last update from node $b = s^i_j(t'')$ that is processed by node i at time t'' < t.

If a loop is to be formed at time t because of node i's action, $P^a_j(t)$ must include $P^i_j(t)$. If $P^i_{ja}(t) = P^a_j(t)$, it follows that $D^i_{ja}(t) = D^a_j(t') = D^a_j(t) > D^i_{jb}(t'')$, because no loop exists before time t and all link costs are positive. This, however, contradicts FC*, and so it follows that FC* is sufficient for this case.

Consider now the case in which $P^a_j(t') \neq P^a_j(t)$. Let $s^x_j(t_1) = old$ and let us assume that node x changes its successor exactly k times (where $k \ge 1$) in the time interval $(t_1, t_2)$, where $new_1, new_2, \ldots, new_k$ are node x's consecutive successors within that time interval, and $new_k = s^x_j(t_2)$. Then, FC* requires that the following inequalities be satisfied:

$$D^x_{j\,new_k}(t_{new_k}) \le \cdots \le D^x_{j\,new_1}(t_{new_1}) \le D^x_{j\,old}(t_1),$$

where $t_{new_i} < t_{new_{i+1}}$ is the time when node x makes node $new_i$ its successor. Accordingly, because all link costs are positive, we can traverse the directed path $P^i_j(t) \subset P^a_j(t)$ at time t and find that the following inequalities must hold:

$$D^i_{ja}(t) = D^a_j(t') > \cdots > D^{s(k-1,new)}_{j\,s(k,old)}(t_{s(k,old)}) \ge D^{s(k-1,new)}_{j\,s(k,new)}(t) > D^{s(k,new)}_{j\,s(k+1,old)}(t_{s(k+1,old)}) \ge \cdots \ge D^i_{ja}(t),$$

where $D^{s(k-1,new)}_{j\,s(k,new)}(t)$ indicates that node s(k-1, new) is the kth hop in the path $P^a_j(t)$ at time t and has node s(k, new) as its successor in the path. Similarly, $D^{s(k-1,new)}_{j\,s(k,old)}(t_{s(k,old)})$ indicates that node s(k-1, new) had node s(k, old) as its successor at time $t_{s(k,old)} < t$, where $t_{s(k,old)}$ is the time at which node s(k-1, new) updated its distance table for the last time prior to time t because of an update from its neighbor s(k, old). Lastly, $s^{(0)} = i$, $s^{(k)} = x$, and $t_x < t$.

Because these inequalities lead to the erroneous conclusion that $D^i_{ja}(t) > D^i_{ja}(t)$, it follows that no loop can be formed in $S_j(G^*)$. Therefore, it follows that FC* is sufficient in this case too.

The operation of any routing algorithm is such that, when the network is first initialized, all each node in G* knows is how to reach itself; this is equivalent to saying that a node has a routing-table entry for each of the other nodes in the network with infinite distance and no successors to them. Hence, at time 0, $S_j(G^*)$ is a disconnected graph of one or more components, each with a single node, and must be loop-free. Therefore, we conclude that FC* is sufficient. □

There are three types of action that node i can take when it becomes passive, depending on the value of $o^i_j$. If $o^i_j = 2$, then node i cannot change its successor, unless it finds a neighbor that satisfies FC using the distance from its successor to the destination before node i became active. However, this is equivalent to the third case above, for which we showed that no loop can be created.

Although FC* ensures loop freedom at every instant, it does not guarantee shortest paths in the resulting SDAG. To obtain loop freedom at every instant and shortest paths in the SDAG of each destination, DUAL modifies FC*, as defined in FC. When a node i attempts to update its successor and FC is not satisfied, DUAL uses diffused computations to distribute the new path length perceived by node i through its current successor to all upstream nodes, forcing node i to learn all new distances from upstream neighbors to the same destination before it is allowed to change the SDAG.
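The successor choice permitted by FC* (Definition 7) can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name `choose_successor_fc_star` and its parameters are hypothetical, ties are broken alphabetically purely for determinism, and the sample distances are an assumed reading of the six-node, unit-cost example network with destination e.

```python
import math

INF = math.inf


def choose_successor_fc_star(reported, link_cost, old_successor_distance):
    """Pick a new successor for one destination under FC*.

    reported[q]            -- distance to the destination reported by neighbor q
    link_cost[q]           -- cost of the link to neighbor q (INF if the link is down)
    old_successor_distance -- distance the old successor had reported

    Returns the chosen neighbor, or None when no neighbor qualifies.
    """
    best, best_length = None, INF
    for q in sorted(reported):
        # Only neighbors that reported a distance no larger than the one
        # reported by the old successor may be considered.
        if reported[q] > old_successor_distance:
            continue
        length = reported[q] + link_cost[q]
        # The strict comparison against INF enforces the finiteness requirement.
        if length < best_length:
            best, best_length = q, length
    return best


# Node c after its old successor f (which had reported distance 1) reports infinity.
at_c = choose_successor_fc_star({"a": 3, "d": 1, "f": INF},
                                {"a": 1, "d": 1, "f": 1}, 1)

# Node f after the link to its old successor e (which had reported distance 0) fails.
at_f = choose_successor_fc_star({"c": 2, "d": 1, "e": 0},
                                {"c": 1, "d": 1, "e": INF}, 0)
```

Under these assumed values, node c can switch to node d on its own, while node f finds no qualifying neighbor, which is the situation in which DUAL falls back on a diffusing computation.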
5.2 Loop Freedom in DUAL

Because a routing-table loop is created with respect to a particular destination, and because nodes coordinate with respect to the same SDAG, we can focus on a single SDAG to show that DUAL is loop-free at every instant. We refer to node i and the set of nodes upstream of node i that become active because of a query originated at node i as the active SDAG of node i, and denote it by $S_{ji}(G)$.

Lemma 1 Consider a network G in which DUAL is executed. If only a single diffused SDAG computation takes place at any one time, then $S_j(G)$ is loop-free at every instant. □

Proof: By contradiction. Assume that $S_j(G)$ is loop free up to time t and that node i is the first node in G to create a loop in $S_j(G)$ at time t, after processing an input event. Because an active node cannot change its successor after processing an input event, unless it becomes passive, it follows that, for node i to create a loop at time t, it must either be passive or become passive at time t. Let node b be $s^i_j$ when node i decides to change its successor at time t.

Assume that node i is passive at time t, when it processes an input event. According to DUAL, node i can carry out one of the following actions:

1. Determine that node b satisfies FC.
2. Determine that none of its neighbors satisfies FC and become active.
3. Find a neighbor $a \neq b$ that satisfies FC.

In the first case, node i cannot create a loop at time t, because no link is changed in $S_j(G)$. In the second case, node i cannot create a loop at time t, because it must keep the same successor and cannot, therefore, modify $S_j(G)$. In the third case, either all the nodes in the new path from node i to node j ($P^i_j(t)$) are passive, or at least one node in that path is active. It follows from the proof of Theorem 1 that node i cannot create a loop at time t when it switches successors according to FC if all nodes in $P^i_j(t)$ are passive. On the other hand, any active node $x \in P^i_j(t)$ in the only diffused SDAG computation in G must become active without changing its successor, which had to be chosen according to FC. Accordingly, if we traverse $P^i_j(t) \subset P^a_j(t)$ at time t, we arrive at the erroneous conclusion that $D^i_{ja}(t) > D^i_{ja}(t)$, just as we did in the proof of Theorem 1.

If $o^i_j$ equals 1 or 3 and node i becomes passive at time t, then node i can create no loop if either node b remains node i's successor or node i has no successor. Therefore, if a loop is to be created by node i at time t, it must occur when node i chooses a new successor without using FC after receiving the last reply to its query or detecting the failure of the link with the neighbor (different from node b) whose reply was pending. However, it follows from the correctness proof of DSA [DIJK-80] [CHAN-82] that node i cannot receive all the replies it needs from its neighbors before each one of its upstream neighbors forwards node i's query to all its neighbors and receives the corresponding replies from them, including node i. Furthermore, an active node cannot send new routing or topology information in an update or a new query, until it receives the replies to the query for which it is active. Accordingly, when node i gets the replies from all its neighbors, each must either not be upstream of node i any more or must have reported to node i a distance to the destination that includes its distance to node i. Therefore, no loop can be created in this case.

It follows from the above that no loop can be formed when $S_j(G)$ is loop-free before time t and a single diffused SDAG computation occurs. However, as pointed out in the proof of Theorem 1, $S_j(G)$ must be loop-free at time t = 0, and so the lemma is true. □

Lemma 2 DUAL considers each diffused SDAG computation individually and in the proper sequence. □

Proof: Consider the case in which the network has a stable topology and node i is the only node that can start diffused SDAG computations after detecting changes in the cost or status of adjacent links. If node i generates a single diffused SDAG computation, it follows from the previous lemma that every node in $S_{ji}(G)$ must process node i's query within a finite time and in the proper sequence, because there is only one diffused computation.

Assume that node i can generate multiple queries. Whether distance vectors or link states are used, an active node cannot send a new query or update before it receives all the replies to the query for which it is active. Therefore, because all the nodes in $S_{ji}(G)$ process each input event in FIFO order and all links transmit in FIFO order, it follows that all the nodes in $S_{ji}(G)$ must process each diffused SDAG computation individually and in the proper sequence.

Consider now the case in which multiple sources of diffused SDAG computations exist in the network, but no topological changes can take place. If any two of the active SDAGs of G have an empty intersection, then we can conclude from the previous case that the lemma is true. Hence, in the absence of topological changes, incorrect treatment of queries can only originate at a node that belongs to an intersection of active SDAGs, i.e., one that is waiting for replies to two consecutive queries. However, this is impossible by design.

Finally, consider the possibility of topological changes, that is, link or node additions and deletions. Because the addition and deletion of nodes are handled by DUAL as multiple link additions or deletions, we only have to consider link changes in our proof. For a given destination, a node can become active in only one diffused SDAG computation at a time, and can, therefore, expect at most one reply from each neighbor. Accordingly, when an active node loses connectivity with a neighbor different from its successor while it is expecting a reply from it, the node can simply assume that its neighbor sent a reply indicating an infinite distance. In addition, when an active node i loses connectivity with its successor, it assumes that its successor sent its reply and sets $o^i_j = 2$. So, when node i becomes passive again, it attempts to find a neighbor that satisfies FC using the distance reported by its old successor before node i became active in the first place. Accordingly, the FIFO order in which node i processes queries does not change with link failures.

On the other hand, when node i is active and establishes connectivity with a node k, it treats the new neighbor as one that already sent its reply to the pending query. It thus follows that the FIFO order in which node i processes queries does not change with the establishment of new links or link failures. Hence, because the occurrence of topological changes does not change the order in which a node processes or forwards queries, it follows that the lemma is true. □

Theorem 2 Consider a network G in which DUAL is executed. $S_j(G)$ is then loop-free at every instant. □

Proof: The proof follows directly from Lemmas 1 and 2. □

5.3 Convergence of DUAL

We have shown that DUAL provides loop-free routes at every instant. However, we have not yet shown that it converges (i.e., that every node stops sending update messages and all routing tables contain correct values afterwards) a finite time following an arbitrary sequence of topological changes in the network.

We consider the convergence of DUAL in a network G that is in equilibrium at time 0 and is subject to a finite number of arbitrary topological changes in the time interval (0, t), with no more changes taking place afterward.

Theorem 3 Consider a network G in which DUAL is executed. A finite time after time t, no new update messages are being transmitted or processed by nodes, and all entries in all routing tables are correct. □

Proof: For the case in which distance vectors are used, the reader is referred to the convergence proof of the loop-free, distance-vector routing algorithm presented elsewhere by the author [GARC-88a].

For the case in which link states are used, we first observe that, according to DUAL's operation and our assumption that messages are transmitted reliably, a node $e \in G$ for which no node in $N_e$ is upstream of it in $S_j(G)$ must receive all the replies from its neighbors within a finite time after it sends a query. Let us now assume that there is a node i that never becomes passive after becoming active (when it freezes $s^i_j(t')$, t' > 0). This implies that at least one of its upstream neighbor nodes, $u_1$, never sends a reply for node i's query. In turn, this implies that at least one of node $u_1$'s upstream neighbors, $u_2$, never sends a reply for node $u_1$'s query. If we pursue the same line of argument, it follows that, because $S_j(G)$ is finite and loop-free at every instant of time (as shown before), there must be a node $u_n$ that never leaves the active state (i.e., never unfreezes $s^{u_n}_j(t'')$, t'' > t'), such that no node in G is upstream of it in $S_j(G)$, which we have shown above to be impossible. Therefore, because SDAGs are computed independently for each destination, a finite time after t, no node can be active.

Next, we observe that the updated value of a link cost is broadcast to every node in the network within a finite time after the link changes its cost, because no node can be active forever, because we assume a reliable flooding mechanism of messages when nodes are passive or become passive, and because a node that becomes passive processes all link-cost updates in its link-update table. Hence, a finite time after the occurrence of all topological changes, every node in the network has an up-to-date link-cost matrix.

Because all nodes must become passive and have up-to-date information in their link-cost matrices a finite time after t, and because each executes a finite-time, shortest-path-first algorithm, it follows that the theorem is true. □

6 Algorithm Complexity

In this section, we analyze the complexity of DUAL when distance vectors and link states are used, and compare these two cases with each other and with traditional distance-vector and link-state algorithms. For this comparison, we assume that all the algorithms under consideration behave synchronously, so that every node in the network executes a step of the algorithm simultaneously at fixed points in time. At each step, a node processes a single input event (a message from a neighbor node or a change in link status or cost). The first step occurs when at least one node detects a topological change and issues updates or queries to its neighbors. During the last step, at least one node receives and processes updates or replies from its neighbors, after which all routing tables are correct and nodes stop transmitting update messages until a new topological change takes place.

Our comparison is made in terms of the number of steps and number of messages required for each algorithm to converge when a single link failure occurs after all the network nodes have obtained optimal paths to every destination. We refer to the number of steps required by an algorithm as its time complexity (or TC), and to the number of messages it requires as its communication complexity (or CC). We also consider the storage space required by each algorithm, which we call storage complexity (or SC).

When distance vectors are used, DUAL has SC = O(|N| · D), where |N| is the number of network nodes and D is the maximum degree of a node, and has the same worst-case time complexity as the Jaffe-Moss algorithm, which is TC = O(x), where x is the number of nodes in an active SDAG. To verify this, we note that, in the worst case (e.g., the failure of a destination node), all nodes upstream of a destination node j in $S_j(G)$ must become active. DUAL requires O(x) messages to converge when distance vectors are used [JAFF-82]. Traditional distance-vector algorithms have TC = O(|N|) and CC = O(|N|^2) [SCHW-86], and an SC similar to that of DUAL.

On the other hand, when link states are used, DUAL has SC = O(|N|^2), because every node maintains a link-cost matrix. Typically, D << |N|, and storage requirements are much larger when link states are used. Because any change in the cost or status of a link must be propagated to all the nodes when link states are used, DUAL has TC = O(y) in this case, where y is the maximum of x, the number of nodes in the active SDAG, and d, the diameter of the network. For the same reason, DUAL has CC = O(|E|) (although additional queries and replies must be transmitted in addition to the flooding of updates, this requires O(x) messages and typically O(|E|) > O(x)). Traditional link-state algorithms have TC = O(d), and an SC and CC similar to those of DUAL.

In a typical packet-switching network in which there is more than one path from a node to each destination, DUAL can perform much better with distance vectors than with link states after three types of topological changes: link failures that do not partition the network, link-cost changes, and link additions. The reason for this is that updates travel only as far as necessary when distance vectors are used. No more than one update may be needed in this case, while at least |N| messages have to be exchanged when link states are used. Furthermore, nodal coordination in DUAL is much stronger when link states are used, because a node must carry out the computation of the SDAGs for all destinations in parallel after a topological change occurs.

7 Enhancements

The algorithm that we have presented in Section 3 is not meant to be very efficient in terms of the number of messages exchanged among nodes to update their routing tables. The reason for this is that we wanted to describe DUAL independently of the type of information stored in topology tables.

A number of enhancements can be made to DUAL to make it more efficient. For instance, when distance vectors are used, the number of messages exchanged after a link-cost increase or a resource failure can be reduced by allowing each message to contain update, query, or reply entries, each with an updated distance to a destination. Similarly, for the case of link states, we have assumed that a node must be passive with respect to all destinations (i.e., idle) to be allowed to change its successor to a given destination without coordinating with other nodes. Less restrictive link-state algorithms can be designed by requiring nodes to keep information as to which routing-table entries are affected by which queries.

Jaffe and Moss [JAFF-82] have suggested the addition of a wait state to improve the performance of their algorithm. With this added state, when an active node receives all the replies to its query, it waits for a reset from its successor. When the node that initiated the diffused SDAG computation receives all the replies to its query, it sends out a reset to its neighbors and becomes passive. In the absence of link or nodal failures, this enhancement reduces the time complexity of the Jaffe-Moss algorithm to O(h), where h is the length of the longest chain of the set of nodes participating in the diffused SDAG computation. The same improvement would be obtained in DUAL. For the case of multiple failures, a node remains in wait state until it receives the reset messages for each of the queries initiated downstream of it, and has received the replies for all the queries initiated by it. Unfortunately, this technique fails to improve performance in all cases in which multiple resource failures occur and transmission times vary. The reason for this is that a node (call it n) in the wait state that detects the failure of the link with its successor must assume that its successor has sent its reset; otherwise, the node would be waiting for that reset forever. Hence, even if node n sends another query to account for the link failure and does not send a reset until it receives the replies for the new query, all of this occurs asynchronously from node n's old successor. Accordingly, when node n becomes passive and propagates the new reset message upstream, this may cause a node upstream of n to choose as its new successor a neighbor that has not heard the query originated downstream from node n that made that node become active in the first place.

Another way of enhancing DUAL's performance is by means of probe messages used only after link failures occur. According to this approach, a passive node i that detects the failure of its link to a destination or receives an update from its successor reporting an infinite distance sends a probe message to the destination if and only if it has a neighbor n that satisfies FC. Node i uses node n as its new successor if it receives a positive answer message from the destination. Node i forwards the query if it receives no answer message within a hold-down time (proportional to an estimated transmission time per link times twice the distance to the destination through node n) or receives an update from node n reporting an infinite distance to the destination. The same procedure is followed by an active node k with an infinite distance to the destination that receives the last reply to its query and has at least one neighbor that has reported a finite distance to the destination. A probe message asks a destination for an answer stating its existence.

Using probes reduces the time complexity of DUAL to O(h) after a single link or nodal failure. The reason for this improvement is that a node does not change its successor after it reaches an infinite distance, unless it receives a positive answer from a neighbor that satisfies FC. Unfortunately, this technique fails to enhance the performance of DUAL when a node i receives a positive answer from one of its neighbors n and then a link in the path from n to the destination fails. This technique is based on the algorithm proposed by Naylor [NAYL-75] for loop-freedom, in which a loop-check message is sent out by a node every time it has more than two neighbors and must change its successor. However, in the same way that probes fail to enhance DUAL's performance in some cases in which link failures occur, Naylor's algorithm fails to eliminate looping or counting-to-infinity.

More importantly, we note that the performance of any routing algorithm based on diffusing computations can deteriorate dramatically in the case of node failures or network partitions, in which all the remaining nodes in the network must be involved in a diffusing computation.

8 Conclusions

This paper shows that developing solutions for looping problems of traditional distributed routing algorithms can be done independently of whether link states or distance vectors are used. In the preceding sections, we presented a new approach for loop-free routing in computer networks that can be applied to either distance vectors or link states. This approach is based on the concept of diffusing computations, first introduced by Dijkstra and Scholten.
We presented a new loop-free routing algorithm (DUAL) and verified that it provides loop freedom at every instant and correct routing tables within a finite time after a sequence of topological changes. We compared the complexity of the algorithm when link states and distance vectors are used, and showed that using distance vectors for shortest-path computation in DUAL is better than using link states, insofar as the algorithm's complexity is concerned. DUAL compares favorably with respect to traditional distance-vector and link-state algorithms and all previous loop-free routing algorithms reported in the literature [JAFF-82] [MERL-79] [GARC-88a]. In addition, it requires less storage overhead than the Jaffe-Moss algorithm, because it does not require the storage of bit vectors to handle multiple diffusing computations.

Our new approach is important for very large networks and internetworks, in which the consequences of the looping problem would be unacceptable. In such environments, our new approach should be used in combination with a hierarchical network organization to minimize routing overhead [JAFF-88] [GARC-88b].

Finally, although we have adopted diffusing computations as the mechanism to ensure the termination of distributed computations, the reader is reminded that there are other techniques proposed in the literature (e.g., time stamping) to handle the problem.

References

[CHAN-82] K.M. Chandy and J. Misra, "Distributed Computation on Graphs: Shortest Path Algorithms," in Communications of the ACM, Vol. 25, No. 11, November 1982, pp. 833-837.

[CEGR-75] T. Cegrell, "A Routing Procedure for the TIDAS Message-Switching Network," in IEEE Transactions on Communications, Vol. COM-23, No. 6, June 1975, pp. 575-585.

[DIJK-59] E.W. Dijkstra, "A Note on Two Problems in Connection with Graphs," in Numerische Mathematik, Vol. 1, 1959, pp. 269-271.

[DIJK-80] E.W. Dijkstra and C.S. Scholten, "Termination Detection for Diffusing Computations," in Information Processing Letters, Vol. 11, No. 1, August 1980, pp. 1-4.

[FORD-62] L.R. Ford and D.R. Fulkerson, Flows in Networks, Princeton, New Jersey: Princeton University Press, 1962.

[GARC-87] J.J. Garcia-Luna-Aceves, "A New Minimum-Hop Routing Algorithm," in Proceedings of IEEE INFOCOM '87, San Francisco, California, April 1987.

[GARC-88a] J.J. Garcia-Luna-Aceves, "A Distributed Loop-Free, Shortest-Path Routing Algorithm," in Proceedings IEEE INFOCOM '88, March 1988.

[GARC-88b] J.J. Garcia-Luna-Aceves, "Routing Management in Very Large Scale Networks," in Future Generation Computer Systems, Vol. 4, No. 2, September 1988, pp. 81-94.

[GARC-89] J.J. Garcia-Luna-Aceves, "A Minimum-Hop Routing Algorithm Based on Distributed Information," in Computer Networks and ISDN Systems, North-Holland, Vol. 16, No. 5, May 1989, pp. 367-382.

[JAFF-82] J.M. Jaffe and F.M. Moss, "A Responsive Routing Algorithm for Computer Networks," in IEEE Transactions on Communications, Vol. COM-30, No. 7, July 1982, pp. 1758-1762.

[JAFF-86] J.M. Jaffe, A.E. Baratz, and A. Segall, "Subtle Design Issues in the Implementation of Distributed, Dynamic Routing Algorithms," in Computer Networks and ISDN Systems, Vol. 12, No. 3, 1986, pp. 147-158.

[JAFF-88] J.M. Jaffe, "Hierarchical Clustering with Topology Databases," in Computer Networks and ISDN Systems, Vol. 15, 1988, pp. 329-339.

[MCQU-80] J. McQuillan, et al., "The New Routing Algorithm for the ARPANET," in IEEE Transactions on Communications, Vol. COM-28, May 1980.

[MERL-79] P.M. Merlin and A. Segall, "A Failsafe Distributed Routing Protocol," in IEEE Transactions on Communications, Vol. COM-27, No. 9, September 1979, pp. 1280-1288.

[NAYL-75] W.E. Naylor, "A Loop-Free Adaptive Routing Algorithm for Packet Switched Networks," Proceedings of the Fourth Data Communications Symposium, Quebec City, Canada, October 1975, pp. 7.9-7.15.

[PERL-83] R. Perlman, "Fault-Tolerant Broadcast of Routing Information," in Proceedings IEEE INFOCOM '83, San Diego, California, 1983.

[SCHW-86] M. Schwartz, Telecommunication Networks: Protocols, Modeling and Analysis, Chapter 6, Menlo Park, California: Addison-Wesley Publishing Co., 1986.

[SEEG-86] J. Seeger and A. Khanna, "Reducing Routing Overhead in a Growing DDN," in Proceedings MILCOM '86, Monterey, California, October 1986, pp. 15.3.1-15.3.13.

[SHIN-87] K.G. Shin and M. Chen, "Performance Analysis of Distributed Routing Strategies Free of Ping-Pong-Type Looping," in IEEE Transactions on Computers, Vol. COMP-36, No. 2, February 1987, pp. 129-137.

[STER-80] T.E. Stern, "An Improved Routing Algorithm for Distributed Computer Networks," IEEE International Symposium on Circuits and Systems, Workshop on Large-Scale Networks and Systems, Houston, Texas, April 1980.