as a PDF

advertisement
A UNIFIED
APPROACH
TO
LOOP-FREE
ROUTING
USING
DISTANCE
VECTORS
OR LINK
STATES
J.J. Garcia-Luna-Aceves
Network Information
Systems Center
SRI International
333 Ravenswood Avenue
Menlo Park, California 94025
garcia@sri.com
entire network topology, or at least receive that information, to
compute the shortest path to each network destination.
Each
node broadcasts update messages, each containing the state of
each of the node’s adjacent links, to every other node in the
network.
Abstract
We present a unified approach for the dynamic computation of shortest paths in a computer network using either
distance vectors or link states. We describe a distributed
algorithm that provides loop-free paths at every instant
and extends or improves algorithms introduced previously
by Chandy and Misra, Jaffe and Moss, Merlin and Segall,
and the author. Our approach treats the problem of dis-
Over the past few years, there has been a great deal of debate as to which type of routing algorithm is better suited for
large networks and networks with dynamic topologies and varying traffic. As Schwartz points out [SCHW-861, deciding between
one or the other is highly dependent on the specific network in
which the algorithm will operate. The primary concern with
the distance-vector algorithms that have been used in the past
of routing-table loops and counting to infinity
is the formation
[JAFF-821 [GARC-891. A routing-table loop is a path specified
in the nodes’ routing tables at a particular point in time that visits the same node more than once before reaching the intended
destination.
A node counts to infinity when it increments its
distance to a destination until it reaches a predefined maximum
distance value. There have been a number of attempts to solve
the counting-teinfinity
and routing-table
looping problems of
distance-vector algorithms by increasing the amount of informs
tion exchanged among nodes [CEGR-751 [GARC-871 [NAYL-751
[SHIN-871 [STER-801. However, as discussed by the author elsewhere [GARC-891, none of these approaches solves these problems satisfactorily. On the other hand, link-state algorithms are
free of counting-to-infinity.
However, they need to maintain an
tributed shortest-path routing as one of diffusing computations, which was first proposed by Dijkstra and Scholten.
We verify the loop-freedom of the new algorithm, and also
demonstrate
that it converges to the correct
routing
en-
tries a finite time after an arbitrary sequence of topological changes. We analyze the complexity of the new algorithm when distance vectors and link states are used,
and show that using distance vectors is better
routing overhead is concerned.
1
Introduct
insofar
as
ion
All shortest-path routing algorithms used today in computer networks can be classified as distance-vector or link-state algorithms.
In a distance-vector algorithm, a node knows the length of the
shortest path from each neighbor node to every network destination, and uses this information to compute the shortest path
and next node in the path to each destination. A node sends update messages to its neighbors, who in turn process the messages
and send messages of their own, if needed. Each update message
contains a vector of one or more entries, each of which specifies
the distance to a given destination,
plus some other information
(e.g., the successor in the path [CEGR-751 or the entire path
[SHIN-871). In contrast, in a link-state algorithm (also called
topology-broadcast
algorithm [JAFF-861) a node must know the
up-to-date
version
of the entire
network
topology
at every
node,
which may constitute excessive storage and communication overhead in a highly dynamic network.’ It is also interesting to note
that
the link-state
algorithms
proposed
or implemented
to date
do not eliminate the creation of temporary routing-table loops,
which can be created whenever two or more nodes recompute
shortest paths using inconsistent views of the network topology.
Whether
‘This work was supported by SRI International IRED funds, by the U.S.
Army hearch OfJk
under Contract DAALO3-88-K-0054, and by Rome Air
Dev&pment Center and the Defense Advanced Research Projects Agency
under Contract F30602-69-C-0015.
link states or distance vectors are used, it is clear
that the existence
of routing-table
loops, even when these are
temporary,
is a detriment
to the overall network
performance.
In this paper, we present a unified approach to the solution
of the counting-to-infinity
and routing-table
looping
problems
in distributed
routing
algorithms,
and describe mechanisms
for
routing-table
loop freedom that are independent
of whether distance vectors or link states are used, the metric used to measure
internodal
distances, or the cost function used to compute shortest paths. We want to emphasize that the loop freedom we are
Permission to copy without fee all or part of this material is granted provided
that the copies are not made or distributed for direct commercial advantage,
the ACM copyright notice and the title of the publication and its date appear,
and notice is given that copying is by permission of the Association for
Computing Machinery. To copy otherwise, or to republish, requires a fee
and/or specific permission.
0 1989 ACM 089791-332-9/89/0009/0212 $1.50
‘The interested reader may want to consult Seeger and Khanna [SEEG-861
for a detailed description of the routing overhead problem associated with a
particular link-state algorithm in the defense data network (DDN), which is
a large network.
212
interested in achieving is with respect to the entries specified in
the nodal routing tables. Throughout this paper, we refer to
routing-table
loops simply as loops, and refer to a routing algorithm that is free of routing-table loops as a loop-free routing
algorithm.
We treat the distributed shortest-path routing problem as one
of diffusing computations, a concept that was first proposed by
Dijkstra and Scholten [DIJK-801. Our results generalize or improve on previous results on loop-free routing algorithms by the
author [GARC-88a] and others [CHAN-821 [MERL-791 [JAFF-821.
These results can help network designers decide between using
link states or distance vectors without having to worry about the
existence of long-lasting or temporary loops. Furthermore, our
results on loop freedom can serve as the basis for the development
of new, effective internet routing and reachability algorithms.
In Section 2, we describe distributed loop-free routing as a
problem of diffusing computations and summarize existing results. In Section 3, we present the new algorithm for the coordinated propagation of information among nodes, and describe
how distance vectors and link states can be used. In Section 4,
we present examples of the operation of the new algorithm. In
Section 5, we demonstrate that our algorithm provides loop-free
paths at every instant and that it converges to correct routingtable values in a finite time after an arbitrary sequence of topological changes. In Section 6, we compare the use of distance
vectors versus link states, and demonstrate that distance vectors
are better, insofar as overall routing overhead is concerned. In
Section 7, we summarize performance improvements on the basic
algorithm of Section 3. We conclude in Section 8 with a discussion of further design and implementation
issues raised by our
new approach.
2
Loop-Free
Routing
as a Problem
Diffusing Computations-Previous
of
Work
The fact that temporary loops do occur in existing link-state algorithms shows that looping in distributed routing algorithms is
independent of the amount of topology information maintained in
the nodal tables and results from allowing nodes to carry out updates in their local tables without sufficient coordination among
them.
A number of algorithms have been proposed in the past to accomplish loop-free shortest-path routing through internodal coordination.
The algorithm by Merlin and Segall [MERL-791 is
based on the propagation of information with feedback along a
tree rooted at a network destination and spanning all network
nodes. A major limitation of this algorithm is that the destination node (or root of the tree), or the node that first detects a
change in distance to that destination, must start each generation
of the propagation of updates, where each generation reaches an
additional hop of the tree. This creates substantial communica
tion overhead [SCHW-861.
nations. The diffusing computation grows by sending queries
and shrinks by receiving replies among the nodes participating
in the diffusing computation. The use of diffusing computations
assumes the existence of a link-level protocol that assures that
a node is always aware of who its neighbor nodes are, all messages exchanged between two nodes are received correctly and
in the proper sequence within a finite time, and each node processes input events (messages, notifications regarding the failure
or acquisition of neighbors, or notifications regarding the cost of
adjacent links) one at a time, on a first-in/first-out
(FIFO) order,
and within a finite time.
To apply this concept to routing, we first observe that, regardless of whether distance vectors or link states are used, each
nodal routing table must contain, as a minimum, the length and
successor (i.e., next node) of the shortest path to each destination.
We model a network as an undirected connected graph in
which each link has two lengths (or costs) associated with itone for each direction-and
any link of the network exists in both
directions at any one time. We assume that each node (active
or inactive) has a unique node identifier. We denote the set of
nodes adjacent to node i as Ni and call each node in that set
a neighbor of node i. Link costs are considered to vary in time
an are assumed to be always positive. The distance between two
nodes is measured as the sum of the link costs in the path of least
cost (called shortest path) between them.
For the moment, we can assume that each node mdntains a
topology table with sufficient information to compute the values
of the node’s routing table locally. Accordingly, for each destination j, all the nodal routing tables in a connected network G
define a graph whose nodes are the same nodes of G, and which
has a directed edge from node i to node k if and only if node k is
the successor of node i toward destination j, denoted 4. Obviously, loop freedom is guaranteed at all times in G if this graph
is always a directed acyclic graph, which we call thesuccessor directed acyclic graph (SDAG) of G for destination j and denote
it by S,(G). In steady state, when all routing tables are correct,
Sj(G) must be a tree.
The directed path from node i to node k in S,(G) at time t
is denoted by Pik(t). We say that node i is upstream of node k
in Sj(G) if the directed chain Pij from node i to node j includes
node k. Similarly, we say that node k, in this case, is downstream
of node i.
A loop can occur in G only when Sj(G) is modified for some
j E G after a change in the cost or status of a link. Furthermore, multiple such changes can occur. For each link-cost or
link-status change that affects Sj(G), we define an SDAG computation, which consists of obtaining the new SDAG formed by
the routing-table entries for node j at each node in the network.
Using Dijkstra and Scholten’s algorithm (DSA), a single SDAG
computation can be diffused to (i.e., propagated among and executed by) nodes along the SDAG using queries that contain the
news about the topological change; the diffused SDAG computation then shrinks along the same tree with the reception of replies
to those queries, which specify that the nodes sending the replies
have modified their routing tables according to the change(s) reported in queries. Initially, all nodes are passive, that is, not
engaged in a diffused SDAG computation, When a node detects
a change in the status or cost of an adjacent link, a diffused
SDAG computation commences with the node going into active
An alternative approach to the propagation of information
with feedback used by Merlin and Segall is to view distributed
loop-free routing as a problem of difising
computations. Dijkstra and Scholten [DIJK-801 introduced the concept of diffusing
computations to check the termination of a distributed computation, such that the process starting a computation is informed
when it is completed and such that there are no false termi-
213
state by sending a query to all its neighbors upstream in the
SDAG. After receiving and processing a query from its successor
toward the destination, a passive node forwards the query to its
neighbors upstream in the SDAG and goes into active state. A
node in active state cannot change the successor to the destination listed in its routing table (i.e., it cannot change the SDAG)
until it receives all the replies to its query; when this happens,
the node goes back to passive state and computes a new successor and distance. Accordingly, all the neighbors of an active
node must receive the new distance from that node to the destination and reply with their own new distances before the node
can recompute its distance and successor to the destination. This
means that a node can never choose a neighbor upstream in the
SDAG when it must change its successor to the destination. Because this adaptation of DSA supports only one diffused SDAG
computation at any time, it operates correctly only for the case
in which a single change occurs in a network that has a stable
topology.
tion for node j and sends a query to ail its neighbors. The node
cannot change its successor to node j until it has been told by all
its neighbors that they have processed its query and have either
obtained alternative feasible successors to node j, or determined
that they cannot reach node j. Jaffe and Moss’s bit vectors are
used to support multiple diffused SDAG computations concurrently.
The key advantage of using diffusing computations over the
approach used by Merlin and Segall is that the destination node
does not have complete control on the propagation of information, which accounts for a substantial reduction of communication and processing overhead.
The same basic algorithm can be implemented in a number of ways, depending on the type of network and operational
conditions in which the algorithm has to be used (e.g., packetradio networks, land-based networks with telephone lines). In
this paper, we describe DUAL at a level of detail that is independent of implementation
considerations, but is sufficient to
prove that DUAL is loop free and correct. We assume that either of two well-known algorithms is used at each node to compute shortest paths once topology information is gathered and
updated using DUAL-Bellman
and Ford’s distance-vector algorithm [FORD-621 or Dijkstra’s link-state algorithm [DIJK-591.
3
Distributed
Update
Algorithm
(DUAL)
From the previous section, we observe that all previous algorithms for loop-free routing use distance vectors. In this paper,
we present the distributed update algorithm (DUAL), which generalizes the algorithms based on diffusing computations outlined
in the previous section by separating the mechanism used for
diffusing the computation of an SDAG from the contents of the
nodal routing tables and topology tables.
Chandy and Misra proposed a distance-vector algorithm based
on an adaptation of DSA very similar to the one summarized
above [CHAN-821.
We point out that this adaptation works
correctly for a single computation only. Independently
of the
work by Chandy and Misra or Dijkstra and Scholten, Jaffe and
Moss [JAFF-821 showed that no routing loops can occur in a distributed implementation
of the Bellman-Ford routing algorithm
[FORD-621 after a link addition or a link-cost decrease. Using
this result, they proposed a loop-free distance-vector algorithm
that supports multiple, concurrent, diffused SDAG computations
for the same network destination, and which requires no internodal coordination unless link costs increase or resources fail in the
network. In the absence of link-cost increases or resource failures,
the Jaffe-Moss algorithm behaves like the Bellman-Ford routing
algorithm, that is, no diffused SDAG computation is used. Kowever, when a node detects an increase in the cost of alink, it starts
a diffused SDAG computation for each destination for which the
node’s distance was affected by the change. The new SDAG is
computed using an algorithm similar to the adaptation of DSA
described above. Multiple diffused SDAG computations are supported concurrently by means of bit oectors, maintained at each
node, which specify, for each neighbor and for each destination,
how many queries need to be answered by the neighbor node and
which one of such queries was originated by the node maintaining
the bit vector.
3.1
Messages
There are three types of messages in DUAL-updates,
queries,
and replies. Each message contains a type tag that specifies any
one of the three types of message and a message identifier that
uniquely identifies the message. In addition, updates and queries
contain an update list of one or more entries, each of whose contents is dependent on whether link states or distance vectors are
used. A reply is sent in response to all the update-list entries
in a query. In this work, we assume that updates, queries, and
replies are exchanged on an event-driven basis.
When distance vectors are used, an update-list entry contains,
as a minimum, the identifier of a destination and the current
shortest distance to it. In contrast, when link states are used, an
update-list entry consists of the identifier of a link and its cost.
3.2
Finally, the author presented a loop-free distance-vector algorithm similar to the Jaffe-Moss algorithm [GARC88a].
According to this algorithm, when a node needs to update its routing
table for a given destination j, after it processes an update message from a neighbor or detects a change in the cost or availability
of a link, the node tries to obtain a new feasible successor with
the shortest distance to node j. From the standpoint of node i,
a feasible successor toward node j is a neighbor node that has
reported a distance to the destination that is at most as large as
the distance reported by node i’s current successor before node i
decided to update its routing table. When feasible successors are
found, the algorithm behaves as the Bellman-Ford algorithm, or
else, if a node cannot find a feasible successor with the shortest
distance to node j, the node starts a diffused SDAG computa-
Information
at Each
Node
Each node in the network maintains a routing table, a link-state
table, a query-origin table, a reply-status table, and a topology
table.
The routing table of node i is a column vector of IN] row
entries, where N is the set of active nodes in the network. At a
given time t, the entry in node i’s routing table for destination
node j consists, as a minimum, of the identifier of node j, the
length of the current distance from node i to node j (denoted
by of(t)), the successor in the path currently chosen by node i
toward node j (denoted by s:(t)), and the length of the distance
from node sj to node j (denoted by D?(t)) before node i becomes
214
Definition
6 A node z is said to be the predecessor of another
node toward node j if the latter is the successor for node x toward
the same destination.
active. The successor is null when node i determines that node
j is unreachable.
The link-state table of node i is a vector of INil entries. Each
entry corresponds to the cost of a link adjacent to the node maintaining the table. The cost of the link from node i to neighbor
node L, as maintained by node i at time 2, is denoted by l:(t).
The cost of an inactive link is set to infinity (i.e., an arbitrarily
large value).
Definition
6 Node i is busy if it is active with respect to at least
one destination, and is idle otherwise.
In the following description of DUAL, we always refer to a
node’s state (active or passive) with respect to destination j,
and refer to node j, distance to node j, SDAG for node j, and
successor toward node j simply by destination, distance, SDAG,
and successor, respectively. Furthermore, we make reference to
a single entry of the routing table.
The query-origin table of node i is a vector of INI entries. At
time t, each such entry specifies the identifier of a node j and a
query-origin flag, denoted by o:(t).
The reply-status table of node i is an INI . INil matrix. Entry
(j, k), k E N;, consists of a reply-status flag, denoted by r&(l),
which specifies whether or not node i is waiting at time t for a
neighbor’s reply to a query.
3.3.2
3.3.1
Similarly, DUAL differs from the Gaffe-Moss algorithm in that
the transition from active to passive state in the latter is dictated
by whether or not a node detects an increase in its distance to
the destination, regardless of how its own distance compares with
its neighbors’ distances.
Operation
In Figure 1, we indicate the information received at a node,
the information sent by the node, and the updating of the reply
and query-origin tables, according to the state in which the node
is and the information it receives. However, we do not specify
how routing, link-status, or distance tables are updated.
Definitions
The operation
in DUAL
As shown in Figure 1, DUAL has two main states with respect
to each destination, just as DSA does: active and passive. The
main difference between DUAL and DSA is that the transition
from the active to the passive state is not immediate after the
detection of a change in the cost or status of a link. Rather, the
transition is dictated by whether or not a node has a feasible
successor. In addition, DUAL supports multiple diffused SDAG
computation, ensuring that each computation is atomic and is
processed in the proper sequence.
When distance vectors are used, each entry of node i’s topology table at time t consists of the identifier of a destination j and
the length of the current distance from neighbor node L to node
j, for each of node i’s neighbors (i.e., for all L 6 Ni), denoted by
Djk(t). In contrast, when link states are used, the topology table
consists of a link-cost matrix that contains an entry for each pair
of network nodes with the cost or weight of a link; the cost of a
link that does not exist is assumed to be infinity (i.e., arbitrarily
large). We note that node i can obtain D&(t) for any k E Ni
directly from the link-cost matrix.
3.3
States
of DUAL is based on the following definitions.
Definition
1 Node i is active with respect to destination node
j when it is waiting for replies from its neighbors referring to
node j, and is passive with respect to node j otherwise. If node
i is active with respect to node j, it cannot modify si in its
routing table; furthermore, node i’s distance must be set to Dj =
3.3.3
Multiple
Destinations
When distance vectors are used, each update entry of a query
or update refers to a specific destination, and a node receiving
a query with an update entry for destination j knows that the
sending uode is active with respect to node j. Accordingly, a
diffusing SDAG computation for destination j is completely independent from any other destination’s SDAG computation.
Definition
2 FC (Feasibility Condition):
Consider a passive
node i that has as its successor to node j node F at time t’ (i.e.,
sj(t’) = F) and Dj(t’) < 00. Consider also that, at time t > t’
node i detects a change in D&(t)) + $(t’).
Node i can choose
as its new successor to node j any neighbor q E Ni for which
+ l;(t) ( z E Ni} < 00
q*(t) + lip@) = Dmcn(t) = Min{Dj,(t)
and DjJt) 5 DjF(t’) < 00.
We also note that a node detecting a change in the status
or cost of a link may have to send both an update and a query
when its routing table is updated; the node sends the update
with its shortest distances to destinations for which it has feasible
successors, and sends a query regarding destinations for which the
node has no feasible successors. This query contains the distances
through the current “frozen” successor.
Definition
3 A feasible successor at time t for node i toward
node j is a neighbor node q E N; that satisfies FC.
On the other hand, when link states are used, the information
exchanged in updates and queries refers to link costs only. Therefore, when a node receives a query with an update entry for link
(2, y), it cannot determine the destinations for which the sending
node is active. Because of this, an idle node that becomes busy
(see Definition 6) cannot participate in any other SDAG computation until it becomes idle again. That is, a passive node that
Definition
4 The shortest distance for node i toward node j is
equal to Min{D&(t)
+ Z:(t) 1z E Ni}, where node z must be a
feasible successor. This entry is set to infinity (i.e., an arbitrarily
large value) when node i determines that node j is unreachable.
215
3.3.5
becomes active for at least one destination and sends out a query
must act as if it was active for all destinations when it receives
subsequent queries and updates, until it becomes passive again
for all destinations.
When the node receives the replies to its
query and becomes idle, it must recompute all its routing-table
entries.
3.3.4
Event
Handling
in Passive
Once active, node i stays that way until it receives the replies
from all its neighbors. When node i is active and receives a reply
from node k, it sets $, = 0; therefore, node i becomes passive
whenr&=OforallkENi.
Sand update if needed.
\
/
I
Raceiva link than
or update and fe&ible
successor not found.
setd
&I
- 1 for all neighbors.
Become passive again (o,’ - 2).
no feasible su~~tssor exists.
/
Send update if needed.
send reply.
3.3.6
Receive last reply from k
and be so”rc8 of query of
quety pending.
(0,”
Link (i.k) fails, k is successor,
ri islastl.
-------
I
Set 0; - 3, send query.
set rr’ - 1 for all neighbors.
\
Link (i.k) fails or receive
reply from k. more
replies pending. K is not
1 1
\\I
I
I
Send re~lv to succ8ssor.
SUCCBSSO,.
(not from suaessor)
set r; 5 0
Send reply.
set r; =o.oii=2
to Passive
State
Finally, if 0: = 2, then node i attempts to find a neighbor
that satisfies FC using Dr (i.e., the distance through its old
successor, used when node i became active in the first place). If
a feasible successor is found, then node i sets 0: = 0, recomputes
its shortest path and successor, updates its routing-table,
and
Receive query frc
Receive
update.
Active
If 0: = 3, node i follows the same steps as for the previous
case; however, before node i modifies its successor, it sends a
reply to that neighbor.
Link (i.k) fails, k is
s”ccBssof. more
replies pending.
from
If 0: = 1, then node i sets o$ = 0, recomputes its shortest
path and successor and updates its routing-table.
If distance
vectors are used, node i sends out an update if any change was
made to l?j. If link states are used, node i sends an update with
the contents of its link-update list, unless it is empty.
---r I -.-..- .ly relay (Cl)’ - 3).
1,
Transition
There are three different courses of action that node i can take
when it receives the last reply to its query and becomes passive.
1a
set,‘-0.
ti
not
Receive query and
feyibfe swxessor
*XM.%
/
\
P 1.
Receive query and
feasible successor
found.
In the case of distance vectors, because an update or query
contains routing information that must be modified at every hop,
each needs to traverse a single hop only.2 The situation is very
diffeiTent when link states are used, however. In this case, an
update or query contains topology information that must be the
same in every node’s topology table; accordingly, the same update must be propagated to all network nodes. However, because
no new topology information should be sent or processed by an
active node, an active node stores all the changes in link costs
detected or received in messages in a temporary list called the
link-update list; this information is processed when the node becomes passive again. Furthermore, additional mechanisms must
be used with link states to ensure that the same update does not
traverse the same nodes or links an unnecessary number of times;
we assume the existence of such mechanisms in this paper.3
Become passive qain
(O/-2),
fessible 8uccessoT exl818.
Send update if needed.
State
If node i is active and receives a query from a neighbor k
other than its successor, it updates its topology table and remains
active. In addition, it sends a reply to node k. No other action is
needed. On the other hand, if the query comes from its successor,
then node i must be the origin of the previous query, for which
node i is still waiting for replies; accordingly, node i sets oj = 2
after updating its topology table and remains active.
When node i is passive and cannot find a feasible successor
after processing an input event, then it becomes active (see Figure 1) by setting r$ = I for zdl k f Ni, and sends out a query to
all its neighbors with the value of Dj = Djb + If, where s = sj.
If node i becomes active because of an update, a link failure, or
a change in the cost of a link, then it sets oj = 1. Alternatively,
if it becomes active because of a query from a neighbor, node i
sets 0: = 3.
Send update if needed.
in Active
If an active node i receives an update or detects a change in
the cost of a link, then node i simply updates its topology table
(and its link-status table if needed) and remains active.
State
Become passive again.
(0; D 3 or 1)
Handling
Updates are distributed differently across the network depending
on whether distance vectors or link states are used, However,
whether link states or distance vectors are used, once an active
node sends a query to its neighbors, it cannot send any update
or query until it receives sJl the replies to the query in progress.
When node i is passive and receives an update, a query, or a
change in the cost of a link, it updates its topoIogy table and,
in the case of a link change, its link-status table. Using the
new information in its topology and link-status table, node i
attempts to find a feasible successor (as defined in Definition 3).
If a feasible successor is found, node i updates its routing table,
remains in the passive state and, if needed, sends out an update
to ail its neighbors with the new topology or routing information
(see Figure 1). If the input event was a query from a neighbor,
node i sends a reply to that neighbor. Node i is free to process
other input events (updates, queries, or changes in link costs or
status) immediately after sending the update.
Receive link change or
update and feasible
sucmssor exists.
Event
lir lk-cost change or
‘The reader is referred to a number of previous publications on the subject
[JAFF-821 IJAFF-861 [GARC-88a] [GARC-891.
3The reader is referred to previous publications for a detailed description
of these mechanisms [JAFF-861 [MCQU-SO].
Figure 1: Active and Passive States in DUAL
216
sends a reply to its old successor if the link with that neighbor
still exists. Node i sends out an update if any change was made
to Dj and distance vectors are used. On the other hand, if no
feasible successor is found, node i sets o$ = 3 and T$. = 1 for
all Ic E N; and sends another query, which is the query that
came from its successor after node i had become active. When
distance vectors are used, the new query specifies Dj = DfF + 1;;
alternatively, if link states are used, node i’s query contains the
contents of its link-update list.
(2)
(1)
a
d
b
f
(0)
(3)
C
e
a
(2)
(1)
Figure 2: Network Used in Examples
X3.7
Link
Failures
and Additions
Special measures must be taken for the processing of link failures
and additions, to ensure that a node i does not wait forever
for a reply from a neighbor and uses no upstream neighbors as
successors.
-
i
When node i is passive and detects that link (i, Ic) has failed,
it updates its link-status and topology tables with 16 = 00; if
distance vectors are used, it must also set Dj, = co. After that,
node i carries out the same steps used for the reception of an
update in the passive state.
a
When node i is active and detects the failure of link (i, Ic) and
Ic # s$, it updates its link-status and topology table accordingly,
and sets T& = 0, to account for the fact that node i should not
expect a reply from node Ic. No other action is needed, unless
node i becomes passive after setting r$ = 0.
e
b
C
d
-
-
i
0
a
1
b
1
C
2
d
3
e
';b
1
f
2
f
-
-
64
On the other hand, when node i is active and detects the
failure of link (i, k) and Ic = sj, node i updates its link-status
and topology table accordingly, and assumes that its successor
replied to its query (i.e., node i sets T& = 0). In addition, node i
must make sure that, when it receives all the replies of the query
for which it is active, it will not choose a neighbor upstream by
mistake. To achieve this, node i remembers the distance from
its successor to the destination when node i became active in the
first place (Of), and sets 03 = 2. No other action is needed,
unless node i becomes passive after setting T$, = 0.
Routing Table
-
-
i
Topology Table
-
-
S;
a
d
a
-
-
-
e
f
00
00
1
00
00
C
1
00
1
d
00
1
1
e
00
0
1
00
1
0
a
b
When node i is active and detects the establishment of link
(i, k), it updates its link-status and topology tables as needed.
Node i sets TL~ = 0 and Dik = 00 for all m E N, to account for
the fact that node R does not owe any reply to node i and has not
reported any distances since the new link was established. After
that, node i sends an update to node k that contains node i’s
entire routing or topology table, depending of whether distance
vectors or link states are used.
4
I);
1
Topology Table
Routing Table
-
0
b
f
-
00
-
W
Figure 3: DUAL Tables at Node a
In the two examples presented in this section, we consider
the case in which the link (f, e) fails. We note that there are
two queries originated by this failure-one
by node f and one
by node e. However, to compare the behavior of DUAL when
distance vectors and link states are used, we focus our attention
on the query originated by node f, and quantify the number of
steps and messages that it takes for all the nodes to recompute
the successor tree for node e.
Examples
In this section, we present examples of the way in which DUAL
operates after resource failures or link-cost increases. Figure 2
depicts a small network with six nodes. Links are assumed to
have unit cost. An arrowhead in the directed link from node z to
node y in the figure indicates that node y is the successor toward
node e (i.e., S: = y). The label assigned to node z indicates the
value of 0,“.
Figure 4 shows the changes in the routing-table entry for node
e at each node when distance vectors are used. An update is
indicated by an arrow adjacent to the link where it is transmitted,
followed by a “U”. Replies and queries are similarly identified
with an “R” and a “Q”. We assume that a reply contains an
update list with updated routing-table entries.
Figure 3(a) illustrates the routing table and topology table
maintained at node a when distance vectors are used. Figure 3(b)
illustrates the same tables when link states are used.
217
After node f detects the failure of link (f, e), it determines
that it has no alternate feasible successor to node e (i.e., D,f, >
DL and DL > DL). Accordingly, it becomes active by setting
rf, = rid = 1 and r,f, = 0 (the last flag is set to 0 to account for
the link failure). Node f also sets o{ = 0 to account for the fact
that it does not have to send a reply to node e when it becomes
passive again. When node e becomes active, it sends a query to
its neighbors (nodes d and c). We note that the query originated
by node f needs only to contain an entry for node e, because the
failure of link (e, f) does not affect the distance from node e to
the rest of the other nodes.
=-L \’
NOT’CONSIDERED
<
-
Node d simply sends a reply to node f assoon as it processes
its neighbor’s query, because its feasible successor (node e) is
not affected. Node c also sends a reply to node f, because it
is able to find a new feasible successor to node e (D&, 5 D&).
Accordingly, nodes c and d remain passive. When node f receives
the two needed replies to its query, it knows that it does not have
to send a reply to its current successor (node e), because o! = 0.
It then changes its successor to node e, and sends an update to
nodes c and d. The algorithm terminates when these two nodes
update their topology tables using node f’s update.
W
(1)
We want to point out that, because of the connectivity among
nodes e, f, and d, the query originated by node e would likely
refer only to node f, and would not affect the routing table of
node d.
(2)
W
Figure 4: Example of DUAL
Operation
Figure 5 shows the same example when link states are used.
In Figure 5, the queries and updates specify an infinite cost for
link (e, f). We assume the use of an update forwarding mechanism similar to the one used in the new ARPANET
routing
algorithm [MCQU-SO] [SCHW-861, whereby an update needs to
flow on each link only once in each direction, and a node never
forwards an update to the node from which it came.4
with Distance Vectors
Notice that the only difference in the two cases is the fact
that the updates originated by node f and node e propagate to
all network nodes. We point out that the query originated by
node e would cause the same number of updates created by the
query originated by node f.
5
CONSIDERED
Q
A Sufficient
Condition
for Loop
Freedom
Using Dijkstra and Scholten’s diffusing computations technique
can be applied to achieve loop freedom in distributed routing
algorithms.
However, it introduces additional communication
overhead, because, when a node needs to modify its routingtable entry for a given destination, a query must be propagated
(1)
03
Operation
of DUAL
Throughout this section, we refer to a node’s state (active or
passive) with respect to destination node j, and refer to node
j, distance to node j, SDAG for node j, successor toward node
j, and path to node j, simply by destination, distance, SDAG,
successor, and path, respectively.
Furthermore, we denote by
p;(t) node z’s preferred path to destination j at time t, and by
P!‘(t) the path from node x to destination j known by node y
at time t.
5.1
Figure 5: Example of DUAL
Correctness
‘Although such a mechanism provides substantial savings in communication overhead, it adds substantial complexity in the implementation of the
algorithm [PERL-831 [JAFF-861.
with Link States
218
If a loop is to be formed at time t because of node i’s action,
P,“(t) must include P:(t).
If Pia
= 4(t), it follows that
Djo(t) = DT(t’) = D&(t) > D&(Y), because no loop exists before
time t and all link costs are positive. This, however, contradicts
FC*, and so it follows that FC*is sufficient for this case.
to all nodes upstream of that neighbor in the destination’s SDAG.
Hence, the question arises as to whether it is possible for nodes to
modify the SDAG of a given destination without any internodal
coordination, and still maintain loop freedom at every instant.
Consider a network G’ with an arbitrary topology; and assume that an arbitrary link-state or distance-vector algorithm is
used, such that (1) there is no internodal coordination for the
updating of routing tables and (2) each update is processed independently of others. The following theorem demonstrates that
FC*, as defined below, is sufficient to ensure loop freedom at
every instant in G”.
Consider now the case in which PT(t’) # P;“(t). Let sf(tl) =
old and let us assume that node x changes its successor exactly k times (where k 2 1) in the time interval (tl, tz), where
newk are node x’s consecutive successors within
newl, news,...,
that time interval, and newk = sT(tz). Then, FC* requires that
the following inequalities be satisfied:
Definition
7 FC*(Feasibility
Condition with no Internodal Coordination): Consider a node i that has as its successor to node j
node F at time t’ (i.e., s$(t’) = F) and Dj(t’) < 03. Consider also
that, at time t > t’ node i detects a change in D&(t’) + l$(t’).
Node i can choose as its new successor to node j any neighbor
q E N; for which L$(t) + Z:(t) = Min{D~,(t)
+ Ii(t) 1 z E
N; and D:,(t) 5 DiF(t’)} < co.
. . . I D;newl(tnew,)i D;odh),
wheretnew,< tnew,+,is
new,
the time when node z makes node
its successor.
Accordingly, because alI link costs are positive, we can traverse the directed path P;“(t)c P:(t) at time t and find that the
following inequalities must hold:
1 Zf Sj(G’) is loop-ffee, no loop can be created by
increasing any of its nodes’ distances while maintaining the same
successors. 0
Proposition
DjJt)
= Dja(t’) > D;+,,d)(t’)
The reason for this is that S;(G*) does not change.
2 Zf a loop Lj(t) is formed in Sj(G*) for the first
time at time t, then some node i E Lj(t) must choose an upstream
node as its successor at time t. q
Proposition
This is evident from the fact that Sj(G*)
acyclic before Lj(t) is created.
Theorem
suficient
is directed
D”(k-l,new)
Js(k,old)
and
b(k,
old))
1
D$$;$‘)(t)
>
Djd(a(k~~)old)(t~(k+l,old))
1 Using FC* when nodes choose their successors is
to ensure that Sj(G*) is loop free at every instant. 0
D?!“(t)
3%
Proof: Note that, because we can assume any distance-vector
or link-state algorithm, we can assume that each node in G*
knows an entire path from any other node to the destination.
However, this information may be out of date; this forms the
basis of our proof, which is by contradiction [GARC-88a].
where
> Di32(tz) 2 Di3a(t) ’
g+lvne4
3 d(k,newJ (t) indicates that node s(k - 1, new) is the kth
hop in the path P:(t) at time t and has node s(k, new) as its
successor in the path. Similarly, D$;~~~~‘(&ck,
&)) indicates
that node s(k- 1, new) had node s(k, old) as its successor at time
h(k,
old) < t, where
t,(k,
old) is the time at which node s(k- 1, new)
updated its distance table for the last time prior to time t because
of an update from its neighbor s(k, old). Lastly, so”’
= i,
s$)
= x, and t, < t.
Assume that, before time t, Sj(G*) is loop free at every instant. Furthermore, assume that a loop Lj(t) is formed in Sj(G*)
at time t. From propositions 1 and 2, we know that at least one
node must change its successor at time t and choose an upstream
neighbor for a loop to be formed. Therefore, we only need to
show that FC* is sufficient to ensure loop freedom when at least
one node in G* changes its successor at time t.
Because these inequalities lead to the erroneous conclusion
that D&(t) > D:,(t), it follows that no loop can be formed in
Sj(G*). Therefore, it follows that FC*is sufficient in this case
too.
Let node i be the node that creates Lj(t) when it makes
node a its new successor at time t, when it detects an increase
in Dj = Df* + ld, where s = sf # a. FC* requires that DJ(t’) =
of,(t) 5 D$(t”) = Di(t” - Q). In this expression, t’ < t is the
time of the last update from node a = s:(t) that is processed by
node i prior to time t. Similarly, t” - Eb is the time of the last
update from node b = si(t”) that is processed by node i at time
1” < t.
The operation of any routing algorithm is such that, when
the network is first initialized, all each node in G’ knows is how
to reach itself; this is equivalent to saying that a node has a
routing-table
entry for each of the other nodes in the network
with infinite distance and no successors to them. Hence, at time
0, Sj(G*) is a disconnected graph of one or more components,
each with a single node, and must be loop-free. Therefore, we
conclude that FC*is sufficient. 0
219
There are three types of action that node i can take when it
becomes passive, depending on the value of o$. If o: = 2, then
node i cannot change its successor, unless it finds a neighbor
that satisfies FC using the distance from its successor to the
destination before node i became active (0:).
However, this is
equivalent to the third case above, for which we showed that no
loop can be created.
Although FC* ensures loop freedom at every instant, it does
not guarantee shortest paths in the resulting SDAG. To obtain
loop freedom at every instant and shortest paths in the SDAG of
each destination, DUAL modifies FC*, as defined in FC. When a
node i attempts to update its successor and FC is not satisfied,
DUAL uses diffused computations to distribute the new path
length perceived by node i through its current successor to all
upstream nodes, forcing node i to learn all new distances from
upstream neighbors to the same destination before it is allowed
to change the SDAG.
5.2
If O$ equals 1 or 3 and node i becomes passive at time t,
then node i can create no loop if either node b remains node
i’s successor or node i has no successor. Therefore, if a loop is
to be created by node i at time t, it must occur when node i
chooses a new successor without using FC after receiving the
last reply to its query or detecting the failure of the link with
the neighbor (different from node b) whose reply was pending.
However, it follows from the correctness proof of DSA [DIJK-801
[CHAN-821 that node i cannot receive all the replies it needs from
its neighbors before each one of its upstream neighbors forwards
node i’s query to all its neighbors and receives the corresponding
replies from them, including node i. Furthermore, an active node
cannot send new routing or topology information in an update
or a new query, until it receives the replies to the query for which
it is active. Accordingly, when node i gets the replies from all its
neighbors, each must either not be upstream of node i any more
or must have reported to node i a distance to the destination
that includes its distance to node i. Therefore, no loop can be
created in this case.
Loop F’reedom in DUAL
Because a routing-table loop is created with respect to a particular destination, and because nodes coordinate with respect to
the same SDAG, we can focus on a single SDAG to show that
DUAL is loop-free at every instant.
We refer to node i and the set of nodes upstream of node i
that become active because of a query originated at node i as
active SDAG of node i, and denote it by Sji(G).
Lemma 1 Consider a network G in which DUAL is executed. If
only a single difised SDAG computation takes place at any one
time, then Sj(G) is loop-free at every instant. 0
Proof:
It follows from the above that no loop can be formed when
Sj(G) is loop-free before time t and a single diffused SDAG computation occurs. However, as pointed out in the proof of Theorem 1, Sj(G) must be loop-free at time t = 0, and so the lemma
is true. 0
By contradiction.
Assume that Sj(G) is loop free up to time t and that node i
is the first node in G to create a loop in Sj(G) at time t, after
processing an input event. Because an active node cannot change
its successor after processing an input event, unless it becomes
passive, it follows that, for node i to create a loop at time t, it
must either be passive or become passive at time t. Let node b
be sj when node i decides to change its successor at time t.
Lemma 2 DUAL considers each diffused SDAG
individually and in the proper sequence. 0
Proof:
Assume that node i is passive at time t, when it processes an
input event. According to DUAL, D?(t)
= D$,(t) and node i
can carry out one of the following actions:
1. Determine
computation
Consider the case in which the network has a stable topology
and node i is the only node that can start diffused SDAG computations after detecting changes in the cost or status of adjacent
links. If node i generates a single diffused SDAG computation,
it follows from the previous lemma that every node in Sji must
process node i’s query within a finite time and in the proper
sequence, because there is only one diffused computation.
that node b satisfies FC.
2. Determine that none of its neighbors satisfies FC and become active.
3. Find a neighbor a # b that satisfies FC.
Assume that node i can generate multiple queries. Whether
distance vectors or link states are used, an active node cannot
send a new query or update before it receives all the replies to
the query for which it is active. Therefore, because all the nodes
in S+(G) process each input event in FIFO order and all links
transmit in FIFO order, it follows that all the nodes in Sji(G)
must process each diffused SDAG computation individually
and
in the proper sequence.
In the first case, node i cannot create a loop at time t, because
no link is changed in Sj(G). In the second case, node i cannot
create a loop at time t, because it must keep the same successor
and cannot, therefore, modify Sj(G). In the third case, either
all the nodes in the new path from node i to node j (P:(t))
are
passive, or at least one node in that path is active. It follows
from the proof of Theorem 1 that node i cannot create a loop at
time t when it switches successors according to PC if all nodes in
P;(t) are passive. On the other hand, any active node 2 E Pj(t)
in the only diffused SDAG computation in G must become active
without changing its successor, which had to be chosen according
to FC. Accordingly, if we traverse P,‘(t) C Pj”(t) at time t, we
arrive at the erroneous conclusion that D;,,(t) > o&(t), just as
we did in the proof of Theorem 1.
Consider now the case in which multiple sources of diffused
SDAG computations exist in the network, but no topological
changes can take place. If any two of the active SDAGs of G have
an empty intersection, then we can conclude from the previous
case that the lemma is true. Hence, in the absence of topological
changes, incorrect treatment of queries can only originate at a
node that belongs to an intersection of active SDAGs, i.e., one
that is waiting for replies to two consecutive queries. However,
this is impossible by design.
220
that at least one of its upstream neighbor nodes, ur , never sends
a reply for node i’s query. In turn, this implies that at least
one of node ur’s upstream neighbors, us, never sends a repIy
for node ~1’s query. If we pursue the same line of argument, it
follows that, because Sj(G) is finite and loop-free at every instant
of time (as shown before), there must be a node u, that never
leaves the active state (i.e., never unfreezes sr=(t”), t” > t’), such
that no node in G is upstream of it in Sj(G), which we have
shown above to be impossible. Therefore, because SDAGs are
computed independently for each destination, a finite time after
t, no node can be active.
Finally, consider the possibility of topological changes-link
or node additions and deletions. Because the addition and deletion of nodes are handled by DUAL as multiple link additions or
deletions, we only have to consider link changes in our proof.
For a given destination, a node can become active in only
one diffused SDAG computation at a time, and can, therefore,
expect at most one reply from each neighbor. Accordingly, when
an active node loses connectivity with a neighbor different from
its successor while it is expecting a reply from it, the node can
simply assume that its neighbor sent a reply indicating an infinite
distance. In addition, when an active node i loses connectivity
with its successor, it assumes that its successor sent its reply and
sets oj = 2. So, when node i becomes passive again, it attempts
to find a neighbor that satisfies FC using the distance reported
by its old successor before node i became active in the first place
(i.e., Dy). Accordingly, the FIFO order in which node i processes
queries does not change with link failures.
Next, we observe that the updated value of a link cost is
broadcast to every node in the network within a finite time after
the link changes its cost, because no node can be active forever,
because we assume a reliable flooding mechanism of messages
when nodes are passive or become passive, and because a node
that becomes passive processes all link-cost updates in its linkupdate table. Hence, a finite time after the occurrence of all
topological changes, every node in the network has an up-to-date
link-cost matrix.
Because all nodes must become passive and
have up-todate information in their link-cost matrices a finite
time after t, and because each executes a finite-time, shortestpath-first algorithm, it follows that the theorem is true. q
On the other hand, when node i is active and establishes connectivity with a node k, it treats the new neighbor as one that
already sent its reply to the pending query. It thus follows that
the FIFO order in which node i processes queries does not change
with the establishment of new links or link failures. Hence, because the occurrence of topological changes does not change the
order in which a node processes or forwards queries, it follows
that the lemma is true. q
Theorem
2 Consider a network G in which DUAL
Sj(G) is then loop-free at every instant. 0
Proof:
5.3
The proof follows directly
Convergence
6
Algorithm
Complexity
is ezecuted.
In this section, we analyze the complexity of DUAL when distance vectors and link states are used, and compare these two
cases with each other and with traditional distance-vector and
link-state algorithms.
For this comparison, we assume that all
the algorithms under consideration behave synchronously, so that
every node in the network executes a step of the algorithm simultaneously at fixed points in time. At each step, a node processes
a single input event (a message from a neighbor node or a change
in link status or cost). The first step occurs when at least one
node detects a topological change and issues updates or queries
to its neighbors. During the last step, at least one node receives
and processes updates or replies from its neighbors, after which
all routing tables are correct and nodes stop transmitting update
messages until a new topological change takes place.
from lemmas 1 and 2. 0
of DUAL
We have shown that DUAL provides loop-free routes at every
instant. However, we have not yet shown that it converges (i.e.,
that every node stops sending update messages and all routing
tables contain correct values afterwards) a finite time following
an arbitrary sequence of topological changes in the network. We
consider the convergence of DUAL in a network G that is in
equilibrium at time 0 and is subject to a finite number of arbitrary topological changes in the time interval (0, t), with no more
changes taking place afterward.
Our comparison is made in terms of the number of steps and
number of messages required for each algorithm to converge when
a single link failure occurs after all the network nodes have obtained optimal paths to every destination. We refer to the number of steps required by an algorithm as its time complezity (or
TC), and to the number of messages it requires as its communication complezity (or CC). We also consider the storage space
required by each algorithm, which we calI storage complezity (or
SC).
Theorem
3 Consider network G in which DUAL is executed.
A finite time after time t, no new update messages are being
transmitted or processed by nodes, and all entries in all routing
tables are correct. q
For the case in which distance vectors are used,
Proof:
the reader is referred to the convergence proof of the loop-free,
distance-vector routing algorithm presented elsewhere by the author [GARC-88a].
When distance vectors are used, DUAL has SC = O(]N] .
D), where IN] is the number of network nodes and D is the
maximum degree of a node, and has the same worst-case time
complexity as the Jaffe-Moss algorithm, which is TC = O(z),
where z is the number of nodes in an active SDAG. To verify this,
we note that, in the worst case (e.g., the failure of a destination
node) all nodes upstream of a destination node i in Sj(G) must
become active. DUAL requires O(Z) messages to converege when
distance vectors are used [JAFF-821. Traditional distance-vector
algorithms have TC = O(]N]) and CC = O(]N12) [SCHW-861,
and similar SC than DUAL.
For the case in which link states are used, we first observe
that, according to DUAL’s operation and our assumption that
messages are transmitted reliably, a node e E G for which no
node in N, is upstream of it in Sj(G) must receive all the replies
from its neighbors within a finite time after it sends a query. Let
us now assume that there is a node i that never becomes passive
after becoming active (when it freezes si(t’), t’ > 0). This implies
221
initiated downstream of it, and has received the replies for all
the queries initiated by it. Unfortunately,
this technique fails
to improve performance in all cases in which multiple resource
failures occur and transmission times vary. The reason for this
is that a node (call it n) in the wait state that detects the failure
of the link with its successor must assume that its successor has
sent its reset; otherwise, the node would be waiting for that reset
forever. Hence, even if node n sends another query to account
for the link failure and does not send a reset until it receives
the replies for the new query, all of this occurs asynchronously
from node n’s old successor. Accordingly, when node n becomes
passive and propagates the new reset message upstream, this
may cause a node upstream of n to choose as its new successor
a neighbor that has not heard the query originated downstream
from node n that made that node become active in the first place.
On the other hand, when link states are used, DUAL has
SC = O(]N]‘), because every node maintains a link-cost matrix.
Typically, D << IN], and storage requirements are much larger
when link states are used. Because any change in the cost or
status of a link must be propagated to all the nodes when link
states are used, DUAL has TC = O(y) in this case, where y is
the maximum of z, the number of nodes in the active SDAG, and
d, the diameter of the network. For the same reason, DUAL has
CC = O(]E]) (although additional queries and replies must be
transmitted in addition to the flooding of updates, this requires
O(z) messages and typically O(]E]) > O(z)). Traditional linkstate algorithms have TC = O(d), and similar SC and CC than
DUAL.
In a typical packet-switching
network in which there is more
than one path from a node to each destination, DUAL can perform much better with distance vectors than with link states after
three types of topological changes-link
failures that do not partition the network, link cost changes, and link additions. The
reason for this is that updates travel only as far as necessary
when distance vectors are used. No more than one update may
be needed in this case, while at least IN] messages have to be
exchanged when link states are used. Furthermore, nodal coordination in DUAL is much stronger when link states are used,
because a node must carry out the computation of the SDAGs
for all destinations in parallel after a topological change occurs.
7
Another way of enhancing DUAL’s performance is by means
of probe messages used only after link failures occur. According to
this approach, a passive node i that detects the failure of its link
to a destination or receives an update from its successor reporting
an infinite distance sends a probe message to the destination if
and only if it has a neighbor n that satisfies FC. Node i uses node
n as its new successor if it receives a positive answer message
from the destination.
Node i forwards the query if it receives
no answer message within a hold-down time (proportional to an
estimated transmission time per link times twice the distance
to the destination through node n) or receives an update from
node n reporting an infinite distance to the destination.
The
same procedure is followed by an active node k with an infinite
distance to the destination that receives the last reply to its query
and has at least one neighbor that has reported a finite distance
to the destination.
A probe message asks a destination for an
answer stating its existence.
Enhancements
The algorithm that we have presented in Section 3 is not meant
to be very efficient in terms of the number of messages exchanged
among nodes to update their routing tables. The reason for this
is that we wanted to describe DUAL independently of the type
of information stored in topology tables. A number of enhancements can be made to DUAL to make it more efficient. For instance, when distance vectors are used, the number of messages
exchanged after a link-cost increase or a resource failure can be
reduced by allowing each message to contain update, query, or
reply entries, each with an updated distance to a destination.
Similarly, for the case of link states, we have assumed that a node
must be passive with respect to all destinations (i.e., idle) to be
allowed to change its successor to a given destination without
coordinating with other nodes. Less restrictive link-state algorithms can be designed by requiring nodes to keep information
as to which routing-table entries are affected by which queries.
Using probes reduces the time complexity of DUAL to O(h)
after a single link or nodal failure. The reason for this improvement is that a node does not change its successor after it reaches
an infinite distance, unless it receives a positive answer from a
this technique fails
neighbor that satisfies FC. Unfortunately,
to enhance the performance of DUAL when a node i receives a
positive answer from one of its neighbors n and then a link in the
path from n to the destination fails. This technique is based on
the algorithm proposed by Naylor [NAYL-751 for loop-freedom,
in which a loop-check message is sent out by a node every time
it has more than two neighbors and must change its successor.
However, in the same way that probes fail to enhance DUALS
performance in some cases in which link failures occur, Naylor’s
algorithm fails to eliminate looping or counting-to-infinity.
More importantly, we note that the performance of any routing algorithm based on diffusing computations can deteriorate
dramatically in the case of node failures or network partitions, in
which all the remaining nodes in the network must be involved
in a diffusing computation.
Jaffe and Moss[JAFF-821 have suggested the addition of a wait state to improve the performance
of their algorithm. With this added state, when an active node
receives all the replies to its query, it waits for a reset from its
successor. When the node that initiated the diffused SDAG computation receives all the replies to its query, it sends out a reset
to its neighbors and becomes passive. In the absence of link or
nodal failures, this enhancement reduces the time complexity of
the Jaffe-Moss algorithm to O(h), where h is the length of the
longest chain of the set of nodes participating
in the diffused
SDAG computation. The same improvement would be obtained
in DUAL. For the case of multiple failures, a node remains in wait
state until it receives the reset messages for each of the queries
8
Conclusions
This paper shows that developing solutions for looping problems
of traditional distributed routing algorithms can be done independently of whether link states or distance vectors are used. In
the preceding sections, we presented a new approach for loopfree routing in computer networks that can be applied to either
distance vectors or link states. This approach is based on the
concept of diffusing computations, first introduced by Dijkstra
and Scholten.
We presented a new loop-free routing algorithm (DUAL) and
verified that it provides loop freedom at every instant and correct routing tables within a finite time after a sequence of topo-
222
[GARC-88b]
J.J. Garcia-Luna-Aceves,
“Routing Management in
Very Large Scale Networks,” in Future Generation
Computer Systems, Vol. 4, No. 2, September 1988,
pp. 81-94.
[GARC-891
J.J. Garcia-LunaAceves,
“A Minimum-Hop Routing Algorithm Based on Distributed Information,”
in Computer Networks and ISDN Systems, NorthHolland, Vol. 16, No. 5, May 1989, pp. 367-382.
[JAFF-821
Our new approach is important for very large networks and
internetworks, in which the consequences of the looping problem
would be unacceptable. In such environments, our new approach
should be used in combination with a hierarchical network organization to minimize routing overhead [JAFF-881 [GARC88b].
J.M. Jaffe and F.M. Moss, “A Responsive Routing Algorithm for Computer Networks,” in IEEE
Transactions on Communications,
Vol. COM-30,
No. 7, July 1982, pp. 1758-1762.
[JAFF-861
Finally, although we have adopted diffusing computations as
the mechanism to ensure the termination of distributed computations, the reader is reminded that there are other techniques
proposed in the literature (e.g., time stamping) to handle the
problem.
J.M. Jaffe, A.E. Baratz, and A. Segall, “Subtle Design Issues in the Implementation
of Distributed,
Dynamic Routing Algorithms,,, in Computer Networks and ISDN Systems, Vol. 12, No. 3, 1986, pp.
147-158.
[JAFF-881
J.M. Jaffe, “Hierarchical Clustering with Topology
Databases,‘, in Computer Networks and ISDN Systems, Vol. 15, pp. 329-339, 1988.
[MC&U-801
J. McQuillan,
et al., “The New Routing Algorithm for the ARPANET,”
in IEEE Tmnsactions
on Communications, Vol. COM-28, May 1980.
[MERL-791
P.M. Merlin and A. Segall, “A Failsafe Distributed
Routing Protocol,” in IEEE Transactions on Communications, Vol. COM-27, No. 9, September 1979,
pp. 1280-1288.
[NAYL-751
W.E. Naylor UA Loop-Free Adaptive Routing Algorithm for Packet Switched Networks,,, Proceedings
of the Fourth Data Communications Symposium,
Quebec City, Canada, October 1975, pp. 7.9-7.15.
[PERL-831
R. Perlman, “FauIt-Tolerant
Broadcast of Routing
Information,” in Proceedings IEEE INFOCOM ‘83,
San Diego, California, 1983.
[SCHW-861
M. Schwartz, Telecommunication Networks: Protocols, Modeling and Analysis, Chapter 6, Menlo
Park, California: Addison-Wesley Publishing Co.,
1986.
[SEEG-861
J. Seeger and A. Khanna, “Reducing Routing Overhead in a Growing DDN,” in Proceedings MILCOM
‘86, Monterey, California, October 1986, pp. 15.3.115.3.13.
logical changes. We compared the complexity of the algorithm
when link states and distance vectors are used, and showed that
using distance vectors for shortest-path computation in DUAL
is better than using link states, insofar as the algorithm’s complexity is concerned. DUAL compares favorably with respect
to traditional distance-vector and link-state algorithms and all
previous loop-free routing algorithms reported in the literature
[JAFF-821 [MERL-791 [GARC88a].
In addition, it requires less
storage overhead than the Jaffe-Moss algorithm, because it does
not require the storage of bit vectors to handle multiple diffusing
computations.
References
[CHAN-821
K.M. Chandy and J. Misra, “Distributed
Computation on Graphs: Shortest Path Algorithms,,,
in Communications of the ACM, Vol. 25, No. 11,
November 1982, pp. 833-837.
[CEGR-751
T. Cegrell, “A Routing Procedure for the TIDAS
Message-Switching
Network,” in IEEE Tmnsactions on Communications,
Vol. COM-23, No. 6,
June 1975, pp. 575-585.
[DIJK-591
E.W. Dijkstra, “A Note on Two Problems in Connection with Graphs,” in Numerische Mathematik,
Vol. 1, pp. 269-271, 1959,
[DIJK-801
E.W. Dijkstra and C.S. Scholten, “Termination Detection for Diffusing Computations,,’
in Information Processing Letters, Vol. 11, No. 1, August 1980,
pp. l-4.
[FORD-621
L.R. Ford and D.R. Fulkerson, Flows in Networks,
Princeton, New Jersey, Princeton University Press:
1962.
[GARC-87]
J.J. Garcia-LunaAceves,
“A New Minimum-Hop
Routing Algorithm,”
in Proceedings of IEEE INFOCOM ‘87, San Francisco, California, April 1987.
[GARC-SSa]
J.J. GarciaLuna-Aceves,
“A Distributed
LoopFree, Shortest-Path Routing Algorithm,,, in Proceedings IEEE INFOCOM ‘88, March 1988.
223
[SHIN-871
K.G. Shin and M. Chen, “Performance Analysis of
Distributed Routing Strategies Free of Ping-PongType Looping,” in IEEE Tmnsactions on Computers, Vol. COMP-36, No. 2, February 1987, pp. 129137.
[STER-801
T.E. Stern, “An Improved Routing Algorithm for
Distributed
Computer Networks,” IEEE International Symposium on Circuits and Systems, Workshop on Large-Scale Networks and Systems, Houston, Texas, April 1980.
Download