IEEE TRANSACTIONS ON COMPUTERS, VOL. 41, NO. 5, MAY 1992
Implementation of On-Line Distributed
System-Level Diagnosis Theory
Ronald P. Bianchini, Jr., Member, IEEE, and Richard W. Buskens, Student Member, IEEE
Abstract: There has been significant theoretical research in
the area of system-level diagnosis. This paper documents the first
practical application and implementation of on-line distributed
system-level diagnosis theory. Proven distributed diagnosis algorithms are shown to be impractical in real systems due to high
resource requirements. A new distributed system-level diagnosis
algorithm, called Adaptive DSD, is shown to minimize network
resources and has resulted in a practical implementation.
Adaptive DSD assumes a distributed network, in which network nodes can test other nodes and determine them to be faulty
or fault-free. Tests are issued from each node adaptively, and
depend on the fault situation of the network. Test result reports
are generated from test results and forwarded between nodes in
the network. Adaptive DSD is proven correct in that each faultfree node reaches an accurate independent diagnosis of the fault
conditions of the remaining nodes. No restriction is placed on the
number of faulty nodes; any fault situation with any number of
faulty nodes is diagnosed correctly.
The Adaptive DSD algorithm is implemented and currently
monitors over 200 workstations in the Electrical and Computer
Engineering Department at Carnegie Mellon University. The
algorithm has executed continuously for the past year, even
though no single workstation has remained fault-free over that
period. Key results of this paper include: an overview of previous
distributed system-level diagnosis algorithms, the specification of
a new adaptive distributed system-level diagnosis algorithm, its
comparison to previous centralized adaptive and distributed nonadaptive schemes, its application to an actual distributed network
environment, and the experimentation within that environment.
Index Terms: Adaptive diagnosis, diagnosis algorithms, distributed computer networks, self-diagnosable systems, system-level diagnosis.

I. INTRODUCTION

CONSIDER a network of distributed nodes in an interconnection network. Each node is assigned a fault state, sa, such that sa ∈ {fault-free(0), faulty(1)}. Interconnect, or link, faults are not considered in this model. The fault situation of the network is the set of node fault states. Nodes perform tests of other nodes; a test of node nb by na is identified tab such that tab ∈ {fault-free(0), faulty(1)}. The syndrome of the network is the set of all test results. Diagnosis is the determination of the fault situation of a network given its syndrome. Fig. 1 presents an example network. Node labels identify the fault state of each node. An arc from na to nb represents a test performed by na of nb and is labeled with the test result.

Fig. 1. Example diagnosis network.

The fault model of the network characterizes the outcome of test results given the fault status of the nodes involved in the tests. This work assumes the "symmetric invalidation" fault model of system diagnosis [12]. In this model, the outcome of a test performed by a fault-free node is accurate and equals the fault state of the node being tested. Tests performed by faulty nodes are inaccurate and results of such tests may be arbitrary.

Classical system-level diagnosis research [12], [4] assumes a central observer that performs diagnosis. This observer accurately receives all test results and uses the results to determine the fault situation. The work presented herein assumes a distributed model, such that each node performs its own local diagnosis of the network. In general it is not practical for every node to perform diagnosis by testing every other node in the network. Thus, each node performs tests of only a subset of the nodes and receives test result reports from other nodes about nodes that it does not test. Report validation is required since faulty nodes can distribute inaccurate test result reports. Typically, report validation requires additional testing. Fig. 2 illustrates an example of utilizing test result reports to perform diagnosis. In the figure, nodes nk and nm are faulty; all other nodes are fault-free. Node ni tests both nj and nk and determines nj to be fault-free and nk to be faulty. Since ni determines nj to be fault-free, ni can utilize test result reports from nj. Thus, ni can correctly diagnose nl as fault-free and nm as faulty without directly testing those nodes. Since ni determines nk to be faulty, ni cannot diagnose the state of nn.

System-level diagnosis research was introduced by Preparata, Metze, and Chien in [12]. They defined t-diagnosability as the ability to diagnose a fault situation with t or fewer faults given the syndrome of a network. They proved the following necessary condition given a fixed inter-node testing assignment: if a network is t-diagnosable, then every node must be tested by at least t other nodes. Hakimi and Amin [4] proved that this condition is sufficient if no two

Manuscript received July 8, 1991; revised December 4, 1991.
The authors are with the Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213.
IEEE Log Number 9108213.
0018-9340/92 $03.00 © 1992 IEEE
Fig. 2. Forward fault-free paths from ni.
nodes test each other. Hakimi and Nakajima [15] showed that
the number of tests required for diagnosis could be reduced
by dynamically adapting the testing assignment based on the
fault situation. Their work assumed the presence of a central
observer to determine which inter-node tests were necessary
for diagnosis. There has been a large body of further theoretical
developments [7], including the diagnosability of new failure
modes [14] and research in distributed diagnosis algorithms
[8], [9], [6], [1], [2].
The remainder of this paper is outlined as follows. Section II
provides an overview of two important distributed system-level diagnosis algorithms, NEW-SELF and EVENT-SELF,
and describes major drawbacks of implementing the algorithms. In Section III, the Adaptive DSD algorithm is introduced and implementation enhancements are given. A performance comparison between the major distributed diagnosis
algorithms is made in Section IV. Section V describes an
implementation of Adaptive DSD, and provides experimental
results based on that implementation. Concluding remarks are
given in Section VI.
II. DISTRIBUTED SYSTEM-LEVEL DIAGNOSIS

A. The NEW-SELF Algorithm
On-line distributed diagnosis algorithms are given in [8],
[9], [6], [1], and [2]. The SELF distributed diagnosis algorithm
was presented by Hosseini, Kuhl, and Reddy [8]. In that work
it is assumed that the maximum number of faulty nodes is
bounded by a predefined limit, t, and there is a fixed testing
assignment such that a node is responsible for testing a fixed
set of neighboring nodes. Fault-free nodes forward test result
reports to neighboring nodes; reports reach nonneighboring
nodes through intermediate nodes. No assumption is made
about faulty nodes, which may distribute erroneous test result
reports. Each node independently determines a diagnosis of
the network utilizing the test result reports that it generates
and receives.
The NEW-SELF on-line distributed diagnosis algorithm is
presented in [6]. The algorithm assumes a fixed inter-node
testing assignment and is executed on-line, permitting node
failure and repair. In the NEW-SELF algorithm each node
tests its neighboring nodes and generates a test result report for
each test result. The report is stored locally, overwriting previous reports concerning the tested node, and is subsequently
forwarded to all testers of the testing node. The algorithm
ensures the accuracy of test result reports by restricting the
forwarding of these reports to occur between fault-free nodes.
A node only accepts information from other nodes that it
tests and determines to be fault-free. As is evident from this
specification, valid test result reports are forwarded between
fault-free nodes in the reverse direction of tests performed by
the nodes. The following testing and report validation scheme
is utilized:
1) ni tests nj as fault-free,
2) ni receives test result reports from nj,
3) ni tests nj as fault-free,
4) ni assumes the diagnostic information received in Step 2 is valid.
This scheme assumes that a node cannot fail and then
recover from that failure in an undetected fashion during the
interval between two tests by another node.
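The four-step validation sequence above can be sketched as follows. This is an illustrative sketch, not the authors' code; `validate_reports`, `test_nj`, and `receive_reports` are hypothetical names standing in for real inter-node tests and message exchange:

```python
# Sketch of the NEW-SELF report-validation scheme: node ni accepts
# reports from nj only if nj tests fault-free both before and after
# the reports are received.

def validate_reports(test_nj, receive_reports):
    """test_nj() performs one test of nj; receive_reports() collects
    the reports nj forwards between the two tests."""
    if test_nj() != "fault-free":          # Step 1: pre-test of nj
        return None
    reports = receive_reports()            # Step 2: accept reports from nj
    if test_nj() != "fault-free":          # Step 3: post-test of nj
        return None                        # nj may have failed mid-transfer
    return reports                         # Step 4: reports are valid

# Example: nj stays fault-free throughout, so its reports are accepted.
reports = validate_reports(lambda: "fault-free", lambda: [("nk", "faulty")])
```

The second test is what rules out an undetected fail-and-recover during the transfer, per the assumption stated above.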
For correct diagnosis, the NEW-SELF algorithm requires
that every fault-free node receives all test result reports generated by all other fault-free nodes. It is proven in [6] that
this condition is satisfied if each node is tested by at least
t + 1 other nodes. This is shown since a fault-free node is
guaranteed to be forwarding test result reports to at least one
other fault-free node if it is tested by t + 1 nodes, where the
number of node failures is restricted to t or fewer.

Although provably correct, the NEW-SELF algorithm has
considerable drawbacks for implementation on a large number
of distributed workstations. Consider a network of N nodes
that is required to be t-diagnosable. The algorithm requires at
least N(t + 1) tests since each node is tested by at least t + 1
other nodes. The number of messages required to transfer the
test result reports is N²(t + 1)². This is shown since each node
generates one test result report for each of the t + 1 nodes it
tests, resulting in N(t + 1) reports. Every node subsequently
forwards all of the messages that it receives to its t + 1 testers.
The number of messages required for algorithm execution is
considerable, even for small networks. For example, a network
of N = 8 nodes with t = 2 diagnosability requires 576
messages.
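The counts above can be sketched directly; the helper names below are ours, not the paper's:

```python
# NEW-SELF resource counts: each of the N nodes is tested by t + 1
# others, and every report is re-forwarded to each node's t + 1 testers.

def new_self_tests(n, t):
    return n * (t + 1)            # every node tested by t + 1 others

def new_self_messages(n, t):
    # N(t + 1) reports, each relayed via t + 1 testers at each of the
    # N nodes: N^2 (t + 1)^2 messages in total.
    return (n * (t + 1)) ** 2

print(new_self_messages(8, 2))    # the paper's example: 576 messages
```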
B. The EVENT-SELF Algorithm
The EVENT-SELF algorithm [1] extended the NEW-SELF
algorithm by addressing the resource limitations of actual
distributed networks. This algorithm utilized “event driven”
forwarding of test result reports to lower the number of
messages required by the algorithm. Test result reports are
only forwarded by a node if they differ from reports already
stored at the node. In this manner, only reports that signify a
new fault event in the network will get forwarded. It is proven
in [1] that there are only two situations when a node must
forward test result reports to its testers. The first is when a
differing test result report is received by the node. This is a
report indicating that a fault event has occurred, representing
a node becoming either faulty or fault-free. In this case, the
report is forwarded to all of the testers of the reporting node.
The second case occurs when a node diagnoses one of its
testers as faulty, and then receives a report that the tester is
fault-free. In this situation, the node must forward all of its
test result reports to that tester, to ensure that the tester has all
current test result reports and can perform its own diagnosis.
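The first of the two forwarding rules, "forward only reports that differ from those already stored," can be sketched as below. This is an illustrative sketch under our own naming, not the authors' implementation:

```python
# Event-driven forwarding: a report is forwarded to a node's testers
# only when it differs from the report already stored for that node.

stored = {}          # tested node -> last stored test result

def on_report(node, result, forward):
    """forward(node, result) sends the report on to this node's testers."""
    if stored.get(node) != result:   # a new fault event: state changed
        stored[node] = result
        forward(node, result)        # forward only on change

sent = []
on_report("n1", "faulty", lambda n, r: sent.append((n, r)))
on_report("n1", "faulty", lambda n, r: sent.append((n, r)))  # duplicate: dropped
print(sent)   # only the first report was forwarded
```

In steady state every arriving report matches the stored one, which is why no report messages flow until a node changes state.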
Using this approach the EVENT-SELF algorithm message
count is significantly reduced from that of NEW-SELF. In
steady state, only tests are performed and no test result
reports are forwarded in the network. Additional messages are
required only when a node changes state. This is denoted by
Δf, the change in the number of faulty nodes, f. For a
t-diagnosable network the number of messages required for
test result reports is reduced to ΔfNt².
A significant improvement in diagnosis latency is also
possible using EVENT-SELF. Diagnosis latency is the time
from the detection of a fault event to the time when all nodes
correctly diagnose the event. The diagnosis latency of the
SELF algorithms corresponds to the time required to forward
the test result reports corresponding to the fault event to all
nodes. In NEW-SELF, when new messages arrive at node ni
from nj, validation of the messages must wait until another
test of nj is accomplished. In EVENT-SELF, validation tests
for messages received from nj are initiated as soon as the
messages arrive. Using this scheme, message validation time
can be significantly less than a testing period.
Fig. 3. Limited diagnosability.

Fig. 4. Data structure maintained at node n2.
C. Drawbacks of the SELF Algorithms
There are two significant drawbacks to the SELF algorithms.
The first drawback is illustrated in Fig. 3 and concerns limited
diagnosability. For any nonadaptive diagnosis algorithm, only
a limited number of node failures can be diagnosed. For a
t-diagnosable network, correct diagnosis is guaranteed for t
or fewer faults. The two faulted nodes, shown shaded in the
figure, result in test result reports that are not forwarded to all
fault-free nodes.
The second major drawback of the SELF algorithms concerns redundancy, in terms of both inter-node testing and
report forwarding. For t-diagnosable systems, each node must
be tested by at least t + 1 other nodes. Ideally, each node
need be tested by only one fault-free node to ensure correct
diagnosis; thus all but one of the t + 1 tests are redundant.
Also, since each node is tested by at least t + 1 nodes, test
result reports are forwarded along redundant paths. Ideally,
only one forwarding path is required.
III. ADAPTIVE DISTRIBUTED SYSTEM-LEVEL DIAGNOSIS

The Adaptive DSD algorithm differs considerably from the SELF algorithms in that the testing assignment is adaptive and determined by the fault situation. Node failures and repairs are considered; link failures are not. The Adaptive DSD algorithm further differs from the SELF algorithms in that the number of nodes in the fault set is not bounded. The remaining fault-free nodes correctly diagnose the fault states of all nodes in the system.

A. Algorithm Specification

An example of the data structure required by the Adaptive DSD algorithm is shown in Fig. 4. The array TESTED-UPx is maintained at each node nx. TESTED-UPx contains N elements, indexed by node identifier, i, as TESTED-UPx[i], for 0 ≤ i ≤ N − 1. Each element of TESTED-UPx contains a node identifier. The entry TESTED-UPx[i] = j indicates that nx has received diagnostic information from a fault-free node specifying that ni has tested nj and found it to be fault-free. Fig. 4 shows the TESTED-UP2 array maintained at n2 for an eight node system with n1, n4, and n5 faulty. Note that "x" represents an entry that is arbitrary.

The Adaptive DSD algorithm executes at each node by first identifying another unique fault-free node and then updating local diagnostic information with information received from that node. Functionally, this is accomplished as follows. List the nodes in sequential order, as (n0, n1, ..., nN−1). Node nx identifies the next sequential fault-free node in the list, sequentially testing consecutive nodes nx+1, nx+2, etc., until a fault-free node is found. Diagnostic information received from the tested fault-free node is assumed to be valid and is utilized to update local information. All addition is modulo N so that the last fault-free node in the ordered list identifies the first fault-free node in the list.

The Adaptive DSD algorithm is given in Fig. 5. Each node nx executes the algorithm at predefined testing intervals. Instructions 1 and 2 identify ny as the first fault-free node after nx in the ordered node list. The test at Step 2.3 evaluates to "fault-free" if ny has remained fault-free since the last test by nx, including the period required for ny to forward TESTED-UPy in Step 2.2. This ensures that the diagnostic information included in TESTED-UPy received at Step 2.2 is accurate. Instructions 3 and 4 update local diagnostic information dependent on both the fault-free test of ny and the diagnostic information received from ny. Instruction 3 asserts TESTED-UPx[x] = y, specifying that nx has tested ny and determined it to be fault-free. In Instruction 4, all other elements of TESTED-UPx are updated to the values of TESTED-UPy. Thus, the diagnostic information contained in the TESTED-UP arrays is forwarded between nodes in the reverse direction of tests. In this example, the information is forwarded from ny to nx. Note that Step 4.1 prevents a
/* ADAPTIVE DSD */
/* The following is executed at each nx, 0 ≤ x ≤ N − 1, */
/* at predefined testing intervals. */
1.   y = x;
2.   repeat {
2.1.     y = (y + 1) mod N;
2.2.     request ny to forward TESTED-UPy to nx;
2.3. } until (nx tests ny as "fault-free");
3.   TESTED-UPx[x] = y;
4.   for i = 0 to N − 1
4.1.     if (i ≠ x)
4.1.1.       TESTED-UPx[i] = TESTED-UPy[i];

Fig. 5. The Adaptive DSD algorithm.

Fig. 6. Example system and test set.
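One testing pass of the algorithm can be sketched as a runnable example. The sketch assumes a global view of which nodes are faulty; the `faulty` set and the dictionary of TESTED-UP arrays are illustrative stand-ins for real inter-node tests and message exchange, and the fault situation is the eight-node example used in this section:

```python
# Sketch of one Adaptive DSD pass (Fig. 5) at a single fault-free node.

N = 8
faulty = {1, 4, 5}                       # the example fault situation
tested_up = {x: [None] * N for x in range(N) if x not in faulty}

def adaptive_dsd_pass(x):
    """Executed at fault-free node x at each testing interval."""
    y = x
    while True:
        y = (y + 1) % N                  # 2.1: next node in the ordering
        forwarded = tested_up.get(y)     # 2.2: request TESTED-UPy
        if y not in faulty:              # 2.3: test y; stop at fault-free
            break
    tested_up[x][x] = y                  # 3: record own fault-free test
    for i in range(N):                   # 4: adopt y's entries, except own
        if i != x:
            tested_up[x][i] = forwarded[i]
    return y

for x in sorted(tested_up):
    adaptive_dsd_pass(x)
print(tested_up[0][0])   # n0 skips faulty n1 and tests n2: prints 2
```

As in the text, n0 ends up testing n2, and n3 must step past faulty n4 and n5 before finding n6.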
node from replacing diagnostic information that it determines
through normal testing procedures with information that it
receives from other fault-free nodes.
Since nx continues testing nodes in Step 2 until a fault-free
node is found, the test set is dependent on the fault situation.
The test set of an example system of eight nodes is shown
in Fig. 6. In the example, n1, n4, and n5 are faulty; all other
nodes are fault-free. The Adaptive DSD algorithm specifies
that a node sequentially tests consecutive nodes until a fault-free node is identified. For example, n0 tests n1, finds it to be
faulty and continues testing. Subsequently, n0 tests node n2,
finds it to be fault-free and stops testing. Node n2 finds n3 to
be fault-free and stops testing immediately. Node n3 must test
three nodes before it tests a fault-free node. The TESTED-UP2
array maintained at n2 for this example is shown in Fig. 4.
Diagnosis is accomplished at any node nx by following the
fault-free paths from nx to other fault-free nodes. The Diagnose algorithm to be executed by a node nx is given in Fig. 7.
The algorithm uses the information stored in TESTED-UPx to
diagnose the system. Its results are stored in an array, STATEx,
where STATEx[i] represents the diagnosed state of node ni.
For correct diagnosis, STATEx[i] must equal the actual fault
state of node ni for all i. It is proven in Section III-B that,
after execution of Adaptive DSD, entries of TESTED-UPx
corresponding to fault-free nodes are the same at each fault-free node. The Diagnose algorithm utilizes these fault-free
entries of TESTED-UPx and operates as follows (refer to
Fig. 7). Initially, all nodes are identified as faulty in Step
1. In Step 2, node_pointer is set to x, the identifier of the
node executing Diagnose. Step 3 of the algorithm traverses
the forward fault-free paths in the test set, labeling each
of the nodes visited as fault-free. This is accomplished by
setting STATEx[node_pointer] to fault-free and then setting
node_pointer to TESTED-UPx[node_pointer], which identifies the next sequential fault-free node in the system. Step 3
is repeated until node_pointer is set to every fault-free node
and returns to x.
The correctness proof of the Adaptive DSD algorithm is
given in Section 111-B. The key steps of the algorithm proof
show that the tests performed by the fault-free nodes form
a directed cycle among all the fault-free nodes, after the
Adaptive DSD algorithm is executed at least once at every
/* DIAGNOSE */
/* The following is executed at each nx, 0 ≤ x ≤ N − 1, */
/* when nx desires diagnosis of the system. */
1.   for i = 0 to N − 1
1.1.     STATEx[i] = faulty;
2.   node_pointer = x;
3.   repeat {
3.1.     STATEx[node_pointer] = fault-free;
3.2.     node_pointer = TESTED-UPx[node_pointer];
3.3. } until (node_pointer == x);

Fig. 7. The Diagnose algorithm.
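The cycle traversal can be sketched as follows. The TESTED-UP array below is a hypothetical converged array for the eight-node example with n1, n4, and n5 faulty (entries for faulty nodes, which are arbitrary, are shown as None); the names are ours:

```python
# Sketch of the Diagnose procedure of Fig. 7: starting from the
# executing node, follow TESTED-UP entries around the fault-free cycle.

N = 8
tested_up = [2, None, 3, 6, None, None, 7, 0]   # x -> node x tests fault-free

def diagnose(x):
    state = ["faulty"] * N               # 1: initially mark all nodes faulty
    p = x                                # 2: begin at the executing node
    while True:                          # 3: walk the fault-free cycle
        state[p] = "fault-free"          # 3.1
        p = tested_up[p]                 # 3.2: next fault-free node
        if p == x:                       # 3.3: cycle closed
            break
    return state

print(diagnose(0))   # n1, n4, n5 remain marked faulty
```

Any node left unvisited by the walk keeps its initial "faulty" label, which is exactly how the faulty nodes are diagnosed.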
node. It is then shown that the entries of the TESTED-UP
array corresponding to the tests performed by fault-free nodes
are consistent at all fault-free nodes after further execution of
the Adaptive DSD algorithm. The Diagnose algorithm utilizes
only the fault-free entries of TESTED-UP, for diagnosis and
thus operates correctly at all fault-free nodes.
B. Correctness Proof
The correctness proof of the Adaptive DSD algorithm utilizes algorithm “testing rounds.” A testing round of Adaptive
DSD is defined as the period of time such that Adaptive DSD
executes at least once on every fault-free node in the system.
Correctness of the Adaptive DSD algorithm is proven in three
steps. In the first step, it is proven that, after a single testing
round, there exists a directed path from any fault-free node to
any other fault-free node in the testing graph T(S). Second,
it is proven that fault-free entries of the TESTED-UP arrays
are the same at each fault-free node after a fixed number of
testing rounds following a fault event. Finally, it is proven that
the Diagnose algorithm will correctly diagnose the state of all
nodes in the system.
Theorem 1: Given diagnosis system S = (V(S), E(S), T(S)), fault situation F(S), and one testing round of Adaptive DSD, T(S) will contain a directed path from any fault-free node in V(S) to any other fault-free node in V(S).
Proof (by contradiction): Choose two fault-free nodes,
nx and nz, such that there does not exist a directed path from
nx to nz and (z − x) is minimized for all such nx and nz.
Identify the largest y, y < z, such that ny is fault-free and
there exists a path from nx to ny. Refer to Fig. 8. By definition
of Adaptive DSD, after a single testing round, ny must have
tested one fault-free node, na. Since the largest y was chosen
Fig. 8. Testing paths in T(S).

such that y < z, then a must be greater than z. By definition of the algorithm, ny must have tested nz before testing na and must have found it to be faulty. This results in a contradiction since nz is selected as fault-free.

An interesting result of Theorem 1 is that T(S) contains a directed cycle of the fault-free nodes.

Corollary 1.1: Given diagnosis system S = (V(S), E(S), T(S)), fault situation F(S), and one testing round of Adaptive DSD, then T(S) will contain a directed cycle, consisting of every fault-free node of V(S).

Proof: By Theorem 1, there exists a directed path between any two nodes in T(S). In addition, by algorithm specification, there is only a single arc from each fault-free node to one other fault-free node in T(S). A directed cycle is the only graph structure that satisfies both of these conditions.

Theorem 2: Given diagnosis system S = (V(S), E(S), T(S)), fault situation F(S), N testing rounds of execution of Adaptive DSD, and fault-free nodes nx and ny such that nx tests ny. Then, for all fault-free ni, TESTED-UPi[x] = y.

Proof: Choose an arbitrary fault-free node nx. See Fig. 8. By algorithm specification, TESTED-UPx[x] = y after one testing round. After two testing rounds, node nw will have tested nx and received its TESTED-UP array. Thus, after two testing rounds, TESTED-UPw[x] = TESTED-UPx[x] = y. This step iterates as each fault-free node in the directed cycle identified in Corollary 1.1 receives TESTED-UPx[x]. Note that information flows backwards through the cycle. The longest path around the cycle contains N nodes. Thus, after no more than N testing rounds all fault-free nodes ni receive TESTED-UPx[x], and for all ni, TESTED-UPi[x] = y.

Thus, it is shown using Theorem 2 that the TESTED-UPx[i] entries corresponding to fault-free ni are the same at every fault-free node, nx.

Theorem 3: Given diagnosis system S = (V(S), E(S), T(S)), fault situation F(S), and N testing rounds of Adaptive DSD. Then, the Diagnose algorithm, executed at any fault-free node, will correctly determine F(S).

Proof: Choose a fault-free node nx to execute the Diagnose algorithm. Initially, the state of every node is set to faulty in Step 1. Step 2 identifies nx as the first node to be determined fault-free by setting node_pointer = x. Step 3 identifies n_node_pointer as fault-free and updates node_pointer to another fault-free node, specifically, node_pointer = TESTED-UPx[node_pointer]. By Corollary 1.1, all fault-free nodes are contained in a directed cycle in T(S). By Theorem 2, node_pointer is updated, in Step 3.2, from its current value to the next consecutive fault-free node in the directed cycle of T(S). Since each fault-free node in a directed cycle uniquely identifies one other fault-free node, each fault-free node is identified by node_pointer exactly once in Step 3.

The execution of the Diagnose algorithm at nx requires the fault-free entries of TESTED-UPx. By Theorem 2, these entries are proven to be consistent at all fault-free nodes. Thus, the execution of the Diagnose algorithm will yield the same result when executed at any fault-free node.

The Adaptive DSD algorithm, as presented, is optimal in terms of the total number of tests required, since each node is tested by at most one other node. To reduce other algorithm resource requirements and increase implementation performance, algorithm enhancements are employed.

C. Transient Behavior

As shown in Section III-B, the Adaptive DSD algorithm yields provably correct diagnosis after a "convergence period" following a fault event. However, correct diagnosis is not guaranteed during this period. The problem occurs when faulty nodes are repaired and become fault-free. The newly repaired node requires finite time to identify a single fault-free node. Before a fault-free node is identified, the other fault-free nodes utilize old test result reports received from that node, resulting in incorrect diagnosis. This situation can be identified by a break in the testing cycle, and is aggravated in actual systems where newly repaired nodes require appreciable time to identify a fault-free node.

Fig. 9 illustrates a node repair sequence that exhibits incorrect transient diagnosis. Node n3 is faulty in Fig. 9(a), requiring n2 to test n3 and n4. Node n2 detects that n3 is repaired in Fig. 9(b) and begins testing only n3. However, if n3 has not yet tested n4 then TESTED-UP3 is invalid. This causes a break in the testing cycle. Since the Diagnose algorithm follows fault-free paths in the testing cycle it will determine an incorrect diagnosis of the fault situation. In Fig. 9(c), n3 determines n4 to be fault-free, restoring the testing cycle. Henceforth, the Diagnose algorithm correctly diagnoses the fault situation.

Transient incorrect diagnosis is avoided by requiring additional temporary testing in the Adaptive DSD algorithm. As a natural consequence of the Adaptive DSD algorithm, node n2 will test both nodes n3 and n4 as long as n3 is faulty. For proper diagnosis, n2 must continue to test n4, even after it has determined that n3 has become fault-free. Only once n3 has identified n4 as fault-free can n2 stop testing n4. The additional temporary testing overhead incurred is the testing required for n3 to identify n4 as fault-free and report this information to n2 before n2 stops testing n4. If this procedure is followed, then incorrect transient diagnosis does not occur.

D. Information Updating

Although the Adaptive DSD algorithm is optimal in terms of test count, it requires more than the minimum number of diagnostic messages. This is because a node will generate a test result report for each test result it obtains. For nodes whose state remains the same over several tests, duplicate reports are generated and forwarded. This is wasteful and unnecessary. A time stamping scheme like that presented in [1] is employed to permit nodes to transfer new diagnosis information only
Fig. 9. Possible event sequence for repaired n3.

Fig. 10. Different report forwarding schemes.
during Step 2.2 of Adaptive DSD. The total message count is
minimized using this scheme since each node receives a single
message for every change in TESTED-UP.
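The time-stamping idea can be sketched as a filter over TESTED-UP entries. The entry layout and function name below are our own illustration, not the paper's or [1]'s exact scheme:

```python
# Sketch of timestamp-based information updating: each entry carries a
# timestamp, and a node transfers only the entries strictly newer than
# the copies the requesting node already holds.

def newer_entries(local, remote):
    """local/remote: {node_id: (tester, timestamp)}; return the local
    entries that are newer than the requester's copies."""
    return {i: e for i, e in local.items()
            if i not in remote or e[1] > remote[i][1]}

local  = {0: (2, 10), 3: (6, 7)}
remote = {0: (2, 10), 3: (4, 5)}        # stale entry for node 3
print(newer_entries(local, remote))     # only node 3's entry is sent
```

Unchanged entries are filtered out, so a node receives one message per change in TESTED-UP, as stated above.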
E. Event Driven Information
This enhancement addresses the diagnosis latency of the
algorithm and assumes the information updating enhancement.
When a new diagnostic message arrives at nx, nx stores the
message in TESTED-UPx. At this time, nx can determine
correct diagnosis. The new information is not forwarded until
a request for the information arrives from another node.
However, if nx can identify the node that the message will
be forwarded to, it can forward the message when it arrives.
This scheme is termed Event Driven since information is
forwarded when the event occurs rather than by explicit
requests. The number of test result reports remains the same
as the information updating scheme, but the diagnosis latency
is reduced.
Fig. 11. Asymmetric information forwarding of faulted n1.

Fig. 12. Example multicast operation.

F. Asymmetric Report Forwarding

Asymmetric report forwarding further reduces diagnosis latency by forwarding diagnosis information along redundant communication paths, different from those utilized for testing. Three different information forwarding schemes are illustrated in Fig. 10 for the event of n0 detecting n1 as faulty. Tests are identified by shaded arcs and test result reports are forwarded along solid arcs.

Fig. 10(a) illustrates symmetric forwarding, where test result reports are forwarded only in the reverse direction of tests. This scheme requires the lowest number of test result reports forwarded from each node and has the highest diagnosis latency. The forwarding scheme utilized by the SELF algorithms is illustrated in Fig. 10(b). Each node forwards test result reports to t other nodes. The scheme illustrated in Fig. 10(c) requires a high message count at n0 but has the minimum diagnosis latency of one message delay.

The asymmetric report forwarding scheme utilized in the final implementation of Adaptive DSD is illustrated in Fig. 11. Using this scheme, n0 forwards the test result report to n4 and n7. Nodes n4 and n7 each forward the report to two additional nodes. In this implementation, the reports forwarded along the solid arcs require only two arcs to reach n2 versus six arcs for symmetric forwarding. The structure represented by the solid arcs is a balanced binary tree, with longest path log₂ N. A binary tree is chosen as the forwarding structure since it requires each node to forward only one additional report, yet reduces the path length a report must travel from N to log₂ N.
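The latency gain can be sketched numerically. The hop-count model below is our own simplification (each relay informs two further nodes per message delay), not the paper's exact forwarding assignment:

```python
import math

# Symmetric forwarding walks a report around the cycle, one hop per
# node; a balanced binary fan-out doubles the informed set each delay.

def symmetric_hops(n):
    return n - 1                         # worst case: the whole cycle

def tree_depth(n):
    # informed nodes after d delays: 2^(d+1) - 1, so the first d with
    # 2^(d+1) - 1 >= n bounds the longest path by about log2(n)
    return math.ceil(math.log2(n + 1)) - 1 if n > 1 else 0

print(symmetric_hops(8), tree_depth(8))   # cycle vs. binary-tree hops
```

For the eight-node example, the tree reaches every node in three hops where the symmetric cycle needs seven.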
G. Multicast Information Forwarding

Fig. 10(c) represents the ideal forwarding scheme for minimum diagnosis latency. This forwarding scheme is not practical in real systems since it requires n0 to forward a test
result report to all other fault-free nodes. For common interconnection buses, such as Ethernet, multicasting can be
used to reduce the overhead associated with forwarding a
message to all nodes. Multicasting allows one source to
broadcast a single message, on a shared communication bus,
to many destinations. Fig. 12 illustrates a multicast operation.
Multicasting can be accomplished on an Ethernet using IP
level [11] protocols.
By using a multicast approach, each node still receives
a single message for every fault event, but the number of
messages placed on the communication bus is substantially
reduced. In Ethernet networks, the presence of routers must be
considered. A router permits the interconnection of multiple
Ethernet networks, or subnets. Multicast messages on an
Ethernet are not forwarded through routers. Hence, a special
procedure is required to ensure that multicasted test result
reports are forwarded to all nodes in the network, regardless
of the presence of routers and that the reports are properly
verified. In the first step of the procedure, the original report is
multicasted by the originator of the report. In the second step,
each node that received the multicasted report forwards the
Fig. 13. Example multicast operation with subnets.
report, symmetrically to the next subsequent node. Nodes that
receive the symmetrically forwarded report without a multicast
message must be located on a different subnet. These nodes
re-execute the given multicast procedure to distribute the new
message on their subnet with the minimum possible diagnosis
latency. The presented multicast procedure is illustrated in
Fig. 13.
Using this procedure, a total of N + S test result reports
are forwarded, where N is the total number of nodes in the
system and S is the number of subnets. This follows since
each node forwards one message symmetrically, plus one node
in each subnet multicasts a message. Since two message delays
are required to distribute the diagnostic message per subnet,
the total diagnosis latency is a function of 2S message delays.
IV. DISTRIBUTED DIAGNOSIS ALGORITHM COMPARISON

TABLE I
ALGORITHM DIAGNOSABILITY AND TEST COUNT

(a) Algorithm Diagnosability
  SELF Algorithms (All Information Forwarding Schemes)   t
  Adaptive DSD                                           N - 1

(b) Testing Count Per Testing Round
  SELF Algorithms (All Information Forwarding Schemes)   N(t + 1)
  Adaptive DSD                                           N

TABLE II
ALGORITHM MESSAGE COUNT

Message Count Per Testing Round
                         SELF Algorithms   Adaptive DSD
  All Information        O(N^2)            N^2
  Information Updating                     N Δf
  Event Driven           O(N Δf t^2)       N Δf
  Asymmetric Forwarding                    1.5 N Δf
  Multicast Forwarding                     (N + S) Δf

Table I(a) shows the diagnosability of the algorithms discussed. Algorithm diagnosability is the maximum number of
faulty nodes that are permitted for the algorithm to guarantee
correct diagnosis. The SELF algorithms are t-diagnosable, a
function of the predefined fixed testing topology required. The
testing topology of the Adaptive DSD algorithms varies based
on the fault situation; algorithm diagnosability is N - 1 for
these algorithms.

Table I(b) shows the number of tests required by each
algorithm. The SELF algorithms require N(t + 1) tests since
each node must be tested by t + 1 other nodes. The Adaptive
algorithms require N tests. Since every node of any distributed
diagnosis system must be tested by one of the fault-free nodes,
N is the minimum number of tests possible. Thus, Adaptive
DSD is optimal in terms of the number of tests required.

Table II identifies the number of messages that contain test
result reports. In the SELF algorithms, each message contains
the triple [A, B, C], where A, B, and C are node identifiers.
The Adaptive DSD algorithm requires that each TESTED-UP
array gets forwarded in a testing round. Thus, N messages
of size N are required, recorded as N^2 in Table II. The
message counts of the event driven and information updating
schemes are functions of the number of fault events. Adaptive
DSD with information updating forwards each fault event,
Δf, to each node; thus the total message count is N Δf. The
message count is optimal since each node must receive at
least one message for every fault event. This message count is
the same for Event Driven Adaptive DSD. The asymmetric
forwarding algorithm requires 1.5 N Δf messages since it
forwards diagnosis information along redundant paths. The
multicast algorithm requires (N + S) Δf messages since each
fault event is first multicasted to all nodes and then verified
between subsequent node pairs.

Table III identifies the diagnosis latency of each algorithm.
The diagnosis latency is the time required for all fault-free
nodes in the diagnosis system to reach a correct diagnosis
after a fault event. As proven in Section III-B, Adaptive DSD
requires N report forwarding delays to distribute new test
result reports to every node. Thus, the diagnosis latency is
N(Tr), where Tr represents the time of a testing interval. The
SELF algorithms require N/(t + 1) testing intervals since there
are multiple paths between nodes in the test set, including
paths of length N/(t + 1). The test result reports require less
time to be forwarded to all nodes in the system.

The event driven algorithms have significantly reduced
diagnosis latency. In the nonevent driven algorithms, test result
reports arrive at a node and are not forwarded until the reports
are requested during the next testing interval. In the event
driven schemes, the node receiving the report immediately
validates the message, then forwards it to subsequent nodes.
Thus, the report is forwarded after the time required for a
fault-free test, Ttest, which is significantly less than a testing
cycle in our implementation. The asymmetric adaptive algorithm further reduces diagnosis latency by utilizing redundant
shorter paths, the longest of which contains log₂ N nodes. The
multicast algorithm requires 2S Ttest latency since the original
fault event message is multicasted to all nodes on each subnet
simultaneously.

TABLE III
ALGORITHM DIAGNOSIS LATENCY
  All Information
  Information Updating
  Event Driven
  Asymmetric Forwarding
  Multicast Forwarding

V. IMPLEMENTATION AND EXPERIMENTATION

A. Implementation

Adaptive DSD has been running in the CMU ECE department since November 1990 on various workstations using the
Ultrix operating system, including VAX and DEC 3100 RISC
workstations. The algorithm consists of approximately 3000
lines of C code, written in modular format to make it easily
portable. The network interface for this implementation uses
the Berkeley socket interface [13] and presently supports Ethernet IP/UDP protocols [3], [11]. Appropriate modifications
to the network module will allow the program to run on any
system that has a C compiler.

Adaptive DSD is implemented as a modular, event-driven
program. A configuration file is read by each workstation
at startup that identifies the complete list of workstations
participating in system diagnosis, as well as specifying a
number of tuning parameters. Algorithm tuning parameters
include the maximum number of forward tests in a single
test interval, various timeout values, and flags that enable
or disable algorithm options. An activity scheduler plays a
significant role in the implementation by permitting events
such as workstation tests, packet retransmissions, and other
timeouts to be scheduled for execution at a specified time. As
with EVENT-SELF, the workstation test is implemented as
a separate program that is spawned as a subprocess to test
several of the hardware facilities of the workstation.

Workstations participating in system diagnosis are initially
sorted by Internet host address. Since this number is unique
to each workstation, all workstations generate identical sorted
lists. Testing occurs in the forward direction of the sorted list;
i.e., each workstation tests those workstations that follow it in
the sorted list, modulo the number of workstations. Information forwarding occurs in the reverse direction, or backwards
in the sorted list. Due to Internet standard subnet routing
[10], workstations with numerically similar host addresses
are located on a single subnet. The sorted arrangement of
workstations tends to minimize the load on routers and bridges
as a result of inter-subnet communication.

B. Experimentation

Experimentation of the Adaptive DSD algorithm on the
CMU ECE network focused on algorithm communication
overhead, in terms of average packet count, and diagnosis
latency, measured in seconds. The following figures graph the
communication overhead as a function of experiment elapsed
time. In addition, important events are marked, including
fault occurrence and diagnosis. The first figure illustrates the
execution of the Adaptive DSD algorithm with symmetric
information forwarding. The subsequent two figures illustrate
the performance of the Adaptive DSD algorithm with asymmetric forwarding. In each experiment, the diagnosis system
consists of 60 nodes and the algorithm executes with a 30 s
test interval. Every node performs its own data collection for
the packet counts shown in the figures, which are collected at
10 s intervals throughout each experiment.

Experiments 1 and 2 demonstrate the difference between
symmetric and asymmetric forwarding. See Figs. 14 and
15. Both experiments involve the failure and subsequent
recovery of a single node. Symmetric forwarding is utilized
in Experiment 1 and asymmetric forwarding is utilized in
Experiment 2. At 60 s during Experiment 1, a single node
in the network fails. The faulted node is detected at 110 s,
after it is tested and a test timeout period occurs. After 110 s
the fault information is forwarded to the remaining fault-free
nodes. Since diagnosis information is validated by testing, the
fault information will reach the farthest node from the failure
only after all nodes between it and the fault are tested and
found to be fault-free. Thus, at time 510, the node farthest
from the fault receives the information indicating the node
failure. This results in an overall diagnosis latency of 450 s.
At 960 s the faulty node is repaired. The newly recovered
node immediately performs forward tests up to the limit of
five, as specified in the configuration file. This causes the
sharp increase in packet count at time 960. At time 970, the
recovered node is detected. This information is propagated
backward through the path of fault-free nodes until it reaches
the fault-free node farthest from the recovered node, at 1430 s.
Correct diagnosis is achieved within 460 s. After 1430 s the
packet counts return to nominal levels.

As shown in Experiment 1, the diagnosis latency of Adaptive DSD with symmetric forwarding is a linear function of
the number of system nodes and can be significant for large
systems. Experiment 2, shown in Fig. 15, illustrates the same
experiment with asymmetric forwarding. The diagnosis latency
is significantly reduced. The diagnosis latency for the failure
is 60 s for asymmetric forwarding versus 400 s for symmetric
forwarding. The same diagnostic information is forwarded,
except that it is forwarded closer in time to the fault event. This
results in a higher peak message count with shorter duration.
The remaining experiments utilize asymmetric forwarding to
provide reduced diagnosis latencies.

Fig. 16 illustrates one advantage of Adaptive DSD over
both of the SELF algorithms: the ability to correctly diagnose
the state of a network under the presence of many faults. In
Experiment 3, 50 of the 60 nodes experience simultaneous
failures at 60 s. The average packet count initially reduces
Fig. 14. Experiment 1 on a 60 node testing network.

Fig. 15. Experiment 2 on a 60 node testing network.
significantly since 50 nodes cease transmitting messages. The
first faulty node is detected at 90 s, and the remaining fault-free
nodes re-establish a cycle in the test set. At this time, complete
diagnostic information is forwarded among these nodes. After
the 360 s diagnosis latency, the packet counts reduce to their
new nominal values. At time 960, one of the 50 failed nodes
returns to the network. The usual recovery detection occurs,
and diagnostic information is exchanged. After 90 s, complete
diagnosis among the fault-free nodes is established.
Fig. 17 compares message counts for Adaptive DSD to those
for the SELF algorithms for a single failure and subsequent
recovery. Due to the high number of diagnostic messages
generated by the NEW-SELF algorithm and the available
network bandwidth, the diagnosis system is limited to twenty
nodes. The algorithms executed in Experiment 4 shown in
Fig. 17 use the same configuration parameters as the first three
experiments: 30 s test interval, packet bundling, asymmetric
forwarding, and a maximum of t = 5 forward tests per test
interval. Adaptive DSD has lower communication overhead
and reduced diagnosis latency. This is verified in Table III.
Observed message counts reflect those calculated in Table II.
VI. CONCLUSION
The Adaptive DSD algorithm has been specified and implemented. Unlike previous distributed system-level diagnosis
algorithms, the testing assignment is adaptive and varies during
algorithm execution. The testing assignment adapts locally at
each node, yet the algorithm is provably globally correct.
Diagnosability of the Adaptive DSD algorithm is optimal
since correct diagnosis is guaranteed for any set of node
Fig. 16. Experiment 3 on a 60 node testing network.
Fig. 17. Experiment 4 on a 20 node testing network.
failures. In addition, the number of tests performed is optimal,
since each node is tested by a single fault-free node. Using
symmetric forwarding, each node receives a single test result
report per fault event, hence the number of test result reports
is optimal. Symmetric forwarding suffers from the longest
possible diagnosis latency, however, since a test result report
is forwarded between every fault-free node before it reaches
the last node. A direct tradeoff is presented that permits
improvements in diagnosis latency while requiring additional
reports to be forwarded. For special interconnection networks,
multicasting can be used to reduce both the diagnosis latency
and the number of test result reports forwarded.
Adaptive DSD has been running on various workstations
of the CMU ECE department since November 1990. Previous
nonadaptive versions have been running at Carnegie Mellon
since April 1989. Since its inception at Carnegie Mellon,
greater reliance has been placed on the DSD system by system
administrators. The current system is used to diagnose faulty
workstations within seconds of failure. In addition, the system
is used to determine the cause of failures during periods
of increased fault activity.
Current research focuses on methods of distributed system-level diagnosis for arbitrary network interconnection topologies, and on other features such as handling link
failures and the dynamic entry and exit of nodes into and out of
the diagnosis system.
REFERENCES

[1] R. P. Bianchini, Jr., K. Goodwin, and D. S. Nydick, "Practical application and implementation of distributed system-level diagnosis theory,"
in Proc. Twentieth Int. Symp. Fault-Tolerant Comput., IEEE, June 1990,
pp. 332-339.
[2] R. P. Bianchini, Jr. and R. Buskens, "An adaptive distributed system-level diagnosis algorithm and its implementation," in Proc. Twenty-First
Int. Symp. Fault-Tolerant Comput., IEEE, June 1991.
[3] The Ethernet: A Local Area Network, Data Link Layer and Physical
Layer Specification, 2.0 edition, Digital Equipment Corp., Intel Corp.,
Xerox Corp., 1982.
[4] S. L. Hakimi and A. T. Amin, "Characterization of connection assignment of diagnosable systems," IEEE Trans. Comput., vol. C-23, Jan.
1974.
[5] S. L. Hakimi and E. F. Schmeichel, "An adaptive algorithm for system
level diagnosis," J. Algorithms, vol. 5, June 1984.
[6] S. H. Hosseini, J. G. Kuhl, and S. M. Reddy, "A diagnosis algorithm for
distributed computing systems with dynamic failure and repair," IEEE
Trans. Comput., vol. C-33, pp. 223-233, Mar. 1984.
[7] E. Kreutzer and S. L. Hakimi, "System-level fault diagnosis: A survey,"
Euromicro J., vol. 20, no. 4-5, pp. 323-330, May 1987.
[8] J. G. Kuhl and S. M. Reddy, "Distributed fault-tolerance for large
multiprocessor systems," in Proc. 7th Annu. Symp. Comput. Architecture,
IEEE, May 1980, pp. 23-30.
[9] ——, "Fault-diagnosis in fully distributed systems," in Proc. 11th Int.
Conf. Fault-Tolerant Comput., IEEE, June 1981, pp. 100-105.
[10] J. C. Mogul and J. B. Postel, "Internet standard subnetting procedure,"
RFC 950, Aug. 1985.
[11] J. B. Postel, "Internet protocol," RFC 791, Sept. 1981.
[12] F. P. Preparata, G. Metze, and R. T. Chien, "On the connection assignment problem of diagnosable systems," IEEE Trans. Electron. Comput.,
vol. EC-16, pp. 848-854, Dec. 1967.
[13] UNIX Programmer's Manual: Socket, The University of California at
Berkeley, 1986.
[14] C.-L. Yang and G. M. Masson, "Hybrid fault diagnosability with unreliable communication links," in Proc. Fault-Tolerant Comput. Syst.,
IEEE, July 1986, pp. 226-231.
[15] S. L. Hakimi and K. Nakajima, "On adaptive system diagnosis," IEEE
Trans. Comput., vol. C-33, pp. 234-240, Mar. 1984.
[16] F. J. Meyer and D. K. Pradhan, "Dynamic testing strategy for distributed
systems," IEEE Trans. Comput., vol. C-38, pp. 356-365, Mar. 1989.
Ronald P. Bianchini, Jr. (S'80-M'88) was born
in Brooklyn, NY, on April 29, 1962. He received
the B.S. degree in electrical engineering from the
Massachusetts Institute of Technology in 1983 and
the M.S. and Ph.D. degrees in electrical and computer engineering from Carnegie Mellon University
in 1985 and 1989, respectively.
He aided the New York University Ultracomputer
project during the summer of 1983 in the area of
wireability. He consulted for AT&T Bell Laboratories during the summers of 1990 and 1991, in
the application of a fault diagnosis system to AT&T research networks.
Currently, he is an Assistant Professor in the Department of Electrical
and Computer Engineering at Carnegie Mellon University, Pittsburgh, PA.
He directs research groups in the study of fault diagnosis in distributed
systems and the design of telecommunication switching architectures. His research interests include system-level diagnosis, distributed computer systems,
telecommunication switching, and computer architecture.
Dr. Bianchini is a member of the IEEE Computer Society and the Association
for Computing Machinery, and was nominated to the Eta Kappa Nu Honor
Society in 1983.
Richard W. Buskens (S’85) received the B.S. degree in computer engineering and the M.S. degree in
computer science from the University of Manitoba,
Manitoba, Canada.
He is currently a Ph.D. student in the Department
of Electrical and Computer Engineering at Carnegie
Mellon University, Pittsburgh, PA. His research
interests include applied graph theory, computer
networks, and parallel and distributed algorithms.