A Learning Classifier System for
Distributed Max-Flow Algorithm
Fault Detection
David Andrew Cape
November 30, 2005
Abstract
A distributed version of the Goldberg-Tarjan Max-Flow algorithm can be used in
power network simulation. Software faults can be intentionally injected into the
algorithm, and assertions have been developed to detect them. This raises the question
of whether the injected faults can be detected and classified by examining the
message-passing history alone, without accessing the internal data of the program. It
leads to the general question of how much can be learned from the communication
between processors alone.
This research describes a learning classifier system (LCS), combining aspects of
XCS and S-classifiers, which attempts to diagnose the condition of the program by
examining the message-passing history from each processor’s perspective. The history of
messages is piggy-backed on each message, so each processor may hold a slightly
different history. To avoid processing the partial-order data structure of events, the
histories are processed linearly, using the history file of each processor, one processor
at a time, in the hope that faults can still be classified with a reasonable degree of
accuracy.
An LCS is appropriate for this effort because it would be difficult to define the
conditions which correspond to faults deterministically. A statement of results will
appear in the final draft.
Introduction
A distributed version of the Goldberg-Tarjan Max-Flow algorithm can be used in
power network simulation. Power network simulation is important because it can help in
the design and maintenance of the electric power grid. Research is under way at
the University of Missouri – Rolla and elsewhere to develop techniques for
protecting the power grid from terrorist attacks and natural occurrences which could
cause line outages and lead to cascading failures (blackouts). The Max-Flow algorithm
can be used in these efforts.
Software faults can be intentionally injected into the algorithm, and assertions
have been developed to detect them. This raises the question of whether the injected
faults can be detected and classified by examining the message-passing history alone,
without accessing the internal data of the program. It leads to the general question of
how much can be learned from the communication between processors alone. Please see
[AGMC] for details about the distributed Max-Flow algorithm and assertion-checking.
The relevant message types are as follows:
PFm: attempting to push flow to a neighboring vertex
AFm: accepting the requested flow
RFm: rejecting the requested flow
Distm: updating a node’s distance
Fm: a fault message indicating the detection of a fault by assertion violation
The injected faults take the following forms:
Edge fault: a given edge’s flow is arbitrarily increased by 10%
Vertex fault: a given vertex’s calculated excess flow is doubled
Lose all flow messages: cease transmitting PFms
Randomly lose flow messages: lose PFms with probability 0.1%
Alter all flow messages: modify each PFm by 1 unit of flow
Randomly alter all flow messages: modify each PFm by 1 unit of flow with probability 0.1%
Invert all accept/reject messages: change AFms to RFms and vice versa
Randomly invert accept/reject messages: change as above, with probability 0.1%
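The message-level faults listed above can be illustrated with a minimal Python sketch. It is not the fault injector used in [AGMC]; the names MsgType, Message, and the three functions are hypothetical, the direction of the 1-unit alteration (here +1) is an assumption, and a message is reduced to a type and a flow amount. Passing prob=1.0 gives the "all" variants and prob=0.001 gives the random variants.

```python
import random
from dataclasses import dataclass, replace
from enum import Enum


class MsgType(Enum):
    PF = "PFm"      # push-flow request
    AF = "AFm"      # accept flow
    RF = "RFm"      # reject flow
    DIST = "Distm"  # distance update
    FAULT = "Fm"    # fault detected by an assertion


@dataclass(frozen=True)
class Message:
    kind: MsgType
    flow: int = 0  # assumed payload; only meaningful for flow messages


def lose_flow_messages(messages, prob=1.0, rng=random):
    """Drop each PFm with probability `prob`."""
    return [m for m in messages
            if not (m.kind is MsgType.PF and rng.random() < prob)]


def alter_flow_messages(messages, prob=1.0, delta=1, rng=random):
    """Change the flow of each PFm by `delta` units with probability `prob`."""
    return [replace(m, flow=m.flow + delta)
            if m.kind is MsgType.PF and rng.random() < prob else m
            for m in messages]


def invert_accept_reject(messages, prob=1.0, rng=random):
    """Swap AFm and RFm with probability `prob`."""
    swap = {MsgType.AF: MsgType.RF, MsgType.RF: MsgType.AF}
    return [replace(m, kind=swap[m.kind])
            if m.kind in swap and rng.random() < prob else m
            for m in messages]
```

The edge and vertex faults act on the internal state of the algorithm rather than on messages, so they are left out of this sketch.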
This research describes a learning classifier system (LCS), combining aspects of
XCS and S-classifiers [ES], which attempts to diagnose the condition of the program by
examining the message-passing history from each processor’s perspective. The history of
messages is piggy-backed on each message, so each processor may hold a slightly
different history. To avoid processing the partial-order data structure of events, the
histories are processed linearly, using the history file of each processor, one processor
at a time, in the hope that faults can still be classified with a reasonable degree of
accuracy.
An LCS is appropriate for this effort because it would be difficult to define the
conditions which correspond to faults deterministically. A statement of results will
appear in the final draft.
Design
The Michigan model for a learning classifier system will be used (one individual
is one rule). A rule is a tuple of the form <c, a, p, e, F>, where the letters stand for
condition, action, predicted payoff, accuracy, and fitness. The rule evaluation cycle will
be done according to the XCS framework described in [ES]. The only action for an
individual rule is the diagnosis of TRUE (fault detected). There will be a default rule
whose action is the diagnosis of FALSE (no fault detected) which applies only when no
other rule has a condition which matches the state of the environment.
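As a rough illustration of the rule tuple and the default rule, the following Python sketch may help. It is not the planned implementation; the names Rule and diagnose are hypothetical, a condition is modelled as a plain predicate over the message window, and the initial values of p, e, and F are placeholders.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Window = Sequence[str]  # message types currently visible to the rules


@dataclass
class Rule:
    condition: Callable[[Window], bool]  # c
    action: bool = True                  # a: always the diagnosis TRUE
    payoff: float = 0.0                  # p: predicted payoff
    accuracy: float = 0.0                # e
    fitness: float = 0.0                 # F


def diagnose(rules: List[Rule], window: Window) -> Tuple[bool, List[Rule]]:
    """Return the diagnosis and the set of rules whose conditions match."""
    match_set = [r for r in rules if r.condition(window)]
    if not match_set:
        # Default rule: diagnose FALSE only when no other rule matches.
        return False, match_set
    return True, match_set
```

Because every non-default rule has the same action, the match set doubles as the action set in this sketch.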
The environment is the recent message-passing history (a sliding window of
length N) from the perspective of one of the processors used in the distributed max-flow
algorithm. A rule may diagnose TRUE at any time during the processing of the history
file, whereas FALSE is diagnosed only at the end. A reward will be allocated once the
entire file has been processed.
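The sliding-window processing of one history file might look like the Python sketch below. It rests on several assumptions not fixed by the design: the history is a flat sequence of message-type strings, only full windows of length N are evaluated, processing stops at the first TRUE diagnosis, and the reward is 1.0 for a correct diagnosis and 0.0 otherwise. The names process_history and reward are hypothetical.

```python
from collections import deque
from typing import Callable, Iterable, Tuple


def process_history(history: Iterable[str],
                    matches: Callable[[Tuple[str, ...]], bool],
                    n: int) -> bool:
    """Slide a window of length n over the history; True means fault diagnosed."""
    window = deque(maxlen=n)  # the oldest message drops out automatically
    for msg in history:
        window.append(msg)
        if len(window) == n and matches(tuple(window)):
            return True   # TRUE may be diagnosed at any point in the file
    return False          # FALSE is diagnosed only at the end of the file


def reward(diagnosis: bool, fault_injected: bool, payoff: float = 1.0) -> float:
    """Reward allocated after the whole file has been processed."""
    return payoff if diagnosis == fault_injected else 0.0
```

Here matches stands in for "some rule's condition matches the window"; in a full system it would consult the rule population.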
The rule discovery cycle will follow the rule evaluation cycle and will be done
along the lines of GP, also described in [ES], because the conditions will be s-expressions
represented by parse trees. The terminals will be primitive statements such as x_i = m_j,
where x_i is a variable representing the ith message in the window, and m_j is a constant
message type. The operators in the parse tree will be AND and NOT, because OR can be
derived from the other two (a OR b is equivalent to NOT (NOT a AND NOT b)). The exact
method of reproduction and competition has not yet been decided, but more recombination
will be used than mutation.
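A condition parse tree of this kind can be sketched in Python as follows. The class names Eq, Not, and And, the helper or_, and the 0-based window index are assumptions made for illustration; the sketch covers only evaluation, not recombination or mutation.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass(frozen=True)
class Eq:
    """Terminal x_i = m_j: the message at position `index` has type `msg_type`."""
    index: int      # 0-based position in the window (assumed convention)
    msg_type: str


@dataclass(frozen=True)
class Not:
    child: "Expr"


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


Expr = Union[Eq, Not, And]


def or_(a: Expr, b: Expr) -> Expr:
    """OR built from AND and NOT via De Morgan's law."""
    return Not(And(Not(a), Not(b)))


def evaluate(expr: Expr, window: Sequence[str]) -> bool:
    """Recursively evaluate a condition tree against the message window."""
    if isinstance(expr, Eq):
        return window[expr.index] == expr.msg_type
    if isinstance(expr, Not):
        return not evaluate(expr.child, window)
    if isinstance(expr, And):
        return evaluate(expr.left, window) and evaluate(expr.right, window)
    raise TypeError(f"unknown node: {expr!r}")
```

For example, or_(Eq(0, "RFm"), And(Eq(1, "PFm"), Not(Eq(2, "AFm")))) matches a window that starts with a rejection, or whose second message is a push that is not followed by an accept.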
Experimental Setup
The initial goal is to see some improvement during the training phase. A testing
phase may be designed later to provide a framework for validating the work and
evaluating different parameter sets, but in this initial stage any learning evidenced by the
development of a rule set with increasing accuracy of fault diagnosis will be
acceptable. Later, one might want to compare the accuracy of diagnosing different types
of faults. There is no formal experimental setup at this time, though.
Implementation
In progress – trying to have a basic Rule Evaluation Cycle implemented by
Saturday, December 3, 2005, and Rule Discovery Cycle implemented by Monday,
December 5, 2005. Data processing (training) could occur from Tuesday to Thursday.
Results and Interpretation
None yet – trying to have some results by Thursday, December 8, 2005, and a
final draft with interpretation and conclusion by Friday, December 9, 2005.
Related Work
Austin Armbruster et al. [AGMC] have developed a framework for testing the
error-detection capabilities of certain assertions which examine the program state and
have obtained very good results. It is unlikely that this effort to use an LCS will match
that quality, because it examines only the message history and not the internal state of the
system. In some sense, this work could be seen as complementary to the
assertion-checking work, not competing with it.
Conclusion
The conclusion awaits the results and their interpretation. The author is currently
satisfied with the design and has been encouraged to proceed to the implementation
phase, so good results are hoped for.
Bibliography
[AGMC] A. Armbruster, M. Gosnell, B. McMillin, M. Crow. “Power Transmission Control
Using Distributed Max-Flow”,
https://svn.umr.edu/research/csfil/trunk/papers/COMPSAC05_AA/Main.pdf
[ES] A.E. Eiben, J.E. Smith. Introduction to Evolutionary Computing. Springer-Verlag,
Berlin Heidelberg, 2003.