Multi-Objective Learning via Genetic Algorithms

advertisement
MULTI-OBJECTIVE LEARNING VIA GENETIC ALGORITHMS
J. David Schaffer
Department of E l e c t r i c a l Engineering
John J, Grefenstette
Department of Computer Science
Vanderbilt University
N a s h v i l l e , TN 37235
ABSTRACT
Genetic algorithms (GAs) are powerful, general
purpose adaptive search techniques which have been
used successfully in a v a r i e t y of l e a r n i n g systems.
In the standard f o r m u l a t i o n , GAs maintain a set of
a l t e r n a t i v e knowledge structures for the task to be
l e a r n e d , and improved knowledge s t r u c t u r e s are
formed through a combination of competition and
knowledge sharing among the a l t e r n a t i v e knowledge
s t r u c t u r e s . In t h i s paper, we extend the GA paradigm by allowing multidimensional feedback concerning the performance of the a l t e r n a t i v e s t r u c t u r e s .
The modified GA is shown to solve a m u l t i c l a s s p a t t e r n d i s c r i m i n a t i o n task which could not be solved
by the unmodified GA.
1 - Introduction
Pattern c l a s s i f i c a t i o n is a c e n t r a l task in
f l e x i b l e systems which incorporate an arsenal of
problem solving techniques. For example, a given
problem instance may need to be c l a s s i f i e d in order
to decide which problem solving method should be
applied.
This paper concerns the task of m u l t i class p a t t e r n d i s c r i m i n a t i o n in a l e a r n i n g system.
As in M i t c h e l l [ 8 ] , we view l e a r n i n g as a search
process. But rather than searching a space of conc e p t s , we consider a learning system which searches
a space of production system programs for programs
which adequately accomplish the desired p a t t e r n
classification.
The search is accomplished by
means of a genetic algorithm (GA). GAs are powerf u l adaptive search techniques which have been used
s u c c e s s f u l l y in a v a r i e t y of l e a r n i n g systems
[ 3 , 4 , 5 , 1 2 ] . In t h i s paper, GAs are extended in
order to perform m u l t i - o b j e c t i v e learning in a p a t t e r n c l a s s i f i c a t i o n domain.
2• Learning v i a Genetic Search.
This section contains a b r i e f d e s c r i p t i o n of
GAs.
More d e t a i l e d d e s c r i p t i o n s are a v a i l a b l e in
the l i t e r a t u r e [ 3 , 6 , 7 , 1 2 ] .
B r i e f l y , GAs may be
viewed as adaptive generate-and-test procedures.
GAs are adaptive in the sense t h a t the candidate
s o l u t i o n s generated r e f l e c t and e x p l o i t i n f o r m a t i o n
obtained by e a r l i e r t e s t s . A GA maintains a popul a t i o n o f knowledge s t r u c t u r e s ( e . g . a l t e r n a t i v e
sets of production rules for a given task) and
repeatedly (1)
selects s t r u c t u r e s on the basis of
observed performance, and (2) applies i d e a l i z e d
genetic operators to the selected s t r u c t u r e s to
construct new s t r u c t u r e s . For example, one import a n t genetic operator is crossover, by which subsets of r u l e s may be exchanged between two a l t e r n a t i v e knowledge s t r u c t u r e s .
This r e s u l t s in a
s o p h i s t i c a t e d search in which subsets of r u l e s
which c o n t r i b u t e to good performance are propagated
through the p o p u l a t i o n .
Other genetic operators
are described in Smith [ 1 2 ] . In t h i s paper, we
concentrate on the influence of the performance
feedback on step (1) above, the s e l e c t i o n of
knowledge s t r u c t u r e s for r e p r o d u c t i o n .
The c h a r a c t e r i z a t i o n of the power and l i m i t a t i o n s of GAs is an a c t i v e research area, but p r e l iminary t h e o r e t i c a l r e s u l t s are a v a i l a b l e .
For
example, the number of s t r u c t u r e s in the population
which contain a given subset of r u l e s can be
expected to increase or decrease over time at a
r a t e p r o p o r t i o n a l to the average observed p e r f o r mance of a l l knowledge s t r u c t u r e s which c o n t a i n
t h a t set o f r u l e s [ 1 1 ] . Thus, a l l subsets o f r u l e s
appearing in the population of knowledge s t r u c t u r e s
are explored simultaneously in a
near-optimal
f a s h i o n , a phenomenon which is c a l l e d i m p l i c i t
p a r a l l e l i s m by H o l l a n d [ 7 ] .
Bethke[2] describes
some p r o p e r t i e s of search spaces which may be espec i a l l y hard for GAs.
Smith[12] implemented a
GA-based
machine
l e a r n i n g system c a l l e d L S - 1 . In L S - 1 , each s t r u c ture maintained by the GA represents a production
system (PS) program. Each PS program is evaluated
on the l e a r n i n g task by a c r i t i c which assigns a
numerical measure of f i t n e s s to the evaluated p r o gram. When a l l of the PS programs in the c u r r e n t
population have been evaluated, the GA is invoked
to construct a new population of PS programs, and
the cycle is repeated. (See Figure 1.) LS-1 succ e s s f u l l y learned PS programs f o r maze tasks and
f o r draw poker. Our work extends the LS-1 system
i n order t o achieve m u l t i - o b j e c t i v e l e a r n i n g .
3• The Task Domain
Multi-class
pattern
discrimination
was
selected as a r e p r e s e n t a t i v e m u l t i - o b j e c t i v e l e a r n ing t a s k . The s p e c i f i c task under i n v e s t i g a t i o n
was to c l a s s i f y muscle a c t i v i t y p a t t e r n s for f i v e
human g a i t c l a s s e s , representing one normal and
four abnormal g a i t t y p e s .
A t r a i n i n g set of 11
t e s t cases was obtained from the l i t e r a t u r e [ 1 ] ,
eaoh t r a i n i n g case c o n s i s t i n g of a 1 2 - b l t s t r i n g
derived from EMG s i g n a l s from leg muscles w h i l e
walking.
As In L S - 1 , we used a GA to search a
594
J. Schaffer and J. Grefenstette
f o r each class represented in the t r a i n i n g s e t . We
then modified the GA
so
that
complementary
knowledge s t r u c t u r e s would share knowledge (through
the genetic
operators)
rather
than
compete
directly.
space o f knowledge s t r u c t u r e s ( i . e . , r u l e s e t s ) .
The goal of the l e a r n i n g system was to f i n d a
knowledge s t r u c t u r e which c o r r e c t l y c l a s s i f i e s the
11 t r a i n i n g cases. The performance of the l e a r n i n g
system was measured by the number of knowledge
s t r u c t u r e s tested before o b t a i n i n g a s o l u t i o n . By
choosing subsets of the t r a i n i n g cases, we derived
2 - c l a s s , 3 - c l a s s , 4 - c l a s s , and 5 - c l a s s d i s c r i m i n a t i o n problems. Each experiment was s t a r t e d w i t h an
initial
population of randomly generated knowledge
s t r u c t u r e s , in order to t e s t the power of the GA
w i t h no i n i t i a l knowledge.
(Our implementation
system[9] does a l l o w the I n c o r p o r a t i o n of h e u r i s t i c
knowledge i n t o the i n i t i a l population of knowledge
structures.)
4. The Need for Multidimensional
Feedbaok
A s e r i e s of experiments was performed in which
a GA using a scalar c r i t i c was applied to m u l t i class d i s c r i m i n a t i o n problems.
Although the GA
could solve 2 - c l a s s problems, i t could not solve
the f u l l f i v e - c l a s s problem. An a n a l y s i s of the
i n d i v i d u a l PS programs generated by the GA at v a r i ous times during search revealed a common p a t t e r n .
Knowledge of how to recognize a p a r t i c u l a r class
was f r e q u e n t l y absent from l a t e r PS programs, even
when such knowledge was present in e a r l i e r p r o grams. The problem is t h a t s t r u c t u r e s which cont a i n complementary knowledge are forced to compete
by a GA using a scalar c r i t i c . Consider t h i s simple example: Suppose t h a t the c r i t i c measures the
f i t n e s s of each s t r u c t u r e by counting the number of
t r a i n i n g cases which were c o r r e c t l y c l a s s i f i e d .
Suppose program P1 contains r u l e s which c o r r e c t l y
c l a s s i f y classes A and B, and program P2 c o r r e c t l y
c l a s s i f i e s only instances i n class C .
If a l l
classes are e q u a l l y represented in the t r a i n i n g
s e t , then program P1 appears to be twice as " f i t "
as program P2. Since the GA s e l e c t s programs f o r
reproduction on the basis of the f i t n e s s assigned
by the c r i t i c , P1 w i l l tend to c o n t r i b u t e r u l e subsets to twioe as many new programs in the next
population as w i l l P2. As the number of classes
i n c r e a s e s , s p e c i a l i z e d knowledge ( l i k e t h a t in P2)
may tend t o s u f f e r e x t i n c t i o n , f i n a l l y r e s u l t i n g i n
suboptimal performance f o r the l e a r n i n g system as a
whole.
Our s o l u t i o n was to modify the o r i t i c so t h a t
a vector of performance measures was computed f o r
eaoh s t r u o t u r e , w i t h one s l o t i n the f i t n e s s vector
An i n t e r e s t i n g question emerges at t h i s p o i n t
which was never an issue w i t h scalar c r i t i c s . Where
should any punishment for i n c o r r e c t behavior be
applied?
Consider a knowledge s t r u c t u r e which
i n c o r r e c t l y c l a s s i f i e s a class A case as class B.
By applying the penalty to the A s l o t of the reward
v e c t o r , we are punishing the f a i l u r e to do the
r i g h t t h i n g . By applying it to the B s l o t , we are
punishing doing the wrong t h i n g . The former s t r a tegy was adopted for a l l subsequent experiments,
arguing t h a t class X t r a i n i n g cases should c o n t r i b u t e , p o s i t i v e l y or n e g a t i v e l y , only to the X s l o t
of the reward v e c t o r .
The m o d i f i c a t i o n to the basic GA is as f o l lows: instead of s e l e c t i n g s t r u c t u r e s f o r reproduct i o n on the basis of a s i n g l e f i t n e s s measure, a
p o r t i o n of each new population is selected on the
basis of each s l o t in the f i t n e s s v e c t o r .
Note
t h a t if a s t r u c t u r e scores w e l l on several measu r e s , then it w i l l tend to be chosen in several
s e l e c t i o n phases.
However, s t r u c t u r e s which perform w e l l on even one measure w i l l be given the
opportunity
to
pass
along t h e i r s p e c i a l i z e d
knowledge. A f t e r the s e l e c t i o n phases, s t r u c t u r e s
are combined v i a genetic operator j u s t as in L S - 1 .
As a r e s u l t , our modified GA performs m u l t i o b j e c t i v e o p t i m i z a t i o n in the space of knowledge
structures [ 1 0 ] .
5. The C r i t i c
One important property of the c r i t i c in a GAbased system is t h a t the f i t n e s s reported by the
c r i t i c must r e f l e c t more than j u s t success or
f a i l u r e on the t a s k . Otherwise, the GA is unable
to i d e n t i f y promising programs in the e a r l y stages
when successes are r a r e , and cannot d i s c r i m i n a t e
the b e t t e r programs in the l a t e r stages when they
are p l e n t i f u l .
One source of information of t h i s
type is the amount of u n c e r t a i n t y e x h i b i t e d by a PS
program while attempting the t a s k , where uncert a i n t y is defined as the extent to which c o n f l i c t
r e s o l u t i o n is required in the decision process. A
good c r i t i c should not discourage t h i s u n c e r t a i n t y
in the e a r l y stages but should discourage it in the
l a t e r stages.
Some experiments w i t h d i f f e r e n t c r i t i c s of t h i s s o r t revealed something of the power
of the GA to e x p l o i t subtle features in the c r i t i c .
For example, a c r i t i c which applied a d e s i r a b l e
u n c e r t a i n t y c o r r e c t i o n , but only rewarded success,
was found to y i e l d populations r i o h in programs we
c a l l e d s p e c i a l i s t s . A s p e c i a l i s t was a program
which achieved a maximum score in one s l o t of the
f i t n e s s v e c t o r , but zero i n a l l o t h e r s .
It
appeared to "know" only one aspect of the t a s k .
U n f o r t u n a t e l y , these s p e c i a l i s t s were a c t u a l l y PS
programs which were w i l d l y guessing in the sense
t h a t they contained an o v e r l y general r u l e which
suggested the same c l a s s i f i c a t i o n f o r every t r a i n ing case. A modified c r i t i c , designed to punish
t h i s i n d i s c r i m i n a t e guessing, was found to be too
strict.
By making
risk-taking
behavior
too
J. Schaffer and J. Grefenstette
dangerous, it led the GA to evolve programs which
produced no c l a s s i f i c a t i o n . This strategy at l e a s t
scores z e r o , which is b e t t e r than being excessively
punished. F i n a l l y , a j u d i c i o u s balance of reward
and punishment was achieved by i n c o r p o r a t i n g in the
c r i t i c a scoring scheme inspired by the Scholastic
Aptitude Test (SAT s c o r i n g ) . This scheme led the
GA to consistent success on problems i n v o l v i n g 2,
3, 4 and 5 classes.
595
the c r i t i c i s too l a x , only rewarding successes,
the GA evolves programs which guess w i l d l y to maximize the p o s s i b i l i t y of success. When the c r i t i c
is too s t r i c t , the GA q u i c k l y learns than doing
nothing is a good s t r a t e g y . Only j u d i c i o u s balancing of reward and punishment leads to e f f e c t i v e
learning.
REFERENCES
6. Results
The r e s u l t s of the experiments w i t h 2, 3, 4
and 5-class d i s c r i m i n a t i o n learning are summarized
in t a b l e 1. The f i g u r e s reported for number-ofe v a l u a t i o n s - t o - s o l u t i o n are averages from several
trials.
By successfully learning to solve the 5-class
problem, the value of the vector-valued feedback to
the l e a r n i n g component was demonstrated. It may be
of i n t e r e s t t h a t the only attempt to solve t h i s
problem by hand required a n o n - t r i v i a j . e f f o r t and
produced a s o l u t i o n program containing 16 r u l e s .
Although a l l s o l u t i o n s evolved by the GA used the
f u l l allotment of 11 r u l e s , in a l l cases many of
these r u l e s were not used in the s o l u t i o n of the
l e a r n i n g t a s k . (They never f i r e d . ) They represented
a kind of unexpressed genetic m a t e r i a l . The average number of f u n c t i o n a l rules in the s o l u t i o n s to
the 5-class task was 7 . 5 .
TABLE 1. RESULTS OF EXPERIMENTS IN MULTICLASS
PATTERN DISCRIMINATION LEARNING
WITH LS-2
Number of
classes to
discriminate
2
3
4
5
7.
Number of
evaluations
to learn
solution
1440
5647
15938
44309
Maximum
number
of r u l e s
4
6
8
11
Length of
knowledge
structures
(bits)
136
228
328
462
Conclusions
The use of a scalar c r i t i c l i m i t s the u s e f u l ness of GAs f o r m u l t i - o b j e c t i v e l e a r n i n g . The
scalar c r i t i c forces competition between knowledge
s t r u c t u r e s which contain rules f o r complementary
aspects of the learning t a s k . We have shown t h a t
GAs can be extended to perform m u l t i c r i t e r i a l e a r n ing through the use of a multidimensional feedback
mechanism from the c r i t i c and a m o d i f i c a t i o n of the
GA s e l e c t i o n procedure. A series of experiments
w i t h an implemented system show t h a t GAs are opport u n i s t i c learning a l g o r i t h m s , r e q u i r i n g c a r e f u l
design of the c r i t i c . In f a c t , the GA responded in
a reasonable manner to the various c r i t i c s .
When
1.
A. B. Bekey, C. Chang, J. Perry, and M. M.
H o f f e r , " P a t t e r n r e c o g n i t i o n of m u l t i p l e EMG
signals applied to the d e s c r i p t i o n of human
g a i t , " Proc. IEEE V o l . 65(5) (May 1977).
2.
A. D. Bethke, Genetic algorithms as f u n o t i o n
o p t i m i z e r s , Ph. D. Thesis, Dept. Computer and
Communication Sciences, Univ. of
Michigan
(1981).
3.
K. A. DeJong, "Adaptive system design: a
genetic approach," IEEE Trans. S y s t . , Man, and
Cyber. V o l .
SMC-10(9),
pp.566-574
(Sept.
1980).
4.
J. M. F i t z p a t r i c k , J. J. G r e f e n s t e t t e , and D.
Van-Gucht,
"Image r e g i s t r a t i o n by genetio
s e a r c h , " Proceedings of IEEE Southeastcon ' 8 4 ,
pp.460-464 ( A p r i l 1984).
5.
D. Goldberg, Computer aided gas
pipeline
operation using genetic algorithms and r u l e
l e a r n i n g , Ph. D. Thesis, Dept. C i v i l Eng.,
Univ. of Michigan (1983).
6.
J. H. Holland, Adaptation in Natural and
A r t i f i c i a l Systems, Univ. Michigan Press, Ann
Arbor (1975).
7.
M. L. Mauldin, "Maintaining d i v e r s i t y
in
genetic s e a r c h , " Proc. National Conf. on A I ,
AAAI 84, pp.247-250 (Aug. 1984).
8.
T. M. M i t c h e l l , " G e n e r a l i z a t i o n as Search,"
A r t i f i c i a l I n t e l l i g e n c e V o l . 18 (1982).
9.
J.D. Schaffer, Some experiments in machine
l e a r n i n g using vector evaluated genetic a l g o r i t h m s , Ph. D. Thesis, Dept. of E l e c t r i c a l
Engineering,
Vanderbilt
University
(Dec.
1984).
10.
J . D . Schaffer, " M u l t i p l e o b j e c t i v e o p t i m i z a t i o n w i t h vector-evaluated genetic a l g o r i t h m s , "
submitted to I n t ' 1 Conf. on Genetic Algorithms
and t h e i r A p p l i c a t i o n s , Carnegie-Mellon U n i v . ,
P i t t s b u r g h ( J u l y 1985).
11,
S. F. Smith, A l e a r n i n g system baaed on
genetic adaptive a l g o r i t h m s , Ph. D. T h e s i s ,
D e p t . Computer Science, Univ. of P i t t s b u r g h
(1980).
12.
S. F. Smith, " F l e x i b l e l e a r n i n g of problem
s o l v i n g h e u r i s t i c s through adaptive s e a r c h , "
Proc. of 8 t h IJCAI (Aug. 1983).
Download