MULTI-OBJECTIVE LEARNING VIA GENETIC ALGORITHMS J. David Schaffer Department of E l e c t r i c a l Engineering John J, Grefenstette Department of Computer Science Vanderbilt University N a s h v i l l e , TN 37235 ABSTRACT Genetic algorithms (GAs) are powerful, general purpose adaptive search techniques which have been used successfully in a v a r i e t y of l e a r n i n g systems. In the standard f o r m u l a t i o n , GAs maintain a set of a l t e r n a t i v e knowledge structures for the task to be l e a r n e d , and improved knowledge s t r u c t u r e s are formed through a combination of competition and knowledge sharing among the a l t e r n a t i v e knowledge s t r u c t u r e s . In t h i s paper, we extend the GA paradigm by allowing multidimensional feedback concerning the performance of the a l t e r n a t i v e s t r u c t u r e s . The modified GA is shown to solve a m u l t i c l a s s p a t t e r n d i s c r i m i n a t i o n task which could not be solved by the unmodified GA. 1 - Introduction Pattern c l a s s i f i c a t i o n is a c e n t r a l task in f l e x i b l e systems which incorporate an arsenal of problem solving techniques. For example, a given problem instance may need to be c l a s s i f i e d in order to decide which problem solving method should be applied. This paper concerns the task of m u l t i class p a t t e r n d i s c r i m i n a t i o n in a l e a r n i n g system. As in M i t c h e l l [ 8 ] , we view l e a r n i n g as a search process. But rather than searching a space of conc e p t s , we consider a learning system which searches a space of production system programs for programs which adequately accomplish the desired p a t t e r n classification. The search is accomplished by means of a genetic algorithm (GA). GAs are powerf u l adaptive search techniques which have been used s u c c e s s f u l l y in a v a r i e t y of l e a r n i n g systems [ 3 , 4 , 5 , 1 2 ] . In t h i s paper, GAs are extended in order to perform m u l t i - o b j e c t i v e learning in a p a t t e r n c l a s s i f i c a t i o n domain. 2• Learning v i a Genetic Search. This section contains a b r i e f d e s c r i p t i o n of GAs. More d e t a i l e d d e s c r i p t i o n s are a v a i l a b l e in the l i t e r a t u r e [ 3 , 6 , 7 , 1 2 ] . B r i e f l y , GAs may be viewed as adaptive generate-and-test procedures. GAs are adaptive in the sense t h a t the candidate s o l u t i o n s generated r e f l e c t and e x p l o i t i n f o r m a t i o n obtained by e a r l i e r t e s t s . A GA maintains a popul a t i o n o f knowledge s t r u c t u r e s ( e . g . a l t e r n a t i v e sets of production rules for a given task) and repeatedly (1) selects s t r u c t u r e s on the basis of observed performance, and (2) applies i d e a l i z e d genetic operators to the selected s t r u c t u r e s to construct new s t r u c t u r e s . For example, one import a n t genetic operator is crossover, by which subsets of r u l e s may be exchanged between two a l t e r n a t i v e knowledge s t r u c t u r e s . This r e s u l t s in a s o p h i s t i c a t e d search in which subsets of r u l e s which c o n t r i b u t e to good performance are propagated through the p o p u l a t i o n . Other genetic operators are described in Smith [ 1 2 ] . In t h i s paper, we concentrate on the influence of the performance feedback on step (1) above, the s e l e c t i o n of knowledge s t r u c t u r e s for r e p r o d u c t i o n . The c h a r a c t e r i z a t i o n of the power and l i m i t a t i o n s of GAs is an a c t i v e research area, but p r e l iminary t h e o r e t i c a l r e s u l t s are a v a i l a b l e . For example, the number of s t r u c t u r e s in the population which contain a given subset of r u l e s can be expected to increase or decrease over time at a r a t e p r o p o r t i o n a l to the average observed p e r f o r mance of a l l knowledge s t r u c t u r e s which c o n t a i n t h a t set o f r u l e s [ 1 1 ] . Thus, a l l subsets o f r u l e s appearing in the population of knowledge s t r u c t u r e s are explored simultaneously in a near-optimal f a s h i o n , a phenomenon which is c a l l e d i m p l i c i t p a r a l l e l i s m by H o l l a n d [ 7 ] . Bethke[2] describes some p r o p e r t i e s of search spaces which may be espec i a l l y hard for GAs. Smith[12] implemented a GA-based machine l e a r n i n g system c a l l e d L S - 1 . In L S - 1 , each s t r u c ture maintained by the GA represents a production system (PS) program. Each PS program is evaluated on the l e a r n i n g task by a c r i t i c which assigns a numerical measure of f i t n e s s to the evaluated p r o gram. When a l l of the PS programs in the c u r r e n t population have been evaluated, the GA is invoked to construct a new population of PS programs, and the cycle is repeated. (See Figure 1.) LS-1 succ e s s f u l l y learned PS programs f o r maze tasks and f o r draw poker. Our work extends the LS-1 system i n order t o achieve m u l t i - o b j e c t i v e l e a r n i n g . 3• The Task Domain Multi-class pattern discrimination was selected as a r e p r e s e n t a t i v e m u l t i - o b j e c t i v e l e a r n ing t a s k . The s p e c i f i c task under i n v e s t i g a t i o n was to c l a s s i f y muscle a c t i v i t y p a t t e r n s for f i v e human g a i t c l a s s e s , representing one normal and four abnormal g a i t t y p e s . A t r a i n i n g set of 11 t e s t cases was obtained from the l i t e r a t u r e [ 1 ] , eaoh t r a i n i n g case c o n s i s t i n g of a 1 2 - b l t s t r i n g derived from EMG s i g n a l s from leg muscles w h i l e walking. As In L S - 1 , we used a GA to search a 594 J. Schaffer and J. Grefenstette f o r each class represented in the t r a i n i n g s e t . We then modified the GA so that complementary knowledge s t r u c t u r e s would share knowledge (through the genetic operators) rather than compete directly. space o f knowledge s t r u c t u r e s ( i . e . , r u l e s e t s ) . The goal of the l e a r n i n g system was to f i n d a knowledge s t r u c t u r e which c o r r e c t l y c l a s s i f i e s the 11 t r a i n i n g cases. The performance of the l e a r n i n g system was measured by the number of knowledge s t r u c t u r e s tested before o b t a i n i n g a s o l u t i o n . By choosing subsets of the t r a i n i n g cases, we derived 2 - c l a s s , 3 - c l a s s , 4 - c l a s s , and 5 - c l a s s d i s c r i m i n a t i o n problems. Each experiment was s t a r t e d w i t h an initial population of randomly generated knowledge s t r u c t u r e s , in order to t e s t the power of the GA w i t h no i n i t i a l knowledge. (Our implementation system[9] does a l l o w the I n c o r p o r a t i o n of h e u r i s t i c knowledge i n t o the i n i t i a l population of knowledge structures.) 4. The Need for Multidimensional Feedbaok A s e r i e s of experiments was performed in which a GA using a scalar c r i t i c was applied to m u l t i class d i s c r i m i n a t i o n problems. Although the GA could solve 2 - c l a s s problems, i t could not solve the f u l l f i v e - c l a s s problem. An a n a l y s i s of the i n d i v i d u a l PS programs generated by the GA at v a r i ous times during search revealed a common p a t t e r n . Knowledge of how to recognize a p a r t i c u l a r class was f r e q u e n t l y absent from l a t e r PS programs, even when such knowledge was present in e a r l i e r p r o grams. The problem is t h a t s t r u c t u r e s which cont a i n complementary knowledge are forced to compete by a GA using a scalar c r i t i c . Consider t h i s simple example: Suppose t h a t the c r i t i c measures the f i t n e s s of each s t r u c t u r e by counting the number of t r a i n i n g cases which were c o r r e c t l y c l a s s i f i e d . Suppose program P1 contains r u l e s which c o r r e c t l y c l a s s i f y classes A and B, and program P2 c o r r e c t l y c l a s s i f i e s only instances i n class C . If a l l classes are e q u a l l y represented in the t r a i n i n g s e t , then program P1 appears to be twice as " f i t " as program P2. Since the GA s e l e c t s programs f o r reproduction on the basis of the f i t n e s s assigned by the c r i t i c , P1 w i l l tend to c o n t r i b u t e r u l e subsets to twioe as many new programs in the next population as w i l l P2. As the number of classes i n c r e a s e s , s p e c i a l i z e d knowledge ( l i k e t h a t in P2) may tend t o s u f f e r e x t i n c t i o n , f i n a l l y r e s u l t i n g i n suboptimal performance f o r the l e a r n i n g system as a whole. Our s o l u t i o n was to modify the o r i t i c so t h a t a vector of performance measures was computed f o r eaoh s t r u o t u r e , w i t h one s l o t i n the f i t n e s s vector An i n t e r e s t i n g question emerges at t h i s p o i n t which was never an issue w i t h scalar c r i t i c s . Where should any punishment for i n c o r r e c t behavior be applied? Consider a knowledge s t r u c t u r e which i n c o r r e c t l y c l a s s i f i e s a class A case as class B. By applying the penalty to the A s l o t of the reward v e c t o r , we are punishing the f a i l u r e to do the r i g h t t h i n g . By applying it to the B s l o t , we are punishing doing the wrong t h i n g . The former s t r a tegy was adopted for a l l subsequent experiments, arguing t h a t class X t r a i n i n g cases should c o n t r i b u t e , p o s i t i v e l y or n e g a t i v e l y , only to the X s l o t of the reward v e c t o r . The m o d i f i c a t i o n to the basic GA is as f o l lows: instead of s e l e c t i n g s t r u c t u r e s f o r reproduct i o n on the basis of a s i n g l e f i t n e s s measure, a p o r t i o n of each new population is selected on the basis of each s l o t in the f i t n e s s v e c t o r . Note t h a t if a s t r u c t u r e scores w e l l on several measu r e s , then it w i l l tend to be chosen in several s e l e c t i o n phases. However, s t r u c t u r e s which perform w e l l on even one measure w i l l be given the opportunity to pass along t h e i r s p e c i a l i z e d knowledge. A f t e r the s e l e c t i o n phases, s t r u c t u r e s are combined v i a genetic operator j u s t as in L S - 1 . As a r e s u l t , our modified GA performs m u l t i o b j e c t i v e o p t i m i z a t i o n in the space of knowledge structures [ 1 0 ] . 5. The C r i t i c One important property of the c r i t i c in a GAbased system is t h a t the f i t n e s s reported by the c r i t i c must r e f l e c t more than j u s t success or f a i l u r e on the t a s k . Otherwise, the GA is unable to i d e n t i f y promising programs in the e a r l y stages when successes are r a r e , and cannot d i s c r i m i n a t e the b e t t e r programs in the l a t e r stages when they are p l e n t i f u l . One source of information of t h i s type is the amount of u n c e r t a i n t y e x h i b i t e d by a PS program while attempting the t a s k , where uncert a i n t y is defined as the extent to which c o n f l i c t r e s o l u t i o n is required in the decision process. A good c r i t i c should not discourage t h i s u n c e r t a i n t y in the e a r l y stages but should discourage it in the l a t e r stages. Some experiments w i t h d i f f e r e n t c r i t i c s of t h i s s o r t revealed something of the power of the GA to e x p l o i t subtle features in the c r i t i c . For example, a c r i t i c which applied a d e s i r a b l e u n c e r t a i n t y c o r r e c t i o n , but only rewarded success, was found to y i e l d populations r i o h in programs we c a l l e d s p e c i a l i s t s . A s p e c i a l i s t was a program which achieved a maximum score in one s l o t of the f i t n e s s v e c t o r , but zero i n a l l o t h e r s . It appeared to "know" only one aspect of the t a s k . U n f o r t u n a t e l y , these s p e c i a l i s t s were a c t u a l l y PS programs which were w i l d l y guessing in the sense t h a t they contained an o v e r l y general r u l e which suggested the same c l a s s i f i c a t i o n f o r every t r a i n ing case. A modified c r i t i c , designed to punish t h i s i n d i s c r i m i n a t e guessing, was found to be too strict. By making risk-taking behavior too J. Schaffer and J. Grefenstette dangerous, it led the GA to evolve programs which produced no c l a s s i f i c a t i o n . This strategy at l e a s t scores z e r o , which is b e t t e r than being excessively punished. F i n a l l y , a j u d i c i o u s balance of reward and punishment was achieved by i n c o r p o r a t i n g in the c r i t i c a scoring scheme inspired by the Scholastic Aptitude Test (SAT s c o r i n g ) . This scheme led the GA to consistent success on problems i n v o l v i n g 2, 3, 4 and 5 classes. 595 the c r i t i c i s too l a x , only rewarding successes, the GA evolves programs which guess w i l d l y to maximize the p o s s i b i l i t y of success. When the c r i t i c is too s t r i c t , the GA q u i c k l y learns than doing nothing is a good s t r a t e g y . Only j u d i c i o u s balancing of reward and punishment leads to e f f e c t i v e learning. REFERENCES 6. Results The r e s u l t s of the experiments w i t h 2, 3, 4 and 5-class d i s c r i m i n a t i o n learning are summarized in t a b l e 1. The f i g u r e s reported for number-ofe v a l u a t i o n s - t o - s o l u t i o n are averages from several trials. By successfully learning to solve the 5-class problem, the value of the vector-valued feedback to the l e a r n i n g component was demonstrated. It may be of i n t e r e s t t h a t the only attempt to solve t h i s problem by hand required a n o n - t r i v i a j . e f f o r t and produced a s o l u t i o n program containing 16 r u l e s . Although a l l s o l u t i o n s evolved by the GA used the f u l l allotment of 11 r u l e s , in a l l cases many of these r u l e s were not used in the s o l u t i o n of the l e a r n i n g t a s k . (They never f i r e d . ) They represented a kind of unexpressed genetic m a t e r i a l . The average number of f u n c t i o n a l rules in the s o l u t i o n s to the 5-class task was 7 . 5 . TABLE 1. RESULTS OF EXPERIMENTS IN MULTICLASS PATTERN DISCRIMINATION LEARNING WITH LS-2 Number of classes to discriminate 2 3 4 5 7. Number of evaluations to learn solution 1440 5647 15938 44309 Maximum number of r u l e s 4 6 8 11 Length of knowledge structures (bits) 136 228 328 462 Conclusions The use of a scalar c r i t i c l i m i t s the u s e f u l ness of GAs f o r m u l t i - o b j e c t i v e l e a r n i n g . The scalar c r i t i c forces competition between knowledge s t r u c t u r e s which contain rules f o r complementary aspects of the learning t a s k . We have shown t h a t GAs can be extended to perform m u l t i c r i t e r i a l e a r n ing through the use of a multidimensional feedback mechanism from the c r i t i c and a m o d i f i c a t i o n of the GA s e l e c t i o n procedure. A series of experiments w i t h an implemented system show t h a t GAs are opport u n i s t i c learning a l g o r i t h m s , r e q u i r i n g c a r e f u l design of the c r i t i c . In f a c t , the GA responded in a reasonable manner to the various c r i t i c s . When 1. A. B. Bekey, C. Chang, J. Perry, and M. M. H o f f e r , " P a t t e r n r e c o g n i t i o n of m u l t i p l e EMG signals applied to the d e s c r i p t i o n of human g a i t , " Proc. IEEE V o l . 65(5) (May 1977). 2. A. D. Bethke, Genetic algorithms as f u n o t i o n o p t i m i z e r s , Ph. D. Thesis, Dept. Computer and Communication Sciences, Univ. of Michigan (1981). 3. K. A. DeJong, "Adaptive system design: a genetic approach," IEEE Trans. S y s t . , Man, and Cyber. V o l . SMC-10(9), pp.566-574 (Sept. 1980). 4. J. M. F i t z p a t r i c k , J. J. G r e f e n s t e t t e , and D. Van-Gucht, "Image r e g i s t r a t i o n by genetio s e a r c h , " Proceedings of IEEE Southeastcon ' 8 4 , pp.460-464 ( A p r i l 1984). 5. D. Goldberg, Computer aided gas pipeline operation using genetic algorithms and r u l e l e a r n i n g , Ph. D. Thesis, Dept. C i v i l Eng., Univ. of Michigan (1983). 6. J. H. Holland, Adaptation in Natural and A r t i f i c i a l Systems, Univ. Michigan Press, Ann Arbor (1975). 7. M. L. Mauldin, "Maintaining d i v e r s i t y in genetic s e a r c h , " Proc. National Conf. on A I , AAAI 84, pp.247-250 (Aug. 1984). 8. T. M. M i t c h e l l , " G e n e r a l i z a t i o n as Search," A r t i f i c i a l I n t e l l i g e n c e V o l . 18 (1982). 9. J.D. Schaffer, Some experiments in machine l e a r n i n g using vector evaluated genetic a l g o r i t h m s , Ph. D. Thesis, Dept. of E l e c t r i c a l Engineering, Vanderbilt University (Dec. 1984). 10. J . D . Schaffer, " M u l t i p l e o b j e c t i v e o p t i m i z a t i o n w i t h vector-evaluated genetic a l g o r i t h m s , " submitted to I n t ' 1 Conf. on Genetic Algorithms and t h e i r A p p l i c a t i o n s , Carnegie-Mellon U n i v . , P i t t s b u r g h ( J u l y 1985). 11, S. F. Smith, A l e a r n i n g system baaed on genetic adaptive a l g o r i t h m s , Ph. D. T h e s i s , D e p t . Computer Science, Univ. of P i t t s b u r g h (1980). 12. S. F. Smith, " F l e x i b l e l e a r n i n g of problem s o l v i n g h e u r i s t i c s through adaptive s e a r c h , " Proc. of 8 t h IJCAI (Aug. 1983).