Dr. Eick's Introduction to Machine Learning with EC

Ch. Eick: Evolutionary Machine Learning
Learning Paradigms and
General Aspects of Learning



Different Forms of Learning:
• Learning agent receives feedback with respect to its actions
(e.g. from a teacher)
– Supervised Learning: feedback is received with respect
to all possible actions of the agent
– Reinforcement Learning: feedback is only received with
respect to the taken action of the agent
• Unsupervised Learning: Learning when there is no hint at all
about the correct action
Inductive Learning is a form of supervised learning that centers
on learning a function based on sets of training examples.
Popular inductive learning techniques include decision trees,
neural networks, nearest neighbor approaches, discriminant
analysis, and regression.
The performance of an inductive learning system is usually
evaluated using n-fold cross-validation.
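As a rough illustration of n-fold cross-validation, the sketch below splits a toy data set into n folds, holds each fold out once, and averages the accuracy. The helper names (`n_fold_cross_validation`, `nearest_neighbor`) and the toy data are assumptions made for this example only; they do not come from the lecture.

```python
import random

def n_fold_cross_validation(examples, train_and_predict, n=5, seed=0):
    """Estimate accuracy by averaging over n held-out folds."""
    data = list(examples)
    random.Random(seed).shuffle(data)
    folds = [data[i::n] for i in range(n)]          # n disjoint folds
    accuracies = []
    for i, test_fold in enumerate(folds):
        # Train on every fold except the i-th, test on the i-th.
        training = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        correct = sum(train_and_predict(training, x) == y for x, y in test_fold)
        accuracies.append(correct / len(test_fold))
    return sum(accuracies) / n

def nearest_neighbor(training, x):
    """Hypothetical learner: 1-nearest-neighbor on a single numeric feature."""
    return min(training, key=lambda ex: abs(ex[0] - x))[1]

# Toy data (assumed): the label is 1 when the feature exceeds 10.
toy = [(v, int(v > 10)) for v in range(20)]
accuracy = n_fold_cross_validation(toy, nearest_neighbor, n=5)
```

Every example is used for testing exactly once, so the estimate does not depend on a single lucky or unlucky train/test split.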
Classifier Systems




According to Goldberg [113], a classifier system is “a machine learning system
that learns syntactically simple string rules to guide its performance in an
arbitrary environment”.
A classifier system consists of three main components:
• Rule and message system
• Apportionment of credit system
• Genetic algorithm (for evolving classifiers)
First implemented in a system called CS1 by Holland and Reitman (1978).
Example of classifier rules (condition:action, where # matches either bit):
00##:0000
00#0:1100
11##:1000
##00:0001
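A minimal sketch of how such rules could be matched against a binary message, assuming the usual reading in which the part before the colon is the condition, the part after it is the action, and # matches either bit. The function names `matches` and `match_set` are chosen for this example.

```python
def matches(condition, message):
    """True if every condition symbol is '#' or equals the message bit."""
    return len(condition) == len(message) and all(
        c == "#" or c == m for c, m in zip(condition, message)
    )

def match_set(rules, message):
    """Return the (condition, action) pairs whose condition matches the message."""
    matched = []
    for rule in rules:
        condition, action = rule.split(":")
        if matches(condition, message):
            matched.append((condition, action))
    return matched

rules = ["00##:0000", "00#0:1100", "11##:1000", "##00:0001"]

print(match_set(rules, "0010"))  # [('00##', '0000'), ('00#0', '1100')]
print(match_set(rules, "1100"))  # [('11##', '1000'), ('##00', '0001')]
```

Several rules can match the same message, which is why a conflict-resolution step such as an auction is needed.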


The fitness of a classifier is defined by its surrounding environment, which
pays payoffs to classifiers and extracts fees from them.
Classifier systems employ the Michigan approach (populations consist of single
rules) in the context of an externally defined fitness function.
Challenges in Developing Michigan-style Classifier Systems
Example (figure): rules r1 to r6 and a reward of +5.






• We need a set of rules that can solve problems collaboratively, comparable
to finding a good soccer or baseball team.
• We want a set of rules that covers all the important situations, not a
set of rules that can only handle very specialized situations; coverage is an
important issue!
• Delayed rewards pose particular problems.
• Only rules responsible for a chosen decision should be rewarded or penalized.
• ‘Lazy’ and inactive rules need to be removed.
• Rules whose reward behavior is predictable are preferable over rules whose
reward behavior is harder to predict (see prediction error and experience in XCS).
Bucket Brigade Algorithm




Developed by Holland for the apportionment of credit; it relies on
the model of a service economy and consists of two main components:
an auction and a clearing house.
The environment as well as the classifiers post messages.
Each classifier maintains a bank account that measures its strength.
Classifiers that match a posted string make a bid proportional to their
strength. Usually, the highest-bidding classifier is selected to post its
message (other, more parallel schemes are also used).
The auction permits appropriate classifiers to post their messages.
Once a classifier is selected for activation, it must clear its payments
through a clearing house, paying its bid to other classifiers or the
environment for matching messages rendered. A matched and
activated classifier sends its bid to those classifiers responsible for
sending messages that matched the bidding classifier’s conditions. The
bid money sent is distributed in some manner among those
classifiers.
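The auction and clearing-house steps can be sketched roughly as follows. This is a simplified illustration, not Holland's implementation: the `Classifier` fields, the bid coefficient `c_bid`, the even split among suppliers, and the winner-takes-all auction are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Classifier:
    condition: str   # string over {0, 1, #}
    message: str     # message posted when the classifier is activated
    strength: float  # "bank account" of the classifier

def matches(condition, message):
    return len(condition) == len(message) and all(
        c == "#" or c == m for c, m in zip(condition, message)
    )

def run_auction(classifiers, posted, suppliers, c_bid=0.1):
    """Select the highest bidder for a posted message and clear its payment.

    suppliers: classifiers that posted the message; an empty list means the
    message came from the environment, which then receives the bid.
    """
    bidders = [c for c in classifiers if matches(c.condition, posted)]
    if not bidders:
        return None
    winner = max(bidders, key=lambda c: c_bid * c.strength)
    bid = c_bid * winner.strength
    winner.strength -= bid                     # clearing house: winner pays
    for supplier in suppliers:                 # assumed: bid is split evenly
        supplier.strength += bid / len(suppliers)
    return winner

a = Classifier("00##", "1100", 100.0)
b = Classifier("11##", "0001", 50.0)
c = Classifier("1#00", "1111", 80.0)
winner = run_auction([a, b, c], posted="1100", suppliers=[a])
```

Here b and c both match the message posted by a; c bids more (8 versus 5), wins, and pays its bid to a.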
Bucket Brigade (continued)





Rules that cooperate with a classifier are rewarded by receiving the
classifier’s bid. The last classifier in a chain receives the environmental
reward; all the other classifiers receive the reward from their
predecessor.
A classifier’s strength might be subject to taxation. The idea that
underlies taxation is to punish inactive classifiers: T_i(t) := c_tax * S_i(t)
The strength of a classifier is updated using the following equation:
S_i(t+1) = S_i(t) - P_i(t) - T_i(t) + R_i(t)
where P_i(t) is the payment (bid) made by classifier i, T_i(t) its tax, and
R_i(t) the reward it receives.
A classifier bids proportionally to its strength: B_i = c_bid * S_i
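A small worked example of the strength update, written as a sketch. The coefficient values (c_bid = 0.1, c_tax = 0.01) and the reward of 15 are assumed for illustration and are not taken from the lecture.

```python
def bid(strength, c_bid=0.1):
    """B_i = c_bid * S_i"""
    return c_bid * strength

def update_strength(strength, payment, reward, c_tax=0.01):
    """S_i(t+1) = S_i(t) - P_i(t) - T_i(t) + R_i(t), with T_i(t) = c_tax * S_i(t)."""
    tax = c_tax * strength
    return strength - payment - tax + reward

# An active classifier with strength 100 pays its bid (10), is taxed (1),
# and receives a reward of 15 from the next classifier in the chain.
active = update_strength(100.0, payment=bid(100.0), reward=15.0)

# An inactive classifier pays no bid and earns nothing, so the tax alone
# slowly drains its strength.
inactive = update_strength(100.0, payment=0.0, reward=0.0)
```

With these numbers the active classifier ends at about 104 and the inactive one at about 99, which shows how taxation alone erodes rules that never fire.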
Genetic algorithms are used to evolve classifiers. A classifier’s strength
defines its fitness: fitter classifiers reproduce with higher probability
(e.g., roulette wheel selection might be employed), and binary string mutation
and crossover operators are used to generate new classifiers. Newly
generated classifiers replace weak, low-strength classifiers (other
schemes such as crowding could also be employed).
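One generation step of such a genetic algorithm might look like the sketch below. It assumes roulette wheel selection on strength, one-point crossover on the concatenated condition and action strings, per-symbol mutation, and replacement of the single weakest classifier. The initial strength given to the child (the parents' average) and the mutation rate are arbitrary choices made for this example.

```python
import random

def roulette_select(population, rng):
    """Pick a (condition, action, strength) tuple with probability ~ strength."""
    total = sum(strength for _, _, strength in population)
    pick = rng.uniform(0, total)
    running = 0.0
    for individual in population:
        running += individual[2]
        if pick <= running:
            return individual
    return population[-1]

def mutate(condition, action, rng, rate):
    """Conditions use the alphabet {0, 1, #}; actions are plain bit strings."""
    new_cond = "".join(rng.choice("01#") if rng.random() < rate else ch
                       for ch in condition)
    new_act = "".join(rng.choice("01") if rng.random() < rate else ch
                      for ch in action)
    return new_cond, new_act

def ga_step(population, rng, rate=0.05):
    p1 = roulette_select(population, rng)
    p2 = roulette_select(population, rng)
    n = len(p1[0])
    genome1, genome2 = p1[0] + p1[1], p2[0] + p2[1]
    point = rng.randrange(1, len(genome1))          # one-point crossover
    child = genome1[:point] + genome2[point:]
    condition, action = mutate(child[:n], child[n:], rng, rate)
    weakest = min(range(len(population)), key=lambda i: population[i][2])
    offspring = list(population)
    # Assumed: the child starts with the average strength of its parents.
    offspring[weakest] = (condition, action, (p1[2] + p2[2]) / 2)
    return offspring

rng = random.Random(42)
population = [("00##", "0000", 12.0), ("00#0", "1100", 30.0),
              ("11##", "1000", 3.0), ("##00", "0001", 18.0)]
next_population = ga_step(population, rng)
```

Only the weakest rule (here the one with strength 3.0) is overwritten, so the rest of the population and its accumulated strengths survive the step.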
Pittsburgh-style Systems






Populations consist of rule-sets, and not of individual rules.
No bucket brigade algorithm is necessary.
Mechanisms to evaluate individual rules are usually missing.
Michigan-style systems are geared towards applications with
dynamically changing requirements (“models of adaptation”); Pitt-style
systems rely on more static environments and assume a fixed fitness
function for rule-sets, which is not necessary in the Michigan approach.
Pittsburgh approach systems usually have to cope with variable length
chromosomes.
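To make the contrast with the Michigan approach concrete, the sketch below treats a whole rule-set as one individual: fitness is computed for the set as a whole, and crossover cuts at rule boundaries, so offspring can be longer or shorter than their parents. The first-match classification scheme, the fitness measure, and all names are assumptions for illustration; the systems listed below each use their own representations and operators.

```python
import random

def matches(condition, message):
    return len(condition) == len(message) and all(
        c == "#" or c == m for c, m in zip(condition, message)
    )

def ruleset_fitness(ruleset, examples):
    """Fraction of examples whose first matching rule gives the right action."""
    correct = 0
    for message, label in examples:
        for condition, action in ruleset:
            if matches(condition, message):
                correct += action == label
                break
    return correct / len(examples)

def ruleset_crossover(a, b, rng):
    """Cut both parents at rule boundaries; the child length may differ."""
    cut_a = rng.randint(0, len(a))
    cut_b = rng.randint(0, len(b))
    child = a[:cut_a] + b[cut_b:]
    return child or [rng.choice(a + b)]   # never return an empty rule-set

examples = [("0010", "0000"), ("1100", "1000")]
parent_a = [("00##", "0000"), ("11##", "1000")]
parent_b = [("##00", "0001"), ("00#0", "1100"), ("11##", "1000")]

rng = random.Random(1)
child = ruleset_crossover(parent_a, parent_b, rng)
fitness_a = ruleset_fitness(parent_a, examples)
fitness_b = ruleset_fitness(parent_b, examples)
```

Because only the rule-set receives a fitness value, no per-rule credit assignment such as the bucket brigade is involved.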
Popular Pittsburgh-style systems include:
• Smith’s LS-1 system (learns symbolic rule-sets)
• Janikow’s GIL system (learns symbolic rules; employs operators of
Michalski’s inductive learning theory as its genetic operators)
• Giordana and Saitta’s REGAL (learns symbolic concept descriptions)
• DELVAUX (learns (numerical) Bayesian rule-sets)
New Trends in
Learning Classifier Systems (LCS)




Holland-style LCS work is very similar to work in reinforcement learning,
especially Evolutionary Reinforcement Learning and an approach called
“Q-Learning”. Newer papers claim that “bucket brigade” and “Q-Learning”
are basically the same thing, and that LCS can benefit from recent
advances in the area of Q-learning.
Wilson’s accuracy-based XCS has received significant attention in the
literature (to be covered later).
Holland stresses the adaptive component of “his invention” in his newer
work.
Recently, many Pittsburgh-style systems have been designed that learn
rule-based systems using evolutionary computing and that are quite
different from Holland’s data-driven message-passing systems, such as:
• Systems that learn Bayesian rules or Bayesian belief networks
• Systems that learn fuzzy rules
• Systems that learn first-order logic rules
• Systems that learn PROLOG-style programs
Work somewhat similar to classifier systems has become quite popular in
the field of agent-based systems that have to learn how to communicate and
collaborate in a distributed environment.
Important Parameters for XCS
XCS learns and maintains the following parameters for each of its classifiers
during the course of its operation:
• p is the expected payoff; combined with the rule’s fitness value, it has a
strong influence on whether a matching classifier’s action is selected for
execution.
• e is the error made in predicting the payoffs.
• F (called fitness) denotes a classifier’s “normalized accuracy”; accuracy is
the inverse of the degree of error made by a classifier. F combined with a
determines which classifiers are chosen to be deleted from the population.
F*p determines which actions of competing classifiers are selected for
execution.
• a estimates the average size of the action sets this classifier has belonged
to; the smaller a/F is, the less likely it becomes that this classifier is
deleted.
• exp (experience) counts how often the classifier has belonged to the action
set; it has some influence on the estimation of the other parameters: if exp
is low, default values are used when estimating the other parameters
(especially e, F, and a).
• Moreover, it is important to know that only classifiers belonging to the
action set are considered for reproduction.
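The roles of F, p, and a can be illustrated with a small sketch. Action selection here uses a fitness-weighted average of the payoff predictions per action, and deletion uses a vote proportional to a/F as described above. This is a simplification under stated assumptions: the class name, the example values, and the plain a/F vote are chosen for this example, and a full XCS implementation also takes experience and other factors into account.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class XCSClassifier:
    action: str
    p: float      # expected payoff
    e: float      # prediction error
    F: float      # fitness (normalized accuracy), assumed > 0
    a: float      # average action-set size
    exp: int = 0  # experience

def prediction_array(match_set):
    """Fitness-weighted payoff prediction for each action in the match set."""
    weighted = defaultdict(float)
    fitness_sum = defaultdict(float)
    for cl in match_set:
        weighted[cl.action] += cl.F * cl.p
        fitness_sum[cl.action] += cl.F
    return {act: weighted[act] / fitness_sum[act] for act in weighted}

def best_action(match_set):
    predictions = prediction_array(match_set)
    return max(predictions, key=predictions.get)

def deletion_probabilities(population):
    """Assumed simplification: deletion vote proportional to a / F."""
    votes = [cl.a / cl.F for cl in population]
    total = sum(votes)
    return [vote / total for vote in votes]

match_set = [
    XCSClassifier("0000", p=100.0, e=5.0, F=0.9, a=3.0),
    XCSClassifier("0000", p=20.0, e=40.0, F=0.1, a=3.0),
    XCSClassifier("1100", p=60.0, e=10.0, F=0.5, a=6.0),
]
predictions = prediction_array(match_set)
chosen = best_action(match_set)
probabilities = deletion_probabilities(match_set)
```

In this example the accurate classifier dominates the prediction for action 0000, while the inaccurate one (low F) receives the largest deletion probability.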