
ARTIFICIAL INTELLIGENCE
[INTELLIGENT AGENTS PARADIGM]
LEARNING AGENTS
Professor Janis Grundspenkis
Riga Technical University
Faculty of Computer Science and Information Technology
Institute of Applied Computer Systems
Department of Systems Theory and Design
E-mail: Janis.Grundspenkis@rtu.lv
LEARNING FROM
OBSERVATIONS
Learning in intelligent agents is essential for dealing
with unknown environments (it compensates for the
designer's incomplete knowledge of the agent's
environment).
The idea behind learning is that percepts should be
used not only for acting, but also for improving
the agent's ability to act in the future.
Learning takes place as a result of the interaction
between the agent and the world, and from the
agent's observation of its own decision-making process.
LEARNING AGENT (1)
Four conceptual components:
• LEARNING ELEMENT is responsible for
making improvements, i.e., for making the
performance element more effective.
• PERFORMANCE ELEMENT is responsible for
selecting external actions.
• CRITIC is designed to tell the learning element
how well the agent is doing.
• PROBLEM GENERATOR is responsible for
suggesting actions that will lead to new and
informative experience.
LEARNING AGENT (2)
[Diagram: general model of a learning agent. Inside the agent, the Critic compares
percepts from the Sensors with the external performance standard and sends feedback
to the Learning element; the Learning element makes changes to the Performance
element's knowledge and passes learning goals to the Problem generator; the
Performance element receives percepts from the Sensors and acts on the Environment
through the Effectors.]
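These four components can be wired together in a single agent loop. The following Python skeleton is only a structural sketch of the diagram above; the class and method names (evaluate, learn, goals, suggest, select_action) are our own, and concrete critics, learning elements, performance elements and problem generators would have to be plugged in.

class LearningAgent:
    """Structural sketch of the learning agent shown in the diagram above."""

    def __init__(self, performance_element, learning_element, critic, problem_generator):
        self.performance_element = performance_element  # selects external actions
        self.learning_element = learning_element        # improves the performance element
        self.critic = critic                            # judges behaviour against the performance standard
        self.problem_generator = problem_generator      # suggests exploratory actions

    def step(self, percept):
        # The critic compares the percept with the performance standard and produces feedback.
        feedback = self.critic.evaluate(percept)
        # The learning element uses the feedback to change the performance element's
        # knowledge and to set learning goals for the problem generator.
        self.learning_element.learn(feedback, self.performance_element)
        learning_goals = self.learning_element.goals()
        # Either explore (problem generator) or exploit (performance element).
        exploratory_action = self.problem_generator.suggest(learning_goals)
        if exploratory_action is not None:
            return exploratory_action
        return self.performance_element.select_action(percept)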
THE DESIGN OF THE
LEARNING ELEMENT
Four major issues:
• Which components of the
performance element are to be
improved.
• What representation is used for those
components.
• What feedback is available.
• What prior knowledge is available.
COMPONENTS OF THE
PERFORMANCE ELEMENT (1)
1. A direct mapping from conditions
on the current state to actions.
2. A means to infer relevant
properties of the world from the
percept sequence.
3. Information about the way the
world evolves.
COMPONENTS OF THE
PERFORMANCE ELEMENT (2)
4. Utility information indicating the
desirability of world states.
5. Action-value information indicating
the desirability of particular actions in
particular states.
6. Goals that describe classes of states
whose achievement maximizes the
agent’s utility.
COMPONENTS OF THE
PERFORMANCE ELEMENT (3)
Each of the components can be learned, given the
appropriate feedback.
For example, if the agent does an action and then
perceives the resulting state of the
environment, this information can be used to
learn a description of the results of its actions
(the third component, how the world evolves).
If the critic can use the performance standard to
deduce utility values from the percepts, then
the agent can learn a useful representation of its
utility function (the fourth component).
COMPONENTS OF THE
PERFORMANCE ELEMENT (4)
Each of the components of the performance
element can be described mathematically as a
function.
Learning any particular component of the
performance element can be seen as learning an
accurate representation of a function.
The difficulty of learning depends on the chosen
representation. Functions can be represented by
logical sentences, belief networks, neural
networks, etc.
COMPONENTS OF THE
PERFORMANCE ELEMENT (5)
Learning takes many forms, depending on the
nature of the performance element, the available
feedback, and the available knowledge.
For example, supervised learning – any situation in
which both the inputs and outputs of a
component can be perceived.
If the agent receives some evaluation of its actions
but is not told the correct action (for example,
when learning the condition-action component),
this is called reinforcement learning.
INDUCTIVE LEARNING
Learning a function (constructing a description of a
function) from a set of input/output examples is
called inductive learning.
An example is a pair (x, f(x)), where x is the input
and f(x) is the output of the function applied to x.
The task of induction is this: given a collection of
examples of f return a function h that
approximates f.
The function h is called a hypothesis.
EXAMPLES AND DIFFERENT
HYPOTHESES
[Figure: panels (a)-(e) plot the same set of example points (x, f(x)); each panel
shows a different hypothesis h fitted to them.]
The true function f is unknown, so there are many choices for h.
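To make the point concrete, here is a small illustration of our own (not from the slides), using NumPy: the same four (x, f(x)) examples are consistent with more than one hypothesis, for instance a cubic fits them exactly while a straight line does not.

import numpy as np

# Four examples (x, f(x)) of an unknown function f.
xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.array([1.0, 2.0, 1.0, 4.0])

for degree in (1, 3):
    coeffs = np.polyfit(xs, ys, degree)           # hypothesis h: a polynomial of the given degree
    errors = np.abs(np.polyval(coeffs, xs) - ys)  # how far h is from the examples
    print(f"degree {degree}: max error on the examples = {errors.max():.3f}")

# The degree-3 hypothesis is consistent with all four examples (error ~0);
# the degree-1 hypothesis is simpler but not consistent with them.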
REFLEX LEARNING AGENT
EXAMPLES are (percept, action) pairs.
procedure REFLEX-LEARNING-ELEMENT (percept, action)
  INPUTS: percept, feedback percept
          action, feedback action
  EXAMPLES ← EXAMPLES ∪ { (percept, action) }

function REFLEX-PERFORMANCE-ELEMENT (percept) returns an action
  IF (percept, action) is in EXAMPLES THEN return action
  ELSE h ← INDUCE(EXAMPLES)   /* call the learning algorithm; the agent uses h to choose the action */
       return h(percept)
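A runnable Python sketch of the two procedures above. The slides leave INDUCE unspecified, so a trivial stand-in learner is used here (it returns the action of the most similar stored percept); all names are ours.

EXAMPLES = []  # stored (percept, action) pairs; percepts are attribute dictionaries

def reflex_learning_element(percept, action):
    """Add the feedback pair (percept, action) to the example set."""
    if (percept, action) not in EXAMPLES:
        EXAMPLES.append((percept, action))

def induce(examples):
    """Trivial stand-in for INDUCE: the hypothesis h returns the action whose
    stored percept shares the most attribute values with the input percept."""
    def h(percept):
        if not examples:
            return None
        def overlap(stored):
            return len(set(stored.items()) & set(percept.items()))
        _, best_action = max(examples, key=lambda ex: overlap(ex[0]))
        return best_action
    return h

def reflex_performance_element(percept):
    """Return the stored action for an already-seen percept; otherwise induce h and use it."""
    for stored_percept, stored_action in EXAMPLES:
        if stored_percept == percept:
            return stored_action
    h = induce(EXAMPLES)
    return h(percept)

# Example use (hypothetical percepts):
# reflex_learning_element({"light": "red", "road": "clear"}, "brake")
# reflex_performance_element({"light": "red", "road": "busy"})  # -> "brake" via the induced h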
LEARNING DECISION TREES
A decision tree takes as input an object or situation
described by a set of properties, and outputs a
YES/NO decision.
Decision trees are an effective method for learning
deterministic Boolean functions.
Each internal node in the tree corresponds to a test
of the value of one of the properties (attributes).
Each leaf node in the tree specifies the Boolean
value to be returned if that leaf is reached.
EXAMPLE OF THE DECISION
TREE LEARNING (1)
The problem: whether to wait for a table in
restaurant.
The aim is to learn a definition for the goal predicate
WILL WAIT, where the definition is expressed
as a decision tree.
The examples are described by the following
attributes:
1. Alternate: whether there is a suitable alternative
restaurant nearby.
2. Bar: whether the restaurant has a comfortable
bar area to wait in.
EXAMPLE OF THE DECISION
TREE LEARNING (2)
3. Friday/Saturday: true on Fridays and Saturdays.
4. Hungry: whether we are hungry.
5. Patrons: how many people are in the restaurant (values
are NONE, SOME, and FULL).
6. Price: the restaurant’s price range ($, $$, $$$).
7. Raining: whether it is raining outside.
8. Reservation: whether we made a reservation.
9. Type: the kind of restaurant (French, Italian, Thai, or
Burger).
10. Wait Estimate: the wait estimate by the host (0-10
minutes, 10-30 minutes, 30-60, > 60).
A DECISION TREE FOR DECIDING
WHETHER TO WAIT FOR A TABLE
Patrons?
  None: No
  Some: Yes
  Full: WaitEstimate?
    >60:   No
    30-60: Alternate?
      No:  Reservation?
        No:  Bar?
          No:  No
          Yes: Yes
        Yes: Yes
      Yes: Fri/Sat?
        No:  No
        Yes: Yes
    10-30: Hungry?
      No:  Yes
      Yes: Alternate?
        No:  Yes
        Yes: Raining?
          No:  No
          Yes: Yes
    0-10:  Yes
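The tree above can also be written down directly as a data structure. A sketch in Python using a representation of our own choosing: each internal node is a pair (attribute, branches) and each leaf is a Boolean.

# (attribute, {value: subtree-or-leaf}); leaves are True (Yes) / False (No)
WILL_WAIT_TREE = (
    "Patrons", {
        "None": False,
        "Some": True,
        "Full": ("WaitEstimate", {
            ">60": False,
            "30-60": ("Alternate", {
                "No": ("Reservation", {
                    "No": ("Bar", {"No": False, "Yes": True}),
                    "Yes": True}),
                "Yes": ("Fri/Sat", {"No": False, "Yes": True})}),
            "10-30": ("Hungry", {
                "No": True,
                "Yes": ("Alternate", {
                    "No": True,
                    "Yes": ("Raining", {"No": False, "Yes": True})})}),
            "0-10": True}),
    })

def classify(tree, example):
    """Walk the tree: test one attribute per internal node until a leaf is reached."""
    while not isinstance(tree, bool):
        attribute, branches = tree
        tree = branches[example[attribute]]
    return tree

# e.g. classify(WILL_WAIT_TREE, {"Patrons": "Full", "WaitEstimate": "10-30", "Hungry": "No"}) -> True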
REPRESENTATION OF THE
DECISION TREE
The tree can be expressed as a conjunction of
individual implications corresponding to
the paths through the tree ending in YES
nodes.
Example: The path for a restaurant full of
patrons, with an estimated wait of 10-30
minutes, when the agent is not hungry, is
expressed by the logical sentence
∀r  Patrons(r, Full) ∧ WaitEstimate(r, 10-30)
∧ ¬Hungry(r)  ⇒  WillWait(r)
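With the nested representation from the previous sketch, the Yes paths can be enumerated mechanically. A small self-contained sketch (our own helper, demonstrated on a reduced subtree so that it runs on its own):

def yes_paths(tree, conditions=()):
    """Yield every path through the tree that ends in a Yes leaf,
    as a tuple of (attribute, value) conditions to be conjoined."""
    if isinstance(tree, bool):
        if tree:
            yield conditions
        return
    attribute, branches = tree
    for value, subtree in branches.items():
        yield from yes_paths(subtree, conditions + ((attribute, value),))

# Reduced illustration subtree: only the Patrons and Hungry tests.
subtree = ("Patrons", {"None": False, "Some": True,
                       "Full": ("Hungry", {"No": True, "Yes": False})})

for path in yes_paths(subtree):
    print(" AND ".join(f"{attr}(r, {val})" for attr, val in path) + "  =>  WillWait(r)")
# prints: Patrons(r, Some)  =>  WillWait(r)
#         Patrons(r, Full) AND Hungry(r, No)  =>  WillWait(r)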
EXPRESSIVENESS OF
DECISION TREES (1)
Decision trees cannot represent every kind of test,
because they are implicitly limited to talking
about a single object; there is no way to express
a test that refers to two or more different objects.
The decision tree language is essentially
propositional, with each attribute test being
a proposition.
The decision trees are fully expressive within
the class of propositional languages, that is,
any Boolean function can be written as a
decision tree.
EXPRESSIVENESS OF
DECISION TREES (2)
Decision trees are good for some kinds
of functions, and bad for others.
If we have n attributes, then there are 2^(2^n)
different Boolean functions.
For example, with just six Boolean attributes,
there are about 2 × 10^19 (= 2^64)
different functions to choose from.
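The count follows because a Boolean function of n Boolean attributes is a truth table with 2^n rows, each of which can be filled in two ways. A two-line check (n = 6 gives about 1.8 × 10^19, the figure quoted above):

for n in (2, 6, 10):
    print(n, 2 ** (2 ** n))   # number of distinct Boolean functions of n Boolean attributes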
How can we find a consistent hypothesis in
such a large space?
INDUCING DECISION TREES
FROM EXAMPLES (1)
An example is described by the values
of the attributes and the value of the
goal predicate.
The value of the goal predicate is called
the classification of the example.
If the goal predicate is true for some
example, it is a positive example,
otherwise it is a negative example.
INDUCING DECISION TREES FROM EXAMPLES (2)
EXAMPLE
Example  Alt  Bar  Fri  Hun  Pat   Price  Rain  Res  Type     Est    Goal (WillWait)
X1       Yes  No   No   Yes  Some  $$$    No    Yes  French   0-10   Yes
X2       Yes  No   No   Yes  Full  $      No    No   Thai     30-60  No
X3       No   Yes  No   No   Some  $      No    No   Burger   0-10   Yes
X4       Yes  No   Yes  Yes  Full  $      No    No   Thai     10-30  Yes
X5       Yes  No   Yes  No   Full  $$$    No    Yes  French   >60    No
X6       No   Yes  No   Yes  Some  $$     Yes   Yes  Italian  0-10   Yes
X7       No   Yes  No   No   None  $      Yes   No   Burger   0-10   No
X8       No   No   No   Yes  Some  $$     Yes   Yes  Thai     0-10   Yes
X9       No   Yes  Yes  No   Full  $      Yes   No   Burger   >60    No
X10      Yes  Yes  Yes  Yes  Full  $$$    No    Yes  Italian  10-30  No
X11      No   No   No   No   None  $      No    No   Thai     0-10   No
X12      Yes  Yes  Yes  Yes  Full  $      No    No   Burger   30-60  Yes
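For the sketches that follow it is convenient to have the twelve examples in machine-readable form. A direct transcription of the table into Python (the names ROWS, ATTRIBUTES and EXAMPLES are ours):

ATTRIBUTES = ["Alt", "Bar", "Fri", "Hun", "Pat", "Price", "Rain", "Res", "Type", "Est"]

# Each row: example name, the ten attribute values in the order above, then WillWait.
ROWS = [
    ("X1",  "Yes", "No",  "No",  "Yes", "Some", "$$$", "No",  "Yes", "French",  "0-10",  True),
    ("X2",  "Yes", "No",  "No",  "Yes", "Full", "$",   "No",  "No",  "Thai",    "30-60", False),
    ("X3",  "No",  "Yes", "No",  "No",  "Some", "$",   "No",  "No",  "Burger",  "0-10",  True),
    ("X4",  "Yes", "No",  "Yes", "Yes", "Full", "$",   "No",  "No",  "Thai",    "10-30", True),
    ("X5",  "Yes", "No",  "Yes", "No",  "Full", "$$$", "No",  "Yes", "French",  ">60",   False),
    ("X6",  "No",  "Yes", "No",  "Yes", "Some", "$$",  "Yes", "Yes", "Italian", "0-10",  True),
    ("X7",  "No",  "Yes", "No",  "No",  "None", "$",   "Yes", "No",  "Burger",  "0-10",  False),
    ("X8",  "No",  "No",  "No",  "Yes", "Some", "$$",  "Yes", "Yes", "Thai",    "0-10",  True),
    ("X9",  "No",  "Yes", "Yes", "No",  "Full", "$",   "Yes", "No",  "Burger",  ">60",   False),
    ("X10", "Yes", "Yes", "Yes", "Yes", "Full", "$$$", "No",  "Yes", "Italian", "10-30", False),
    ("X11", "No",  "No",  "No",  "No",  "None", "$",   "No",  "No",  "Thai",    "0-10",  False),
    ("X12", "Yes", "Yes", "Yes", "Yes", "Full", "$",   "No",  "No",  "Burger",  "30-60", True),
]

# EXAMPLES: list of (attribute-dictionary, classification) pairs.
EXAMPLES = [(dict(zip(ATTRIBUTES, row[1:-1])), row[-1]) for row in ROWS]

print(sum(1 for _, goal in EXAMPLES if goal), "positive,",
      sum(1 for _, goal in EXAMPLES if not goal), "negative")   # 6 positive, 6 negative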
CONSTRUCTION OF THE
DECISION TREE (1)
Simple solution: construct a decision tree that has
one path to a leaf for each example, where the
path tests each attribute in turn and follows the
value for the example, and the leaf has the
classification of the example.
This is a simple way to find a decision tree that
agrees with the training set of examples.
When it is given an example with the same
description again, the decision tree will come up
with the right classification.
CONSTRUCTION OF THE
DECISION TREE (2)
The problem with this trivial tree is that it just
memorizes the observations.
It does not extract any pattern from the examples,
so we cannot expect it to extrapolate to
examples it has not seen.
Extracting a pattern means being able to describe
a large number of cases in a concise way.
We should try to find a concise decision tree.
CONSTRUCTION OF THE
DECISION TREE (3)
This is an example of a general principle of
inductive learning called Ockham’s
razor:
The most likely hypothesis is the simplest
one that is consistent with all
observations (examples).
A simple hypothesis that is consistent with
the observations is more likely to be
correct than a complex one.
FINDING THE SMALLEST
DECISION TREE (1)
The basic idea:
Test the most important attribute first.
The most important is the attribute that makes
the most difference to the classification of
an example.
This way, we hope to get the correct
classification with a small number of
tests, meaning that all paths in the tree will
be short and the tree as a whole will be
small.
FINDING THE SMALLEST
DECISION TREE (2)
+: X1, X3, X4, X6, X8, X12
-: X2, X5, X7, X9, X10, X11

Type?
  French:   +: X1         -: X5
  Italian:  +: X6         -: X10
  Thai:     +: X4, X8     -: X2, X11
  Burger:   +: X3, X12    -: X7, X9
FINDING THE SMALLEST
DECISION TREE (3)
+: X1, X3, X4, X6, X8, X12
-: X2, X5, X7, X9, X10, X11

Patrons?
  None:  +: (none)            -: X7, X11
  Some:  +: X1, X3, X6, X8    -: (none)
  Full:  +: X4, X12           -: X2, X5, X9, X10
FINDING THE SMALLEST
DECISION TREE (4)
+: X1, X3, X4, X6, X8, X12
-: X2, X5, X7, X9, X10, X11

Patrons?
  None:  +: (none)            -: X7, X11           -> No
  Some:  +: X1, X3, X6, X8    -: (none)            -> Yes
  Full:  +: X4, X12           -: X2, X5, X9, X10   -> Hungry?
           Yes:  +: X4, X12   -: X2, X10
           No:   +: (none)    -: X5, X9
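The slides choose the "most important" attribute informally, by how cleanly it separates positive from negative examples. One standard way to make this precise, not covered on the slides but used by ID3-style learners, is information gain. A small check of our own using the positive/negative counts from the Type and Patrons splits shown above:

import math

def entropy(pos, neg):
    """Information content, in bits, of a node with pos positive and neg negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            result -= p * math.log2(p)
    return result

def information_gain(branches):
    """branches: list of (pos, neg) counts after splitting on one attribute."""
    pos = sum(p for p, n in branches)
    neg = sum(n for p, n in branches)
    remainder = sum((p + n) / (pos + neg) * entropy(p, n) for p, n in branches)
    return entropy(pos, neg) - remainder

# Counts read off the split diagrams above (12 examples, 6 positive / 6 negative).
print("Gain(Patrons) =", round(information_gain([(0, 2), (4, 0), (2, 4)]), 3))       # about 0.541 bits
print("Gain(Type)    =", round(information_gain([(1, 1), (1, 1), (2, 2), (2, 2)]), 3))  # 0.0 bits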
FINDING THE SMALLEST
DECISION TREE (5)
Consider all possible attributes and
find the most important one.
After the first attribute test splits up
the examples, each outcome is a
new decision tree learning
problem in itself, with fewer
examples and one fewer attribute.
FINDING THE SMALLEST
DECISION TREE (6)
FOUR CASES
1. If there are some positive and some negative
examples, then choose the best attribute to
split them.
2. If all the remaining examples are positive (or
all negative), answer YES or NO respectively.
3. If there are no examples left (no such example
has been observed), return a default value
calculated from the majority classification at
the node's parent.
FINDING THE SMALLEST
DECISION TREE (7)
4. If there are no attributes left, but both positive
and negative examples remain, these examples
have exactly the same description but different
classifications.
This happens when some of the data are incorrect (there
is noise in the data).
It also happens when the attributes do not give enough
information to fully describe the situation, or when
the domain is truly nondeterministic.
One simple way out of the problem is to use a majority
vote if no more attributes can be used.
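The four cases can be written as a short recursive function. A sketch of our own in Python; it uses an entropy-based choose_attribute so that it is self-contained, and it can be run on the EXAMPLES and ATTRIBUTES lists encoded after the examples table above.

from collections import Counter
import math

def entropy(labels):
    counts = Counter(labels)
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def choose_attribute(attributes, examples):
    """Pick the attribute whose split leaves the least remaining information (highest gain)."""
    def remainder(attr):
        groups = Counter(ex[attr] for ex, _ in examples)
        rem = 0.0
        for value, size in groups.items():
            labels = [goal for ex, goal in examples if ex[attr] == value]
            rem += size / len(examples) * entropy(labels)
        return rem
    return min(attributes, key=remainder)

def majority_value(examples):
    return Counter(goal for _, goal in examples).most_common(1)[0][0]

def decision_tree_learning(examples, attributes, default):
    if not examples:                                   # case 3: no examples left -> parent majority
        return default
    classifications = {goal for _, goal in examples}
    if len(classifications) == 1:                      # case 2: all positive or all negative
        return classifications.pop()
    if not attributes:                                 # case 4: attributes exhausted -> majority vote
        return majority_value(examples)
    best = choose_attribute(attributes, examples)      # case 1: split on the best attribute
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in {ex[best] for ex, _ in examples}:
        subset = [(ex, goal) for ex, goal in examples if ex[best] == value]
        branches[value] = decision_tree_learning(subset, remaining, majority_value(examples))
    return (best, branches)   # same (attribute, branches) representation as in the earlier sketch

# With the lists defined after the examples table:
#   tree = decision_tree_learning(EXAMPLES, ATTRIBUTES, default=False)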
FINDING THE SMALLEST
DECISION TREE (8)

The induced tree, with the positive and negative examples reaching each branch:

+: X1, X3, X4, X6, X8, X12
-: X2, X5, X7, X9, X10, X11

Patrons?
  None:  +: (none)            -: X7, X11           -> No
  Some:  +: X1, X3, X6, X8    -: (none)            -> Yes
  Full:  +: X4, X12           -: X2, X5, X9, X10   -> Hungry?
    No:   +: (none)     -: X5, X9     -> No
    Yes:  +: X4, X12    -: X2, X10    -> Type?
      French:   +: (none)   -: (none)   -> Yes
      Italian:  +: (none)   -: X10      -> No
      Thai:     +: X4       -: X2       -> Fri/Sat?
        No:   +: (none)   -: X2       -> No
        Yes:  +: X4       -: (none)   -> Yes
      Burger:   +: X12      -: (none)   -> Yes
FINDING THE SMALLEST
DECISION TREE (9)
CONCLUSIONS
The learning algorithm looks only at the examples,
not at the correct function, and in fact the
hypothesis it produces (the last decision tree
shown) not only agrees with all the examples,
but is considerably simpler than the original tree.
The learning algorithm has no reason to include
tests for RAINING and RESERVATION, because
it can classify all the examples without them.
If we were to gather more examples, we might
induce a tree more similar to the original.
ASSESSING THE PERFORMANCE OF
THE LEARNING ALGORITHM (1)
A learning algorithm is good if it produces
hypotheses that do a good job of predicting
the classifications of unseen examples.
A prediction is good if it turns out to be true,
so we can assess the quality of a hypothesis
by checking its predictions against the
correct classifications once we know them.
This is done on a test set.
ASSESSING THE PERFORMANCE OF
THE LEARNING ALGORITHM (2)
THE METHODOLOGY
1. Collect a large set of examples.
2. Divide it into two distinct sets: the training set
and the test set.
3. Use the learning algorithm with the training
set as examples to generate a hypothesis.
4. Measure the percentage of examples in the test
set that are correctly classified by the hypothesis.
5. Repeat steps 1 to 4 for different sizes of training
sets and different randomly selected training
sets of each size.
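The methodology can be turned into a small experiment loop. In the sketch below, induce stands for any learning algorithm and the data are synthetic Boolean examples, both our own choices so that the sketch runs on its own; with the restaurant data and the decision tree learner above, the loop would look the same.

import random

def make_example():
    """Synthetic data: three Boolean attributes, target = (a and b) or c."""
    a, b, c = (random.random() < 0.5 for _ in range(3))
    return ({"a": a, "b": b, "c": c}, (a and b) or c)

def induce(training_set):
    """Stand-in learner: memorise the training set; unseen inputs get the majority class."""
    table = {tuple(sorted(ex.items())): goal for ex, goal in training_set}
    majority = sum(goal for _, goal in training_set) * 2 >= len(training_set)
    return lambda ex: table.get(tuple(sorted(ex.items())), majority)

examples = [make_example() for _ in range(200)]                  # step 1: collect a large set
for size in (5, 10, 20, 50, 100):                                # step 5: repeat for several sizes
    random.shuffle(examples)
    training_set, test_set = examples[:size], examples[size:]    # step 2: disjoint training / test sets
    h = induce(training_set)                                     # step 3: learn a hypothesis
    correct = sum(h(ex) == goal for ex, goal in test_set)        # step 4: accuracy on the test set
    print(f"training size {size:3d}: {100 * correct / len(test_set):.1f}% correct on the test set")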
ASSESSING THE PERFORMANCE OF
THE LEARNING ALGORITHM (3)
The key idea of the methodology is to
keep the training and test data
separate.
The result of the application of the
methodology is a set of data that can be
processed to give the average
prediction quality as a function of the
size of the training set.
ASSESSING THE PERFORMANCE OF
THE LEARNING ALGORITHM (4)
A learning curve

[Figure: the percentage of the test set classified correctly (0 to 1) plotted
against the training set size (0 to 100 examples).]

As the training set grows, the prediction quality increases.