Learning through Interactive Behavior Specifications
Tolga Konik
CSLI, Stanford University
Douglas Pearson
Three Penny Software
John Laird
University of Michigan
1
Goal

- Automatically generate cognitive agents
- Reduce the cost of agent development
- Reduce the expertise required to develop agents
2
Domains

- Autonomous cognitive agents
- Dynamic virtual worlds
- Real-time decisions based on knowledge and sensed data
- Soar agent architecture
3
Learning by Observation

Approach:
- Observe expert behavior
- Learn to replicate it

Why?
- We may want human-like agents
- In complex domains, imitating humans may be easier than learning from scratch
4
Bottleneck in Pure Learning by Observation

PROBLEM:
- You cannot observe the internal reasoning of the expert

SOLUTION:
- Ask the expert for additional information
  - Goal annotations
- Use additional knowledge sources
  - Task & domain knowledge
5
Learning by Observation

[Diagram: the expert acts in the environment through an interface; the learner observes the percepts and actions together with the expert's goal annotations and additional task knowledge, and produces the agent.]
6
Learning by Observation

[Diagram: the learned agent interacts with the environment through the interface.]
References: ILP 2004; Machine Learning Journal (forthcoming)
7
Learning by Observation: Critic Mode

[Diagram: the agent acts in the environment through the interface; the expert critiques the agent's behavior, and the learner uses these critiques.]
8
One Body, Two Minds

[Diagram: the expert and the agent share control of one body in the environment through the same interface.]
- How and when to switch control
- How the expert and the agent program communicate
9
Diagrammatic Behavior Specification

[Diagram: Expert, Environment, Redux, Agent, Learner.]
10
Redux

- Visual rule editing
- Diagrammatic behavior specification
11
Goal Hierarchy

[Diagram: a map of rooms (r1-r4), doors (d1-d6), and items (i3, i4), shown next to the goal hierarchy: Get-item(Item), Get-item-in-room(Item), Get-item-different-room(Item), Goto-next-room, Go-to-door(D), Go-to(Door), Go-through(Door).]

Task-performance knowledge is represented with a hierarchy of durative goals.
12
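As an illustration of the hierarchy on this slide, here is a minimal Python sketch of durative goals with subgoals. The `Goal` class and the particular parent/child links (read off the slide layout) are assumptions for illustration, not the actual Redux or Soar representation.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Goal:
    """A durative goal: it stays selected until its termination condition holds."""
    name: str
    params: List[str] = field(default_factory=list)
    subgoals: List["Goal"] = field(default_factory=list)

# Hypothetical nesting, read off the slide layout.
go_to = Goal("Go-to", ["Door"])
go_through = Goal("Go-through", ["Door"])
go_to_door = Goal("Go-to-door", ["D"], [go_to, go_through])
goto_next_room = Goal("Goto-next-room", subgoals=[go_to_door])
get_item_in_room = Goal("Get-item-in-room", ["Item"])
get_item_different_room = Goal("Get-item-different-room", ["Item"], [goto_next_room])
get_item = Goal("Get-item", ["Item"], [get_item_in_room, get_item_different_room])
```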
Goal Hierarchy

[Diagram: the same map; the hierarchy instantiated with Item = i3, so Get-item(i3) and Get-item-in-room(i3) are selected.]
13
Goal Hierarchy

[Diagram: with Item = i3 and Door = d1, the instantiations Get-item(i3), Get-item-different-room(i3), and Go-to(d1) are selected.]
14
Goal Hierarchy

[Diagram: with Door = d1, the instantiations Get-item(i3), Get-item-different-room(i3), Goto-next-room, and Go-through(d1) are selected.]
15
Behavior Specification

[Figure: expert and agent views of the interface.]
- The expert draws the initial abstract situation
- Create a scenario by selecting actions
17
Goal Specification

[Figure: expert and agent views of the interface.]
- Goals are explicitly selected
- The agent contributes based on the current situation, current goal, and its knowledge
18
Goal Hierarchy

- Learning by Observation perspective
  - Unobservable mental reasoning of the expert
- Learning perspective
  - Biases the hypothesis space
  - The “learn agent” problem is reduced to “learn goal selection and termination”
- Mixed-initiative (MI) perspective
  - Information exchange between the expert and the agent
20
Relevant Knowledge Specification

[Figure: expert and agent views; the goal "Prepare food" is shown.]
- The expert can mark important objects in a decision
21
Rich Behavior Trace

- Expert-specified undesired actions and goals
- Expert-rejected actions and goals of the approximately learned agent program
[Figure: the goal "Watch TV" is shown.]
22
Rich Behavior Trace

- Hypothetical actions and goals
- Situation history: a tree structure of possible behaviors
23
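A situation history of this kind could be pictured as a small tree data structure. The sketch below is an assumption-laden illustration (field names such as `facts` and `outcomes` are invented here), not the system's actual trace format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class SituationNode:
    """One situation in the rich behavior trace."""
    facts: List[tuple]                                       # relational facts observed here
    children: Dict[str, "SituationNode"] = field(default_factory=dict)
    outcomes: Dict[str, str] = field(default_factory=dict)   # decision -> "executed", "hypothetical", or "rejected"

    def add_branch(self, decision: str, outcome: str,
                   child: Optional["SituationNode"] = None) -> None:
        """Record an executed, hypothetical, or rejected decision at this situation."""
        self.outcomes[decision] = outcome
        if child is not None:
            self.children[decision] = child

# Example: one executed action opens a new branch, one rejected goal does not.
root = SituationNode(facts=[("in-room", "agent", "r1")])
root.add_branch("Go-to(d1)", "executed", SituationNode(facts=[("at", "agent", "d1")]))
root.add_branch("Watch-TV", "rejected")
```

Each branch records whether a decision was actually executed, merely hypothetical, or rejected by the expert, which is what makes the trace "rich".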
Relational Learning by Observation

Input:
- Relational situations
- Goal and action selections and rejections
- Additional annotations (e.g., important objects)
- Background knowledge

Output:
- Rule-based agent program

Approach:
- Learn goal/action selection and termination, generalizing over multiple examples
- Inductive Logic Programming to combine rich knowledge structures
24
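To make the input side concrete, here is a minimal sketch of one decision example as data. The field names and the tuple-based fact encoding are illustrative assumptions, not the system's actual representation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Fact = Tuple[str, ...]   # e.g. ("in-room", "agent", "r1")

@dataclass
class DecisionExample:
    """One learning example extracted from the rich behavior trace."""
    situation: List[Fact]      # relational description of the world
    active_goals: List[Fact]   # goals selected when the decision was made
    decision: Fact             # the goal or action that was proposed
    selected: bool             # True if selected, False if rejected
    important_objects: List[str] = field(default_factory=list)  # expert annotations

# The learner induces selection and termination rules for each goal from
# many such examples, using ILP to bring in background knowledge.
```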
Relational Learning by Observation
25
Relational Learning by Observation
Find the common structures in the decision examples
26
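One heavily simplified way to picture "finding common structure" is to intersect the relational descriptions of the decision examples. Real ILP generalization introduces variables and uses background knowledge; the toy sketch below only keeps the predicate names and arities shared by all examples.

```python
from typing import List, Set, Tuple

Fact = Tuple[str, ...]

def shared_predicates(examples: List[Set[Fact]]) -> Set[Tuple[str, int]]:
    """Return the (predicate, arity) pairs that occur in every example.

    A toy stand-in for relational generalization: it discards the constants
    entirely instead of generalizing them into variables.
    """
    def skeleton(facts: Set[Fact]) -> Set[Tuple[str, int]]:
        return {(f[0], len(f) - 1) for f in facts}

    common = skeleton(examples[0])
    for facts in examples[1:]:
        common &= skeleton(facts)
    return common
```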
Relational Learning by Observation

Learn relations between what the agent wants, perceives, and knows:
“Select a door in the current room, which leads to a room that contains the item the agent wants to get”
27
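The quoted rule can be written out as a concrete test. Below is a hand-written Python rendering of that concept; the predicate names (`in-room`, `connects`) are assumptions for illustration, not the learned program's actual vocabulary.

```python
def propose_go_to(door, facts, wanted_item):
    """True when Go-to(door) matches the rule on the slide: the door is in
    the agent's current room and leads to a room containing the item the
    agent wants to get."""
    current_rooms = {f[2] for f in facts if f[:2] == ("in-room", "agent")}
    for f in facts:
        if f[0] == "connects" and f[1] == door:
            _, _, from_room, to_room = f
            if from_room in current_rooms and ("in-room", wanted_item, to_room) in facts:
                return True
    return False

# Example: the agent is in r1, door d1 connects r1 to r2, and item i3 is in r2.
facts = {("in-room", "agent", "r1"),
         ("connects", "d1", "r1", "r2"),
         ("in-room", "i3", "r2")}
assert propose_go_to("d1", facts, "i3")
```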
Summary

- Diagrammatic behavior specification approach
  - To extract rich behavior knowledge
- Interactive behavior specification
  - Communication medium between the agents (explicit goals and assumed situation)
- Relational learning by observation approach to combine multiple complex knowledge sources
32
Future Work

- Improve mixed-initiative interaction of the interface
- Explore domain-independent diagrammatic interface features
- Allow the expert to enter context-sensitive knowledge
33
Mixed initiative perspective

- Interactive behavior specification
- Diagrammatic representation of behavior
  - Communication medium between the agents
  - Explicit goals and desired behavior
- Facilitates interaction between the agents
34