Thrust IC: Action Selection in
Joint-Human-Robot Teams
Nick Roy (lead)
Cynthia Breazeal
Rod Grupen
MURI 8
Kickoff Meeting 2007
Task Objective

Objective: develop a robust and interactive planner that incorporates uncertainty
in the human cognitive model, world state, world dynamics, etc.

Human-robot team receives task assignment from dynamic task allocation
algorithm
Within each task, human and robot must choose actions to accomplish the task
robustly
Robots have two major decision-making tasks:
• Actions required to complete the current task
• Actions required to share information
Given perfect knowledge of the current true state of the human, choosing the
correct action may be relatively easy. But even in the presence of perfect
sensing, the state of the human-robot system cannot be known exactly.
MIT-Vanderbilt-Stanford
UW-UMASS Amherst
The Action Selection Problem

Natural human-robot collaboration leads to several challenges within teams:
• Lack of shared knowledge or spatial awareness
• Lack of common representations of knowledge and linguistic ambiguities (e.g., different vocabularies)
• Noisy signals (vision or speech recognition)

Teams must accept tasks (i.e., local objective functions) from the task assignment algorithm, leading to challenges between teams:
• Lack of shared knowledge or spatial awareness due to communication constraints
• Lack of common representation with the task assignment algorithm due to computation and communication constraints

[Figure: two speech bubbles illustrating linguistic ambiguity: "Let's go inside the bank." and "Going to the river bank..."]
Existing Technology
• Partially Observable Markov Decision Processes
• Hidden states: the human intentional state (may include situational awareness, local task, etc.)
• Observations: what the robot hears and sees
• Actions: movements and queries the robot can do
• Reward Model R(s,a)
• Transition Model T(s'|s,a)
• Observation Model O(o|s,a)
[Figure: state diagram of the human's task with the states Start, Going inside building, Helping the injured, Going back to base, and End; an observation model links the states to observed words such as "door", "help", ..., "base"]
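The model components listed above can be made concrete with a small belief-update sketch. Everything numeric here is an assumption for illustration: the three intent states, the three observed words, and the probabilities in `T` and `O` are hypothetical and not taken from the slides. The sketch also uses a single implicit "listen" action and lets the observation depend on the resulting state, which is one common convention for O.

```python
import numpy as np

# Hypothetical hidden states (human intent) and observed words.
STATES = ["going_inside_building", "helping_injured", "going_back_to_base"]
OBSERVATIONS = ["door", "help", "base"]

# T[s, s2] = T(s2 | s, a) for one implicit "listen" action; each row sums to 1.
T = np.array([
    [0.8, 0.2, 0.0],
    [0.0, 0.8, 0.2],
    [0.0, 0.0, 1.0],
])

# O[s2, o] = probability of hearing word o when the human is in state s2.
O = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.2, 0.7],
])


def belief_update(belief, obs_index, T, O):
    """Bayes filter step: predict with T, then weight by the observation likelihood."""
    predicted = belief @ T                      # sum over s of b(s) * T(s2 | s, a)
    unnormalized = predicted * O[:, obs_index]  # multiply by O(o | s2, a)
    total = unnormalized.sum()
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return unnormalized / total


# Start from a uniform belief and hear the word "base".
prior = np.full(len(STATES), 1.0 / len(STATES))
posterior = belief_update(prior, OBSERVATIONS.index("base"), T, O)
```

After hearing "base", most of the probability mass in `posterior` moves to `going_back_to_base`, but the other states keep nonzero probability, which is the uncertainty a planner would then reason about.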
Existing Technology
• POMDP tracks a belief, a probability distribution over states.
• Actions are selected based on this belief, thus taking into account uncertainty about what the human is really doing.
[Figure: belief shown as a probability histogram over the states base, help, and bldg; best action: go to the base]
• POMDPs have been used in several human-robot interaction applications, such as Roy, Pineau, and Thrun (2000) and Williams and Young (2005)
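A minimal sketch of selecting an action from a belief follows. It is a one-step greedy rule that maximizes expected immediate reward, which is a simplification: a full POMDP planner would optimize long-horizon value over beliefs. The state names follow the figure labels, while the action names and the reward values in `R` are hypothetical.

```python
import numpy as np

STATES = ["base", "help", "bldg"]
# Hypothetical actions: three task actions plus a clarifying query.
ACTIONS = ["go_to_base", "help_injured", "go_to_building", "ask_human"]

# R[s, a]: assumed rewards. The matching action earns +10, a wrong action
# costs -20, and asking the human always costs a small -1.
R = np.array([
    [ 10.0, -20.0, -20.0, -1.0],
    [-20.0,  10.0, -20.0, -1.0],
    [-20.0, -20.0,  10.0, -1.0],
])


def select_action(belief, R):
    """Return the index of the action with the highest expected immediate reward."""
    expected = belief @ R  # expected[a] = sum over s of b(s) * R(s, a)
    return int(np.argmax(expected)), expected


confident = np.array([0.8, 0.1, 0.1])
uncertain = np.array([1.0, 1.0, 1.0]) / 3.0

confident_choice = ACTIONS[select_action(confident, R)[0]]
uncertain_choice = ACTIONS[select_action(uncertain, R)[0]]
```

With these assumed numbers, the confident belief selects `go_to_base` and the uniform belief selects `ask_human`: when the robot is unsure what the human is doing, a cheap query beats a risky commitment.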
Technical Limitations
• Existing planning algorithms and models are single-agent, single-human
• Most planning is focused on a single user goal, not multiple goals and constraints
• Existing algorithms have used simple models of human intentional states and natural language
• Existing algorithms use a priori models, not learned models
Technical Advances
• Generalization of planning algorithms and models to multi-person teams
  • Technical challenge: scaling the computation to large action spaces
• Generalization of planning algorithms to multiple objectives
  • Technical challenge: identifying problem representations that allow the system to share state with a dynamic task allocation algorithm
Technical Advances
• Incorporate human intentional states and rich models of natural language
  • Technical challenge: cognitive model will give rise to exponential growth in state space, and rich natural language models will give rise to exponential growth in observation space
• Existing algorithms use a priori models, not learned models
  • Technical challenge: providing action policies to allow system to behave reasonably even when model is unknown or uncertain
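To make the exponential growth in the first technical challenge above concrete, here is a small worked count. It assumes, purely for illustration, a cognitive model built from $n$ binary features and utterances of up to $k$ words drawn from a vocabulary of $V$ words; none of these quantities come from the slides.

```latex
% State space: n binary cognitive features
|S| = 2^{n}, \qquad n = 10 \Rightarrow |S| = 1024, \qquad n = 20 \Rightarrow |S| = 1{,}048{,}576

% Observation space: word sequences of length 1 to k over V words
|\Omega| = \sum_{j=1}^{k} V^{j}, \qquad V = 100,\ k = 3 \Rightarrow |\Omega| = 100 + 10^{4} + 10^{6} = 1{,}010{,}100
```

Under these assumptions, adding ten features multiplies the number of states by 1024, and each extra word of utterance length multiplies the largest term of the observation count by $V$.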
Year 1 Milestones
• Demonstrate coupling of natural language and joint action selection with robot system
• Demonstration of interaction with human teammate in medical triage scenario
  • Human teammates instruct robot to assist with a victim in a single example task
  • Human assigns specific tasks to robot, to be performed autonomously and independently (e.g., give triage tag to another victim)
Year 2 Milestones
• Demonstrate integration of task allocation, cognitive models, state estimation and joint action selection
• Demonstration of interaction with human teammate as part of larger system
  • Human-robot team receives task from cool-zone
  • Human and robots negotiate task division
  • Robot offers to help with unspecified tasks (e.g., robot offers to fetch toolkit for teammate)
  • Robot provides additional information from remote station (e.g., warns human teammate of scenario change)
Year 3 Milestones
• Demonstrate integration of learning and joint action selection
• System evaluation
• Demonstration of human-robot training, and learning of joint-action models
  • Robot learns vocabulary, behavioral patterns, etc. of human teammates
  • Uses learned models to improve performance in the field