Integrating Deliberation in an Intelligent Agent Architecture

advertisement
From: AAAI Technical Report SS-95-04. Compilation copyright © 1995, AAAI (www.aaai.org). All rights reserved.
Integrating Deliberation in an Intelligent AgentArchitecture
R. Peter Bonasso
MobileRoboticsLaboratory
ER4, NASA-JSC
Houston, "IX 77058
Coonasso@
aio.jsc.nasa.gov)
Abstract
Wehavebeenpursuingthe integration of a state-based
planneras the deliberativetop layer of a three-tiered
robot control architecture. The middle layer is
implementedin the RAPsystem, while the bottom
layer consistsof a suite of reactiveskills whichcan be
configured by the RAPssystem into Brooksian
machines.Ourtarget environment
has beenthe ground
control center of a space station command
center. Our
current system has been shownto provide a higher
level of human
supervisionthat preservessafety while
allowingfor task level direction, reaction to out-ofnormparameters,and humaninterventionat all levels
of control. For this workshop,wediscuss the issues
we have been addressing with regard to where the
boundary
lies betweenlocal reactive control andlongrange deliberation. Suchissues include whetherthe
seat of control shouldbe at the RAPs
systemor at the
planner, interruption of the RAPssystem by the
planner due to priority changes, howmuchof the
planner’sinstruction can be taken as guidanceversus
direction, and levels of humaninteraction betweenthe
planner and the RAPssystem.
Background and Motivation
Since the late eighties we have investigated ways to
combinedeliberation and reactivity in robot control
architectures [Sanbornet ai 1989, Bonasso91, &Bonasso
et al 92], in order to programrobots to carry out tasks
robustly in field environments.Therobot control software
architectureweare usingis an outgrowth
of severallines of
situated reasoningresearchin robotintelligence [Firby 89,
Gat91, Connell 91, Slack 92, Yuet al 94, Elsaesser &
Slack94], andhas provenuseful for enablingmobilerobots
to accomplish tasks in field environments. This
architecture, whichis similar to ATLANTIS
[Gat 91] and
Cypress[Wilkins et al 94] in its concept if not in its
implementation,separates the general robot intelligence
probleminto three interactingpieces (see Figure1):
o Aset of robotic specific reactive skills. For example,
grasping, object tracking, andlocal navigation. Theseare
tightly boundto the specific hardwareof the robot and
mustinteract withthe worldin real-time.
o A sequencing capability whichcan differentially
activate the reactiveskills in orderto direct changesin the
state of the world and accomplishspecific tasks. For
example,exiting a roommightbe orchestrated throughthe
use of reactive skills for doortracking, local navigation,
grasping,andpulling. In eachof these phasesof operation,
the skills of the lower-levelare connectedto function as a
whatmightbe called a "Brooksian"robot - a collection of
networked state machines. Weare using the Reactive
ActionPackages(RAPs)system[Firby 89] for this portion
of the architecture.
o Adeliberative planning capability whichreasons in
depth about goals, resources and timing constraints. We
are using a state-based non-linear hierarchical planner
knownas AP[Elsaesser & MacMillan91], the adversarial
planner,since it grewout of researchin counterplanning.
APis a multi-agentplanner whichcan reasonabout metric
time for scheduling,monitorthe executionof its plans, and
replan accordingly.
Thesecapabilities allow a robot, for example,to accept
guidance from a humansupervisor, plan a series of
activities at various locations, moveamongthe locations
carryingout the activities, andsimuitaneously
avoiddanger
andmaintainnominalresourcelevels.
Wehavebeensuccessful in applyingthe reactive portion
of this architectureto mobileland [e.g., Bonassoet al 92]
and undersearobots [Bonasso& Barrett 93]. Recently, we
haveintegrated the plannerto control a free-flying, twoarmedmanipulatorsystemin support of the maintenanceof
NASA’s
plannedspace station. This paper discusses issues
associatedwith integrating planningwith reactive control.
Suchissues includewhetherthe seat of control shouldbe at
the RAPssystem or at the planner, interruption of the
RAPssystem by the planner due to priority changes, how
muchof the planner’sinstruction can be taken as guidance
versus direction, and levels of humaninteraction between
the planner and the RAPssystem.
Automated Robotic Maintenance of Space
Station (ARMSS)
Wehave recently been using the architecture as a
frameworkfor controlling from the ground a two-armed
manipulatorrobot maintaininga space station on orbit. The
idea is that an intelligent groundcontrolstation can enable
a ground crew to supervise the routine maintenance
activities of the robot, andthus allowthe on-orbitpersonnel
z
=
.I
ir
r
.~
COMMITMENT
ACTION
RESPONSE~
TIME
ABSTRACTION
to
~
[~
Jr
/
w
Figure1: TheSubjectIntelligent RobotControlArchitecture.
Commitment
to action: At the skill level, skills are active only for the phase of the
activity for which they are relevant. Thesequencer has a longer commitment
to the
sequence of phases, and the planner has an even longer commitment
to the overall
plan.
ResponseTime: The skills operate normally at from 10 to 30 hz. The sequencer
switches methodson the order of fractions of seconds to minutes, while the planner
operates in the minutesto hours time frame.
Abstraction: The skill level is groundedin the sensor signals and motorcommands
of the physical robot. The sequencerabstracts these primitives to routine
procedures, and the planner abstracts the proceduresto the level of the overaUgoal.
22
to concentrateon scientific missions. Weare using a 3-D
kinematic simulation of such a robot, the EVA
Helper/Retriever (EVAHR),
to investigate our planning
issues,
A subset of these issues can subsequentlybe examined
in a testhod at NASA-JSC
knownas the ARMSS
facility.
ARMSS
consists primarily of two 7 DOFRobotic Research
manipulators, each mountedand moveableon vertical
towers whichin turn can movehorizontally via a floor
gantry. Thesearmscan reach a portion of a space station
truss on whichare mounteda variety of componentsthat
can be inspected or repaired. Cameraviews are available
fromeach endeffector and fromthe towersand the floor.
Existing ARMSS
"skills" include inverse kinematic
trajectory generation, velocity and position sensing and
control, force/torquesensingand control, andratchet and
gripperconlrol.
The Maintenance Scenario
The generic scenario for EVAHR
(approximatedphysically
in the ARMSS
facility) is that eachday it mustcarry out
set of routine maintenance
tasks dictated by the status of
components
on the outside of the station. Arepair plan is
generatedby the planner(at groundcontrol). Sucha plan
mightconsist of inspections of items at various sites,
retrieval of brokencomponents
or positioningof previously
repaired items, or the simplechangeoutof malfunctioning
items. Theplan is executedby the RAPssystem(on-orbit)
and monitoredby the planner. At any point, the plan may
need to be altered by a major unplannedevent such as a
reboost of the space station to achievea neworbit or the
temporaryloss of station power,or by minorinterruptions,
such as an astronaut needinghelp on an EVA
mission.
Moreover,the planner must adjust the current plan by
taking into account what is transpiring as the plan is
executed. The RAPssystemallows the robot to carry out
most plan steps evenin the face of minorproblems,but
those problemsmust be noted by the planner whichmay
needto modifyfuture plan steps (see Dealingwith Planner
Instructionsbelow).
Implementation
Integrating the APsystemwith the RAPssystem, while not
trivial, has beenrelatively straightforwardwith regardto
knowledge representation
and timing. AP runs
asynchronouslyfromthe rest of the architecture. Having
generatedthe plan, the planner,in its executionmonitoring
mode,invokesa RAPfor executingthe first plan step. A
typical APprimitive mightbe:
(Operatorload-at-site
:purpose(item-loaded?arm?site-item)
:arguments
( (?site
(get-site-from-memory
?site-item)))
:preconditions
( (attached-to7planner7site)
:effects ( (item-loaded7arm?site-item)
(palette-mount-for7site-item 7mount)
(attached-to ?site-item ?mount)
(arms-status?plannerunfolded)
:task-timeduration-of-load-at-dock
)
Thereis oneor morereactive action packages(RAPs)
methodsto carry out this primitive in the RAPlibrary.
These are indexed via the purpose, preconditions, and
possiblythe effects clauses of the APprimitive. Atypical
RAPmight be
(define-rap(load 7arm?item)
(succeed (and (palette-mount-for?item ?mount)
(attached-to 7item?mount)))
(method normal-method
?ann?anywhere)
(context(and (arm-place
(not(=?armdock-arm))
(not(arm-holding
?ann?any))
(not(or
(and 0ust-result ?Arm?result)
(--7result MALFUNCTION))
(and (last-result gripper?arm?res)
(ffires
GRIP_MALFUNCTION))))))
(task-net
(tl (evahr-position-body
?item)(for
(12(arm-extract
?arm
?item)
(for
(t3(arm-install
?arm
?item))))
(method
broken-ann-method
(context (and (or (and (last-result ?Arm?result)
(= ?result
MALFUNCTION))
(and (last-result gripper?arm
?re,
s)
(=?res
GRIP_MALFUNCTION)))
(arm-place
?other-ann
?anywhere)
(not(or (=7other-arm?ann)
(= ?other-Arm
dock-arm)))
(not(arm-holding
7other-ann
?any))))
(task-net
(tO(disable-arm-move)
(for
(tl(disable-lincar-arm-move)
(for
(12(evahr-position-body
?item)
(for
(t3(arm-extract
7other-arm
7item)
(for
(t4(arm-install
7other-arm
7item)))))
APhas a memory
of the class structure of the worldand
someglobal items, but the majority of the posting and
reporting takes place in the RAPmemory.As this example
should show, since both APand RAPsuse a propositional
formof knowledgerepresentation, an interlingua such as
ACTsin Cypress is unecessary. So implementing the
integration from this standpoint was more of an
engineeringexercise. However,
dealing with other issues is
notso clear.
Issues Involved In Integrating Deliberation
Thefollowingis a discussionof someof the issues weare
addressing in our research beyondmakingtwo systems
communicate
with each other.
Who’s In Control?
Gat, in ATLANTIS,
put forth the notion that in reactive
agent architectures, the local reactive intelligence (the
sequencer)shouldtreat the planneras an expert of sorts,
invokingit whensomedeliberation is necessaryor whenit
is at a loss for somethingto do. This stand mayhavebeen
an outgrowthof the use of path planners whereinthere is
little if anyinteractionbetweensubtasks.Butfor achieving
conjunctive goals whichhave interacting subgoals, we
believe that the sequencerdoesn’t haveenoughinformation
to knowwhento requestsomedeliberation. Norcan it infer
whenit shouldgive up on a local task becauseof global
constraints.
A representative exampleconcernstime management.
If
the sequenceris makingprogresson a task it will continue
workingthe task eventhoughit is taking several standard
deviations moretime than is normal.Thesequencercould
be givena time limit of course, but whetherthat time limit
should be a hard one dependson what else is going on
globally. Theplannercan monitorthe time on the task and
chooseto a) allowthe task to continuebecausethe task was
started ahead of schedule due to unexpectedsuccess on
other tasks, b) abandonfuture tasks of lowerpriority in
favor of the current task, or c) dictate the abandonment
of
the currenttask in favorof higherpriority futuretasks.
Local Task Interruptions
Because safety and robustness are hallmarks of our
architecture, weare especially concernedwith the needto
interrupt local task executionwithoutleavingthe robot in
an awkwardstate. Since the RAPssystem has knowledge
of howto halt a task andgracefullysafe the robot, oneway
to interrupt a task is for the plannerto post an alert tag to
the RAPmemory.Eventually,the RAPinterpreter will note
the flag and stop the task while sating the robot. But in
someinstances -- e.g., an emergency
boost of station to a
neworbit -- such "cleanup"operationsas folding the arms
needto be omittedin orderto get the robotto a securedock
site. Therobotis an expendable
in the light of certain crew
emergencies,and the planner must recognize this and be
able to commandthe equivalent of "power downand
detach"(the equivalentof beingcast adrift) in these cases.
Dealing with Planner Instructions
In our scenarios, the planner considers each arm and
cameraas a system resource, and thus selects whicharm
and/or camerato use for each maintenancetask based on
wearandtear criteria. Duringa repair task, the selectedann
can fail in a numberof ways(our simulation allows the
interactive introductionof anomaliesduringexecution).If
the primary modeof the selected arm fails, RAPswill
simply switch to the redundant modeand complete the
current task. But the planner then needs to insure that
future plan steps do not use that arm as the primary
resource. If the selected armfails both modes,RAPswill
switch arms, but the planner will need to makemajor
adjustmentsto the plan, i.e., no two-armedtasks can be
undertaken.
But there is a moresubtle problem. The point of the
RAPsis to keepa certain amountof detail hiddenfromthe
planner so as to minimize planner computations. The
planneressentially directs that an item be loadedand that
the right armought to be used. For instance, whatff the
local situation is suchthat the selectedarmis not feasible
to use? This can occurwhena specific item is reachablefor
exampleby one armand not the other due to the positions
of itemsin its deliverypalette. So the questionbecomes
in
this case whether to include configuration space
computationsin the planner’srepetoire. Weprefer not to,
but the result can be an infeasible suggestionof resources
fromonelayer to the other.
Where to Interact With the Robot?
The architecture was designed to movethe humanfrom
teleoperation to task supervisionas one movesfromskills
to sequencingto planning. So it wouldseemthat the point
of interactionof the human
in the three-tieredsystemis the
planner. Yet, to understand wherethe robot is in its
progress, the humanmaywish to knowsomethingabout
the state of execution belowthat of the planner; for
example,whichstep in the current RAPis being executed.
A pass-through could be effected for the human-robotinteraction at the plannerlevel to querythe RAPsystemfor
this information.
Oftenhowever,a RAPwill fail to achieve the planner’s
purpose because of a mechanical situation which the
humancould remedybut not at the task level. In these
situations, the planner usually has no recourse but to
abandonthe task. Thehumanhowever,given an interface,
can execute certain low-level RAPsto help the robot
recover. An exampleis whencommunications
betweenthe
RAPsand the skills breaks downmomentarily. With an
interface at the RAPslevel, the humancan intervene and
restore the commmunications
and then restart the current
RAP. We have implemented a simple human-robot
interface at both the planningand RAPslevel to evaluate
their use.
Summary
Integrating planningwith a reactive architecture can be
greatly facilitated by similar knowledge-representations.
Wehave discussed an agent architecture instance where
this is true. However,deeper issues are involved,
particularly with environmentsthat are dynamicnot only
locally, but globallyas well. Wehaveidentified several of
these issues and we are learning more about our
architecture through implementationon morerobots and
different scenarios. Webelieve our application
environmentis an excellent forcing function for
investigating
theseissues.
Jim Sanborn,B. Bloom,and D. McGrath.A Situated
ReasoningArchitecturefor Space-basedRepair and
Replace Tasks, In 1989GoddardConferenceon Space
Applicationsof Artificial Intelligence, Greenbelt,MD.
NASAPub # 3033.
References
R. Peter Bonasso.Integrating ReactionPlans and Layered
CompetencesThroughSynchronousControl. In
Proceedingsof the 12th International Joint Conference
on Artificial Intelligence. Sydney,Australia. Morgan
Kaufman.
M. G. Slack. SequencingFormallyDef’medReactionsfor
RoboticActivity: Integrating RAPS
and GAPPS.
In
Proceedingsof the SPIEConferenceon SensorFusion,
November1992.
R. Peter Bonasso,HJ. Antonisse,M.G.Slack.. A Reactive
RobotSystemfor Find and Fetch Tasks In AnOutdoor
Environment.
In Proceedingsof the TenthNational
Conference
on Artificial Intelligence. SanJose, CA.
July 1992.
Wilkins, DavidE., Myers,KarenL., Lowrance,John D.,
and Wesley,LeonardP. Planningand Reactingin
Uncertain and DynamicEnvironments.Journal of
Experimentaland TheoreticalAI, Vol 7. 1994.
S. Yu, M.G. Slack, andD. P. Miller. Astreamlined
softwareenvironment
for situated skills. In
Proceedings of the AAIA/NASA
Conferenceon
Intelligent Robotsin Field, Factory,Serviceand
Space, March1994.
R. Peter Bonasso,R_P.& Barrett, J.A. AReactiveRobot
Systemfor Find and Visit Tasks in a DynamicOcean
Environment.
In Proceedingsof the 8th International
Symposium
on Unmanned
Untsthered Submersible
Technology,IEEEOceanographicSociety, Sep 1993.
J. H. Connell.Ahybridarchitectureappliedto robot
navigation.In Proceedings
of the IEEEInternational
Conferenceon Roboticsand Automation,April 1992.
C. Elsaesser. Representationand Algorithmsfor
Multiagent Adversarial Planning, MTR-91W000207,
The MITRE
Corporation, Dee1991.
C. Elsaesser&M.G.Slack. DeliberativePlanningin a
RobotArchitecture.In Proceedingsof the
AAIA/NASA
Conferenceon Intelligent Robotsin
Field, Factory, ServiceandSpace, March1994.
JamesR. Firby. AdaptiveExecutionin ComplexDynamic
Worlds. PHDDiss. YALEU/CSD/RR
#672, Yale
University.1989.
ErannGat. Reliable Goal-DirectedReactiveControl of
Autonomous
MobileRobots. PhDthesis, Virginia
PolytechnicInstitute Department
of Computer
Science,April 1991.
25
Download