Layered Cognitive Architectures: Where Cognitive Science Meets
Robotics
Pete Bonasso*, Mike Freed**
*NASA Johnson Space Center, TRACLabs
1012 Hercules, Houston, TX 77058, USA
r.p.bonasso@jsc.nasa.gov
**NASA Ames Research Center, Cognition Lab
Moffett Field, CA 94035, USA
mfreed@mail.arc.nasa.gov
Abstract
Although overlooked in recent years as tools for research,
cognitive software architectures are designed to bring
computational models of reasoning to bear on real-world
physical systems. This paper makes a case for using the
executives in these architectures as research tools to explore
the connection between cognitive science and intelligent
robotics.
Cognitive Architectures
Three-layer intelligent control architectures (Gat, 1998) are
by now well known and are taught in many graduate-level
AI courses (e.g., (Murphy, 2000)). Although usually presented
as frameworks for organizing software, and reported as
engineering tools for bringing AI to bear on robotic
applications (e.g., (Bonasso et al., 2003; Bonasso et al.,
1997b; Freed et al., 2004)), an important aspect of these
architectures is that they were motivated by the need to
integrate computational models of reasoning with real-world
control of physical devices. In particular, the
executive layer serves as a syntactic and semantic
differential between the representations at the top layer and
those of the continuous low-level control. Our claim is that
these executives, in combination with representative
application simulations, provide a common proving ground
for cognitive science and AI robotics research.
Figure 1 shows where the executive sits in the organization
of a three-layer cognitive architecture. Typically, deep
reasoning in the form of generative planning and scheduling
executes at the top tier, predicting the tasks required to achieve
control objectives given an initial situation, and assigning
to them available resources. The lowest tier transforms
primitive actions into continuous control and monitoring
activity for the hardware. The executive tier takes high-level
goals from the top tier (or from a human agent),
decomposes them using a library of reactive plans, and
ultimately transforms them into primitives for the bottom
tier. Control processing is closed loop at all tiers,
permitting execution monitoring and reactive re-planning
and reconfiguration in response to changes in the
environment or the robotic hardware.
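As an illustrative sketch of this tier interaction, consider the following Python fragment. All class and procedure names here are hypothetical, chosen for exposition; real executives such as RAPs and Apex are far richer:

```python
# A minimal, hypothetical sketch of a three-tier control loop.
# Assumes goals decompose to primitives via a procedure library,
# and the executive closes the loop on execution events.

PROCEDURE_LIBRARY = {
    # goal -> ordered primitive actions (the executive's reactive plans)
    "collect-sample": ["move-to-site", "extend-arm", "grab-sample"],
}

class Deliberator:
    """Top tier: predicts the tasks needed to achieve an objective."""
    def plan(self, objective):
        return ["collect-sample"]  # a one-goal 'plan', for illustration

class Executive:
    """Middle tier: decomposes goals into primitives for the bottom tier."""
    def decompose(self, goal):
        return list(PROCEDURE_LIBRARY[goal])

class Controller:
    """Bottom tier: turns a primitive into (simulated) continuous control."""
    def execute(self, primitive):
        return f"done:{primitive}"  # in reality, a sense/act process

def run(objective):
    trace = []
    deliberator, executive, controller = Deliberator(), Executive(), Controller()
    for goal in deliberator.plan(objective):
        for primitive in executive.decompose(goal):
            event = controller.execute(primitive)
            trace.append(event)
            if not event.startswith("done"):  # closed loop: react to failure
                return trace                  # (reactive replanning would go here)
    return trace

print(run("survey-site"))
```

The point of the sketch is only the division of labor: planning above, primitive execution below, and the executive mediating between them inside a closed loop.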
Gluing Cognition to Sensing and Action
The three layers correspond to what Newell termed the
neural band, the cognitive band and the rational band
(Newell, 1990).

[Figure 1: Three-layer Cognitive Architecture. Top tier,
Deliberative (planning/scheduling): event driven;
computationally intensive; large time constants. Middle
tier, Executive (task-level sequencing): converts goals to
sequences of continuous activities; interprets sensing as
events; intermediate time constants. Bottom tier, low-level
control: continuous sense/act processes; small time
constants; connects directly to the hardware.]

In Newell's theory, the cognitive band has
multiple levels of knowledge processing from symbol
access to goal attainment. And indeed, the executives in
layered cognitive architectures must perform this range of
cognitive processing in order to translate the form and
intent of the top level (or human requests) to the
continuous activity of the bottom level, as well as interpret
the sensory events from the bottom level in the context of
the goals of the top level.
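One slice of that translation, interpreting continuous sensing as discrete, goal-relevant events, can be illustrated with a small hypothetical sketch (names and thresholds are ours, not from RAPs or Apex):

```python
# Hypothetical sketch: an executive interpreting continuous position
# samples as a discrete symbolic event, in the context of the current
# goal (reaching goal_loc to within epsilon).

def interpret(samples, goal_loc, epsilon=0.5):
    """Emit an (at-loc ...) event when a sensed position comes
    within epsilon of the goal location."""
    events = []
    for t, pos in samples:
        if abs(pos - goal_loc) <= epsilon:
            events.append(("at-loc", goal_loc, t))
            break  # the goal-level event has fired; stop monitoring
    return events

# Continuous position readings (time, position) approaching goal_loc=10.0
readings = [(0, 2.0), (1, 6.5), (2, 9.7), (3, 10.0)]
print(interpret(readings, goal_loc=10.0))  # fires at t=2, within epsilon
```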
One of the first of these executives was the RAPs system
(Firby, 1989; Firby, 1999). A typical RAP procedure,
shown below, defines behavior for using a camera to
visually locate and then move to a human. The
applicability of the behavior depends on whether the
human’s location is already known, in which case the use
of the camera is not required, and on the availability of the
camera resource if needed. Here we see the tie from a
state-based goal to the sensing and acting functions of the
robot, together with a first-order conjunctive query. RAPs
is now embodied in a larger framework including a concept
memory and a language parser for use in dialogue
management (Fitzgerald and Firby, 1998). Thus, it is not
only a control framework but also a human-robot
interaction framework.
(define-rap (move-to ?agent)
  (succeed (found ?agent))
  (method use-camera
    (context (and (available ?sensor)
                  (is-a ?sensor camera)
                  (class-of ?agent human)))
    (task-net
      (sequence
        (t1 (scan-for-human ?agent ?sensor => ?loc))
        (t2 (track-human ?loc))
        (t3 (move ?loc)
            (wait-for (at-loc ?loc ?epsilon)
                      :succeed (found ?agent))
            (wait-for (lost-track) :fail)))))
  (method human-located
    ...

A newer executive, called Apex, inspired by RAPs,
focuses on achieving human-level task management in
order to support evaluation and design of human-machine
systems (Freed et al., 2003), and analysis of how newly
introduced technologies will affect human operators.
Equally, it is used to provide human-like task management
and control competence to robotic agents. An example
procedure in the Apex Procedure Description Language
(PDL) is shown below:

(procedure
  (index (shift manual-transmission to ?gear using ?hand))
  (profile ?hand)
  (step s1 (grasp stick with ?hand))
  (step s2 (determine-target-gear-position ?gear => ?position))
  (step s3 (move ?hand to ?position)
           (waitfor ?s1 ?s2))
  (step s4 (terminate) (waitfor ?s3)))

PDL syntax is similar to that of RAPs (the wait-fors are
preconditions rather than post-conditions), but Apex has
built-in algorithms for assigning and scheduling resources,
such as hands and eye gaze. As Apex executes, tasks
continuously compete for resources, depending on their
precedence (rank), importance, urgency, and cost of
interruption. In this way, Apex can model, for example,
how a human driver shifts from a standard eye scan to
focusing on an accident up ahead and then back to standard
scanning.

The power of using these frameworks for execution lies
in the fact that execution is a problem area where looking
at humans is likely to shed light on how to do AI robotics
better, and at the same time is an area where success on
the AI robotics side should translate to a better
understanding of human cognition.

More Correct Descriptions of Human Cognition

There are a number of reasons we feel that solutions to AI
problems at the executive level are likely to be more
correct descriptions of human cognition than solutions
obtained with more deliberative AI mechanisms. The first
is that the executives we have described rely on a large
body of procedural knowledge and process it in fairly
straightforward ways. In contrast, deliberative AI
mechanisms make assumptions about the kind of
knowledge employed (usually highly granular,
purpose-independent models) and the way it is processed,
e.g., computationally demanding search. We contend that
these are not very plausible as human cognitive models.

Then too, the functionality requirements of these
executives are driven by such factors as the need to exploit
regularities in the task environment; the need to cope with
uncertainty about current state, past state, future state,
action outcomes, task resource requirements, task duration,
and so on; and the time constraints imposed by the task
and/or environment. These seem likely to be prime shaping
factors in the development of human cognition. In contrast,
deliberative functionality requirements are generally driven
by the need to solve well-defined but difficult, highly
coupled problems under fairly strong assumptions of
certainty and under relatively little time pressure. It is not
clear that such capabilities are substantial drivers in the
development of human cognition, nor, for that matter, that
people are very proficient at this kind of cognition.

On a practical level, the kinds of tasks executives were
designed and used for correspond to what in humans would
be seen as commonplace and natural behaviors (for an
early reference to this notion see (Agre, 1988)). This
aspect makes an executive a good basis for studying and
modeling natural decision-making and behavior tasks. In
contrast, deliberative algorithms are designed for the kinds
of tasks people do infrequently and often need help with,
e.g., playing chess or deciding utilities among alternatives.

Furthermore, the kinds of tasks dealt with by these
executives are always grounded in the real world. So the
agent must reason about real-world phenomena such as
geometry, action and reaction, and the translation of signals
into meaning at the cognitive level. We contend that
human cognition is itself grounded in and informed by
sensory experience, something more deliberative models of
cognition tend to ignore.

This latter point is brought out in the use of the 3T
layered control architecture, which employs the RAPs
executive (Bonasso et al., 1997a). In over a decade of
using 3T in robotic and life-support applications, only
twice was the deliberative layer called into play
((Schreckenghost et al., 2002), (Bonasso et al., 2003)).
This was because in most applications a domain theory
complete enough to make use of generative planning or
sophisticated scheduling was not available. However, the
set of procedures provided by users, together with the
sensing and acting skills for the hardware, was usually
sufficient to achieve successful intelligent control.

Reference Problems
With both Apex and RAPs, simulations of reference
problems are available, either from application domains or,
in the case of Apex, provided as part of the software
system (http://human-factors.arc.nasa.gov/apex). We’ve
used these simulations to develop a dialogue management
system (Bonasso, 2002), and to study human interaction
with a variety of systems ranging from air traffic control
work stations (Freed and Remington, 1998) to automatic
teller machines (Freed et al., 2003).
As well, researchers can develop simulations to support
their own research and then test cognitive theories against
those simulations using the execution framework. In
research we are doing on advanced monitoring, we are
positing a representative problem domain (robotic
platform, environment, and task scenario) that we believe
will force us to examine key issues in the monitoring area.

[Figure 2: A remotely operated vehicle exploring a black
smoker.]
In particular, we are defining a two-armed, stereo-vision
underwater rover designed to search for deep-ocean
thermal vents (black smokers; see Figure 2), take
chemical samples and temperature profiles, and report
findings to the surface. Using Apex tools, we are building a
simulation for this task scenario in which to test Apex
models of advanced monitoring capabilities for the next
generation of intelligent robot control.
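As a hypothetical illustration of the kind of monitoring capability at issue, a simple vent-signature monitor over a simulated temperature stream might look like the following. The function name, thresholds, and data are ours for exposition; this is not part of the Apex simulation itself:

```python
# Hypothetical sketch of a monitoring task for the simulated
# underwater rover: watch a temperature profile for a thermal-vent
# signature (a sustained rise above ambient).

def detect_vent(profile, ambient=4.0, delta=20.0, sustained=2):
    """Return the index at which 'sustained' consecutive readings
    exceed ambient + delta, or None if no vent signature appears."""
    run = 0
    for i, temp in enumerate(profile):
        run = run + 1 if temp > ambient + delta else 0
        if run >= sustained:
            return i - sustained + 1  # start of the signature
    return None

# Simulated descent past a black smoker: spike in readings 3-5
profile = [4.1, 4.3, 5.0, 30.2, 55.7, 41.0, 4.4]
print(detect_vent(profile))  # signature starts at index 3
```

A real monitoring model would of course contend with noise, sensor dropouts, and competing attention demands; the sketch only fixes the shape of the problem.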
We recommend that both cognitive scientists and AI
roboticists investigate this research paradigm in order to
more closely tie together the results of the two fields.
References
Agre, P. 1988. The Dynamic Structure of Everyday Life.
MIT AI Laboratory, MIT.
Bonasso, R. P. 2002. Using Discourse to Modify Control
Procedures. Papers from the AAAI Fall Symposium
on Human Robot Interaction AAAI Press, Tech
Report FS-02-03: 9-14.
Bonasso, R. P., Firby, J. R., Gat, E., Kortenkamp, D.,
Miller, D. P., and Slack, M. G. 1997a. Experiences
with an Architecture for Intelligent, Reactive
Agents. Journal of Experimental and Theoretical
Artificial Intelligence 9: 237-256.
Bonasso, R. P., Kortenkamp, D., and Thronesbery, C.
2003. Intelligent Control of a Water Recovery
System: Three Years In The Trenches. Artificial
Intelligence 24 (1): 19-44.
Bonasso, R. P., Kortenkamp, D., and Whitney, T. 1997b.
Using a Robot Control Architecture to Automate
Shuttle Procedures. In Proceedings of Ninth
Conference on Innovative Applications of Artificial
Intelligence (IAAI), 949-956. Providence, RI.
Firby, J. R. 1989. Adaptive Execution in Complex
Dynamic Worlds. Doctoral dissertation, Department
of Computer Science, Yale University.
Firby, J. R. 1999. The RAPS Language Manual, Neodesic,
Inc., Chicago.
Fitzgerald, W. and Firby, J. R. 1998. The Dynamic
Predictive Memory Architecture: Integrating
Language with Task Execution. In Proceedings of the
IEEE Symposia on Intelligence and Systems.
Washington, DC.
Freed, M., Harris, R., and Shafto, M. 2004. Human
Interaction Challenges in UAV-Based Autonomous
Surveillance. In Proceedings of the AAAI Spring
Symposium. Stanford, CA.
Freed, M., Matessa, M., Remington, R., and Vera, A. 2003.
How Apex Automates CPM-GOMS. In Proceedings
of 5th International Conference on Cognitive
Modeling. Bamberg, Germany.
Freed, M. and Remington, R. 1998. A conceptual
framework for predicting errors in complex
human-machine environments. In Proceedings of the
1998 Meeting of the Cognitive Science Society.
Madison, Wisconsin.
Gat, E. 1998. Three-Layer Architectures. In Mobile Robots
and Artificial Intelligence, Kortenkamp, D.,
Bonasso, R. P., and Murphy, R., Eds. Menlo Park,
CA: AAAI Press, 195-210.
Murphy, R. 2000. Introduction to AI Robotics. Cambridge,
MA: MIT Press.
Newell, A. 1990. Unified Theories of Cognition.
Cambridge, MA: Harvard University Press.
Schreckenghost, D., Thronesbery, C., Bonasso, R. P.,
Kortenkamp, D., and Martin, C. E. 2002. Intelligent
Control of Life Support for Space Missions. IEEE
Intelligent Systems 17 (5): 24-31.