An Adaptive Architecture for Physical Agents

Pat Langley
Computational Learning Laboratory
Center for the Study of Language and Information
Stanford University, Stanford, California USA
http://cll.stanford.edu/

Thanks to D. Choi, K. Cummings, N. Nejati, S. Rogers, S. Sage, D. Shapiro, and J. Xuan for their contributions. This talk reports research funded by grants from DARPA IPTO and the US National Science Foundation, which are not responsible for its contents.

General Cognitive Systems

The original goal of artificial intelligence was to design and implement computational artifacts that:
- combined many cognitive abilities in an integrated system;
- exhibited the same level of intelligence as humans;
- utilized their intelligence in a general way across domains.

Instead, modern AI has divided into many subfields that care little about systems, generality, or even intelligence.

But the challenge remains, and we need far more research on general cognitive systems.

The Domain of In-City Driving

Consider driving a vehicle in a city, which requires:
- selecting routes
- obeying traffic lights
- avoiding collisions
- being polite to others
- finding addresses
- staying in the lane
- parking safely
- stopping for pedestrians
- following other vehicles
- delivering packages

These tasks range from low-level execution to high-level reasoning.

The Fragmentation of AI Research

[Figure: separate subfields for language, planning, perception, reasoning, action, and learning]

Newell’s Vision

In 1973, Allen Newell argued “You can’t play twenty questions with nature and win”. Instead, he proposed that we:
- move beyond isolated phenomena and capabilities to develop complete models of intelligent behavior;
- demonstrate our systems’ intelligence on the same range of domains and tasks as humans can handle;
- evaluate these systems in terms of generality and flexibility rather than success on a single application domain.

However, there are different paths toward achieving these goals.

An Architecture with Communicating Modules

[Figure: communicating modules for language, planning, perception, reasoning, action, and learning; labeled "software engineering / multi-agent systems"]

An Architecture with Shared Short-Term Memory

[Figure: modules for language, planning, perception, reasoning, action, and learning around shared short-term beliefs and goals; labeled "blackboard architectures"]

Architectures and Constraints

Newell’s vision for research on theories of intelligence was that:
- agent architectures should make strong theoretical assumptions about the nature of the mind;
- architectural designs should change only gradually, as new structures or processes are determined necessary;
- later design choices should be constrained heavily by earlier ones, not made independently.

A successful architecture is all about mutual constraints, and it should provide a unified theory of intelligent behavior.

He associated these aims with the idea of a cognitive architecture.

An Architecture with Shared Long-Term Memory

[Figure: modules for language, perception, planning, reasoning, action, and learning around shared short-term beliefs and goals and shared long-term memory structures; labeled "cognitive architectures"]

A Constrained Cognitive Architecture

[Figure: the same modules around shared short-term beliefs and goals and long-term memory structures]

The ICARUS Architecture

In this talk I will use one such framework, ICARUS, to illustrate the advantages of cognitive architectures.

Like previous candidates, it incorporates ideas from theories of human problem solving and reasoning.

However, ICARUS is also distinctive in its concern with:
- physical agents that operate in an external environment;
- the hierarchical structure of knowledge and its origin.

These concerns have led to different assumptions than earlier cognitive architectures like ACT-R, Soar, and Prodigy.

Theoretical Claims of ICARUS

Our recent work on ICARUS has been guided by six principles:
1. Cognition grounded in perception and action
2. Cognitive separation of categories and skills
3. Hierarchical organization of long-term memory
4. Cumulative learning of skill/concept hierarchies
5. Correspondence of long-term/short-term structures
6. Modulation of symbolic structures with numeric content

These ideas distinguish ICARUS from most other architectures.

Architectural Commitment to Memories

A cognitive architecture makes a specific commitment to:
- long-term memories that store knowledge and procedures;
- short-term memories that store beliefs and goals;
- sensori-motor memories that hold percepts and actions.

Each memory is responsible for different content that the agent uses in its activities.

ICARUS’ Memories

[Figure: Environment, Perceptual Buffer, Long-Term Conceptual Memory, Short-Term Conceptual Memory, Long-Term Skill Memory, Short-Term Goal/Skill Memory, Motor Buffer]

Architectural Commitment to Representations

For each memory, a cognitive architecture also commits to:
- the encoding of contents in that memory;
- the organization of structures within the memory;
- the connections among structures across memories.

Most cognitive architectures rely upon formalisms similar to predicate calculus that express relational content.

These build on the central assumption of AI that intelligence involves the manipulation of list structures.
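
As a rough illustration of what encoding content as list structures can look like in practice, the sketch below reads one flat percept written in the talk's s-expression style (a type, an object name, then attribute/value pairs) into a small Python structure. The function names parse_percept and to_value, and the dictionary layout, are hypothetical choices made for this example; they are not part of ICARUS.

```python
from typing import Dict, Optional, Union

Value = Optional[Union[int, float, str]]


def to_value(raw: str) -> Value:
    """Convert one token: 'nil' becomes None, numbers become int or float,
    and anything else (e.g. a street name) stays a string."""
    if raw == "nil":
        return None
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw


def parse_percept(text: str) -> Dict[str, object]:
    """Parse a flat percept such as '(self me speed 24.0 limit 25.0)'.

    Assumes no nested lists: the first token is the object type, the
    second its name, and the rest alternate attribute and value.
    """
    tokens = text.strip().strip("()").split()
    kind, name, rest = tokens[0], tokens[1], tokens[2:]
    attrs = {key: to_value(raw) for key, raw in zip(rest[0::2], rest[1::2])}
    return {"type": kind, "id": name, "attrs": attrs}


me = parse_percept("(self me speed 24.0 wheel-angle 0.02 limit 25.0 road-angle 0.06)")
seg = parse_percept("(segment g1050 street A dist -45.0 latdist nil)")
```

Here me["attrs"]["speed"] is the float 24.0, while seg["attrs"]["latdist"] is None because the percept reports nil for that attribute.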

ICARUS’ Percepts are Objects with Attributes

  (self me speed 24.0 wheel-angle 0.02 limit 25.0 road-angle 0.06)
  (segment g1059 street 2 dist -5.0 latdist 15.0)
  (segment g1050 street A dist -45.0 latdist nil)
  (segment g1049 street A dist oor latdist nil)
  (lane-line g1073 length 100.0 width 0.5 dist 35.0 angle 1.57 color white)
  (lane-line g1074 length 100.0 width 0.5 dist 15.0 angle 1.57 color white)
  (lane-line g1072 length 100.0 width 0.5 dist 25.0 angle 1.57 color yellow)
  (lane-line g1100 length 100.0 width 0.5 dist -15.0 angle 0.0 color white)
  (lane-line g1101 length 100.0 width 0.5 dist 5.0 angle 0.0 color white)
  (lane-line g1099 length 100.0 width 0.5 dist -5.0 angle 0.0 color yellow)
  (lane-line g1104 length 100.0 width 0.5 dist 5.0 angle 0.0 color white)
  (intersection g1021 street A cross 2 dist -5.0 latdist nil)
  (building g943 address 246 c1dist 43.69 c1angle -0.73 c2dist nil c2angle nil)
  (building g941 address 246 c1dist 30.10 c1angle -1.30 c2dist 43.70 c2angle -0.73)
  (building g939 address 197 c1dist 30.10 c1angle -1.30 c2dist 33.40 c2angle -2.10)
  (building g943 address 172 c1dist 33.40 c1angle -2.09 c2dist 50.39 c2angle -2.53)
  (sidewalk g975 dist 15.0 angle 0.0)
  (sidewalk g978 dist 5.0 angle 1.57)

ICARUS’ Beliefs are Relations Among Objects

  (not-on-street me g2980)
  (not-approaching-cross-street me g2980)
  (current-street me A)
  (not-delivered g2980)
  (in-leftmost-lane me g2533)
  (fast-for-right-turn me)
  (driving-in-segment me g2480 g2533)
  (steering-wheel-straight me)
  (aligned-with-lane me g2533)
  (on-right-side-of-road me)
  (buildings-on-right me g2231 g2230 g2480)
  (buildings-on-right me g2231 g2222 g2480)
  (buildings-on-right me g2231 g2211 g2480)
  (buildings-on-right me g2230 g2222 g2480)
  (buildings-on-right me g2230 g2211 g2480)
  (buildings-on-right me g2222 g2211 g2480)
  (buildings-on-left me g2366 g2480)
  (buildings-on-left me g2370 g2480)
  (currrent-building me g2222)
  (not-on-cross-street me g2980)
  (current-segment me g2480)
  (in-U-turn-lane me g2533)
  (lane-to-right me g2533)
  (fast-for-U-turn me)
  (at-speed-for-cruise me)
  (centered-in-lane me g2533)
  (in-lane me g2533)
  (in-segment me g2480)
  (increasing me g2231 g2230 g2480)
  (increasing me g2231 g2222 g2480)
  (increasing me g2231 g2211 g2480)
  (increasing me g2230 g2222 g2480)
  (increasing me g2230 g2211 g2480)
  (increasing me g2222 g2211 g2480)
  (buildings-on-left me g2368 g2480)
  (buildings-on-left me g2372 g2480)

Teleoreactive Logic Programs

ICARUS encodes long-term knowledge of three general types:
- Concepts: A set of conjunctive relational inference rules;
- Primitive skills: A set of durative STRIPS operators;
- Nonprimitive skills: A set of clauses which specify:
  - a head that indicates a goal the method achieves;
  - a set of (possibly defined) preconditions;
  - one or more ordered subskills for achieving the goal.

These teleoreactive logic programs can be executed reactively but in a goal-directed manner (Nilsson, 1994).

ICARUS Concepts for In-City Driving

  (in-segment (?self ?sg)
    :percepts  ((self ?self segment ?sg)
                (segment ?sg)))

  (aligned-with-lane (?self ?lane)
    :percepts  ((self ?self)
                (lane-line ?lane angle ?angle))
    :positives ((in-lane ?self ?lane))
    :tests     ((> ?angle -0.05)
                (< ?angle 0.05)))

  (on-street (?self ?packet)
    :percepts  ((self ?self)
                (packet ?packet street ?street)
                (segment ?sg street ?street))
    :positives ((not-delivered ?packet)
                (current-segment ?self ?sg)))

  (increasing-direction (?self)
    :percepts  ((self ?self))
    :positives ((increasing ?b1 ?b2))
    :negatives ((decreasing ?b3 ?b4)))

ICARUS Skills for In-City Driving

  (on-street-right-direction (?self ?packet)
    :percepts ((self ?self segment ?segment direction ?dir)
               (building ?landmark))
    :start    ((on-street-wrong-direction ?self ?packet))
    :ordered  ((get-in-U-turn-lane ?self)
               (prepare-for-U-turn ?self)
               (steer-for-U-turn ?self ?landmark)))

  (get-aligned-in-segment (?self ?sg)
    :percepts ((lane-line ?lane angle ?angle))
    :requires ((in-lane ?self ?lane))
    :effects  ((aligned-with-lane ?self ?lane))
    :actions  ((steer (times ?angle 2))))

  (steer-for-right-turn (?self ?int ?endsg)
    :percepts ((self ?self speed ?speed)
               (intersection ?int cross ?cross)
               (segment ?endsg street ?cross angle ?angle))
    :start    ((ready-for-right-turn ?self ?int))
    :effects  ((in-segment ?self ?endsg))
    :actions  ((times steer 2)))

Hierarchical Structure of Long-Term Memory

ICARUS organizes both concepts and skills in a hierarchical manner.

[Figure: a concept hierarchy and a skill hierarchy]

- Concepts: Each concept is defined in terms of other concepts and/or percepts.
- Skills: Each skill is defined in terms of other skills, concepts, and percepts.

Hierarchical Structure of Long-Term Memory

ICARUS interleaves its long-term memories for concepts and skills.

[Figure: the concept and skill hierarchies, with one skill and the concepts it uses highlighted]

For example, the skill highlighted here refers directly to the highlighted concepts.

Architectural Commitment to Processes

In addition, a cognitive architecture makes commitments about:
- performance processes for:
  - retrieval, matching, and selection
  - inference and problem solving
  - perception and motor control
- learning processes that:
  - generate new long-term knowledge structures
  - refine and modulate existing structures

In most cognitive architectures, performance and learning are tightly intertwined.

ICARUS’ Functional Processes

[Figure: memories (Perceptual Buffer, Long-Term Conceptual Memory, Long-Term Skill Memory, Short-Term Conceptual Memory, Short-Term Goal/Skill Memory, Motor Buffer) and the Environment, connected by the processes Perception, Conceptual Inference, Skill Retrieval, Skill Execution, Problem Solving, and Skill Learning]

The ICARUS Control Cycle

On each successive execution cycle, the ICARUS architecture:
1. places descriptions of sensed objects in the perceptual buffer;
2. infers instances of concepts implied by the current situation;
3. finds paths through the skill hierarchy from top-level goals;
4. selects one or more applicable skill paths for execution;
5. invokes the actions associated with each selected path.

Thus, ICARUS agents are examples of what Nilsson (1994) refers to as teleoreactive systems.
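
The five steps of the control cycle can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the ICARUS interpreter: the names Skill, infer, find_path, and run_cycle, the two rule functions, and the dictionary encoding of percepts are all hypothetical, concepts are plain Python functions rather than relational rules, and step 4 simply selects every applicable path.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set, Tuple

Belief = Tuple[str, ...]
Percept = Dict[str, object]
Rule = Callable[[List[Percept], Set[Belief]], Set[Belief]]


@dataclass
class Skill:
    """Hypothetical stand-in for a skill clause."""
    goal: Belief                                  # head: the goal it achieves
    start: Callable[[Set[Belief]], bool]          # start condition over beliefs
    subskills: List["Skill"] = field(default_factory=list)
    action: Optional[str] = None                  # set only on primitive skills


def infer(percepts: List[Percept], rules: List[Rule]) -> Set[Belief]:
    """Step 2: apply concept rules bottom up until no new beliefs appear."""
    beliefs: Set[Belief] = set()
    changed = True
    while changed:
        changed = False
        for rule in rules:
            new = rule(percepts, beliefs) - beliefs
            if new:
                beliefs |= new
                changed = True
    return beliefs


def find_path(skill: Skill, beliefs: Set[Belief]) -> Optional[List[Skill]]:
    """Step 3: search top down for a path ending in an applicable primitive."""
    if skill.goal in beliefs or not skill.start(beliefs):
        return None                               # already achieved, or not applicable
    if skill.action is not None:
        return [skill]
    for sub in skill.subskills:                   # ordered: first unmet subgoal
        if sub.goal in beliefs:
            continue
        rest = find_path(sub, beliefs)
        return [skill] + rest if rest else None
    return None


def run_cycle(percepts: List[Percept], rules: List[Rule],
              top_skills: List[Skill]) -> List[str]:
    """One cycle; step 1 (perception) is represented by the percepts argument."""
    beliefs = infer(percepts, rules)                        # step 2
    paths = [find_path(s, beliefs) for s in top_skills]     # step 3
    selected = [p for p in paths if p]                      # step 4 (simplified)
    return [p[-1].action for p in selected]                 # step 5


# Toy driving knowledge (assumed for illustration only).
def too_fast(percepts, beliefs):
    return {("too-fast", p["id"]) for p in percepts
            if p["type"] == "self" and p["speed"] > p["limit"]}


def at_legal_speed(percepts, beliefs):
    return {("at-legal-speed", p["id"]) for p in percepts
            if p["type"] == "self" and p["speed"] <= p["limit"]}


slow_down = Skill(goal=("at-legal-speed", "me"),
                  start=lambda b: ("too-fast", "me") in b, action="brake")
drive_legally = Skill(goal=("at-legal-speed", "me"),
                      start=lambda b: True, subskills=[slow_down])
RULES = [too_fast, at_legal_speed]

speeding = [{"type": "self", "id": "me", "speed": 30.0, "limit": 25.0}]
print(run_cycle(speeding, RULES, [drive_legally]))  # prints ['brake']
```

Because the whole cycle is recomputed from fresh percepts each time, the agent stops braking as soon as the at-legal-speed belief is inferred, which is the reactive yet goal-directed behavior the slide describes.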

Basic ICARUS Processes

ICARUS matches patterns to recognize concepts and select skills.

[Figure: a concept hierarchy and a skill hierarchy]

- Concepts are matched bottom up, starting from percepts.
- Skill paths are matched top down, starting from intentions.

ICARUS Interleaves Execution and Problem Solving

[Figure: flowchart. A Problem is passed to Reactive Execution, which draws on the Skill Hierarchy and Primitive Skills; on "impasse?", "no" continues execution and "yes" invokes Problem Solving, which yields an Executed plan]

Interleaving Reactive Control and Problem Solving

Solve(G)
  Push the goal literal G onto the empty goal stack GS.
  On each cycle,
    If the top goal G of the goal stack GS is satisfied,
    Then pop GS.
    Else if the goal stack GS does not exceed the depth limit,
      Let S be the skill instances whose heads unify with G.
      If any applicable skill paths start from an instance in S,
      Then select one of these paths and execute it.
      Else let M be the set of primitive skill instances that have not already failed in which G is an effect.
        If the set M is nonempty,
        Then select a skill instance Q from M.
             Push the start condition C of Q onto goal stack GS.
        Else if G is a complex concept with the unsatisfied subconcepts H and with satisfied subconcepts F,
          Then if there is a subconcept I in H that has not yet failed,
               Then push I onto the goal stack GS.
               Else pop G from the goal stack GS and store information about failure with G's parent.
          Else pop G from the goal stack GS.
               Store information about failure with G's parent.

This is traditional means-ends analysis, with three exceptions: (1) conjunctive goals must be defined concepts; (2) chaining occurs over both skills/operators and concepts/axioms; and (3) selected skills are executed whenever applicable.

A Successful Problem-Solving Trace

[Figure: blocks-world trace from the initial state to the goal, over blocks A, B, and C. Operators: (unstack C B), (putdown C T), (unstack B A). Literals: (clear C), (hand-empty), (unst. C B), (clear B), (on C B), (unst. B A), (ontable A T), (holding C), (clear A), (on B A), (holding B)]

Architectures as Programming Languages

Cognitive architectures come with a programming language that:
- includes a syntax linked to its representational assumptions
- inputs long-term knowledge and initial short-term elements
- provides an interpreter that runs the specified program
- incorporates tracing facilities to inspect system behavior

Such programming languages ease construction and debugging of knowledge-based systems.

For this reason, cognitive architectures support far more efficient development of software for intelligent systems.

Programming in ICARUS

The programming language associated with ICARUS comes with:
- the syntax of teleoreactive logic programs
- the ability to load and parse such programs
- an interpreter for inference, execution, planning, and learning
- a trace package that displays system behavior over time

We have used this language to develop adaptive intelligent agents in a variety of domains.

The Origin of Skill Hierarchies

ICARUS’ commitment to hierarchical organization raises a serious question about the origin of its structures.

We want mechanisms which acquire these structures in ways that:
- are consistent with knowledge of human behavior;
- operate in an incremental and cumulative manner;
- satisfy constraints imposed by other ICARUS components.

This requires some source of experience from which to create hierarchical structures.

ICARUS Learns Skills from Problem Solving

[Figure: the same flowchart as for interleaved execution and problem solving (Problem, Reactive Execution, Skill Hierarchy, Primitive Skills, "impasse?", Problem Solving, Executed plan), with Skill Learning added]

Three Questions about Skill Learning

What is the hierarchical structure of the network?
The structure is determined by the subproblems that arise in problem solving, which, because operator conditions and goals are single literals, form a semilattice.

What are the heads of the learned clauses/methods?
The head of a learned clause is the goal literal that the planner achieved for the subproblem that produced it.

What are the conditions on the learned clauses/methods?
If the subproblem involved skill chaining, they are the conditions of the first subskill clause. If the subproblem involved concept chaining, they are the subconcepts that held at the outset of the subproblem.

Constructing Skills from a Trace

[Figure: four successive builds of the blocks-world problem-solving trace, each marking one learned structure: 1 (skill chaining), 2 (skill chaining), 3 (concept chaining), 4 (skill chaining)]

Learned Skills in the Blocks World

  (clear (?C)
    :percepts ((block ?D) (block ?C))
    :start    (unstackable ?D ?C)
    :skills   ((unstack ?D ?C)))

  (clear (?B)
    :percepts ((block ?C) (block ?B))
    :start    [(on ?C ?B) (hand-empty)]
    :skills   ((unstackable ?C ?B) (unstack ?C ?B)))

  (unstackable (?C ?B)
    :percepts ((block ?B) (block ?C))
    :start    [(on ?C ?B) (hand-empty)]
    :skills   ((clear ?C) (hand-empty)))

  (hand-empty ()
    :percepts ((block ?D) (table ?T1))
    :start    (putdownable ?D ?T1)
    :skills   ((putdown ?D ?T1)))

Skill Clauses Learned for In-City Driving

  (parked (?ME ?G1152)
    :percepts ((lane-line ?G1152) (self ?ME))
    :start    ()
    :skills   ((in-rightmost-lane ?ME ?G1152)
               (stopped ?ME)))

  (in-rightmost-lane (?ME ?G1152)
    :percepts ((self ?ME) (lane-line ?G1152))
    :start    ((last-lane ?G1152))
    :skills   ((driving-in-segment ?ME ?G1101 ?G1152)))

  (driving-in-segment (?ME ?G1101 ?G1152)
    :percepts ((lane-line ?G1152) (segment ?G1101) (self ?ME))
    :start    ((steering-wheel-straight ?ME))
    :skills   ((in-lane ?ME ?G1152)
               (centered-in-lane ?ME ?G1101 ?G1152)
               (aligned-with-lane-in-segment ?ME ?G1101 ?G1152)
               (steering-wheel-straight ?ME)))

Learned Skill for Changing Lanes

We have trained the ICARUS driving agent in a cumulative manner.

First we present the system with the task of changing lanes using only primitive skills.

The architecture calls on problem solving to handle the situation and caches the solution in a new composite skill.

This skill later lets the agent change lanes in a reactive fashion.

Learned Preparation for Turning

After the ICARUS agent has mastered changing lanes, we give it the more complex task of preparing for a turn.

Again the system calls on problem solving, but this time it can use the skill already learned.

As before, the solution is stored as a new skill, this one calling on the acquired skill for changing lanes.

Learned Turning and Parking

Next we present the task of turning a corner and parking in the rightmost lane.

As before, the system calls the problem solver to handle the situation and stores additional structures in memory.

The resulting skill hierarchy can turn and park from the initial position using only reactive control.

Intellectual Precursors

ICARUS’ design has been influenced by many previous efforts:
- earlier research on integrated cognitive architectures, especially ACT, Soar, and Prodigy
- earlier frameworks for reactive control of agents
- research on belief-desire-intention (BDI) architectures
- planning/execution with hierarchical transition networks
- work on learning macro-operators and search-control rules
- previous work on cumulative structure learning

However, the framework combines and extends ideas from its various predecessors in novel ways.

Directions for Future Research

Future work on ICARUS should introduce additional methods for:
- forward chaining and mental simulation of skills;
- learning expected utilities from skill execution histories;
- learning new conceptual structures in addition to skills;
- probabilistic encoding and matching of Boolean concepts;
- flexible recognition of skills executed by other agents;
- extension of short-term memory to store episodic traces.

Taken together, these features should make ICARUS a more general and powerful cognitive architecture.

Contributions of ICARUS

ICARUS is a cognitive architecture for physical agents that:
- includes separate memories for concepts and skills;
- organizes both memories in a hierarchical fashion;
- modulates reactive execution with goal seeking;
- augments routine behavior with problem solving; and
- learns hierarchical skills in a cumulative manner.

These concerns distinguish ICARUS from other architectures, but it makes commitments along the same design dimensions.

Some Other Cognitive Architectures

- ACT
- Soar
- PRODIGY
- EPIC
- GIPS
- RCS
- CAPS
- APEX
- 3T
- CLARION
- Dynamic Memory
- Society of Mind

Concluding Remarks

We need more research on integrated intelligent systems that:
- are embedded within a unified cognitive architecture;
- incorporate modules that provide mutual constraints;
- demonstrate a wide range of intelligent behavior;
- are evaluated on multiple tasks in challenging testbeds.

If you do not yet use a cognitive architecture in your research on intelligent agents, please consider it seriously.

For more information about the ICARUS architecture, see:
http://cll.stanford.edu/research/ongoing/icarus/

End of Presentation

Aspects of Cognitive Architectures

As traditionally defined and utilized, a cognitive architecture:
- specifies the infrastructure that holds constant over domains, as opposed to knowledge, which varies;
- models behavior at the level of functional structures and processes, not the knowledge or implementation levels;
- commits to representations and organizations of knowledge and processes that operate on them;
- comes with a programming language for encoding knowledge and constructing intelligent systems.

Early candidates were cast as production system architectures, but alternatives have gradually expanded the known space.

In the Beginning . . .

[Figure: AI monobloc]

The Big Bang Theory of AI

[Figure: language, planning, perception, reasoning, action, learning]

The Task of In-City Driving

[Figure]

Transfer Effects in the Blocks World

[Figure: 20 blocks]

Learning Curves for In-City Driving

[Figure]

FreeCell Solitaire

FreeCell is a full-information card game that, in most cases, can be solved by planning; it also has a highly recursive structure.

Transfer Effects in FreeCell

[Figure: 16 cards]

An Approach to Hierarchy Learning

We have extended ICARUS to incorporate a module for means-ends problem solving that:
- decomposes complex problems into subproblems;
- relies on heuristic search to find useful decompositions.

When ICARUS cannot execute a skill because its start condition is unmet, this mechanism:
- chains backward off skills that would achieve the condition; or
- chains backward off definitions of the unsatisfied concept.

Traces of successful problem solving serve as the basis for new hierarchical structures.

Evaluation of Intelligent Systems

Experimental studies of intelligent systems have lagged behind ones for component methods because:
- they focus on more complex, multi-step behavior;
- they require more engineering to develop them;
- they rely on interaction among their components.

Together, these factors have slowed the widespread adoption of experimental evaluation.

Repositories for Intelligent Systems

Public repositories are now common among the AI subfields, and they offer clear advantages for research by:
- providing fast and cheap materials for experiments;
- supporting replication and standards for comparison.

However, they can also produce undesirable side effects by:
- focusing attention on a narrow class of problems;
- encouraging a ‘bake-off’ mentality among researchers.

To support research on intelligent systems, we need testbeds and environments designed with them in mind.

Concluding Remarks

We must also think about ways to overcome social obstacles to pursuing this research agenda:
- conference tracks on integrated systems (e.g., AAAI)
- testbeds that evaluate general intelligence (e.g., GGP)

Both RoboCup and ICDL are in excellent positions to foster more work along these lines.

I hope to see increased activity of this type at future meetings of these conferences.