Outline
• What is an Agent?
• Overview of the OAA
• Implementation
• OAA-based Applications
• Related Work
• Summary

The Open Agent Architecture™
Building communities of distributed software agents

Adam Cheyer, David Martin, Douglas Moran
Artificial Intelligence Center, SRI International
333 Ravenswood Avenue, Menlo Park, CA 94025
http://www.ai.sri.com/~oaa

SRI International, AI Center    Open Agent Architecture™    http://www.ai.sri.com/~oaa

What is an Agent?
• Mobile Agents: programs that move among computer hosts
  - Examples: Voyager, Aglets, Odyssey
• Autonomous Agents: based on planning technologies
  - Examples: Robots, Softbots
• Learning Agents: user preferences, collaborative filtering, ...
  - Examples: FireFly, MIT Media Lab
• Animated Interface Agents: avatars, chatbots, ...
  - Examples: Microsoft Agent, Julia
• Simulation-based Entities
  - Examples: ModSAF, RoboCup
• Cooperative Agents: collaboration among distributed heterogeneous components
  - Examples: OAA, KQML, FIPA

Overview of the OAA
• Definition: OAA is a framework for integrating a community of software agents in a distributed environment
• Distributed Computing Through Delegation (what, not how or who): facilitates flexible, adaptable interactions among distributed components through delegation of tasks, data requests & triggers
• User Interface: enables natural, mobile, multimodal user interfaces to distributed services

Approaches to Building Applications
• Monolithic Applications
• Object-Oriented Applications
• Distributed Object Applications
• OAA Applications (dynamic addition)
OAA's Objective
• Virtual community of dynamic services
• Adaptable to changing, evolving network resources
• Flexible interactions among components

Adaptable Interfaces
Platform-Independent Multimodal User Interfaces

Agent Types
OAA Architecture
• User Interface Agents accept multimodal input and present results
• Natural Language Agents produce requests in ICL
• Facilitator Agents receive ICL requests and coordinate multiagent execution
• App Agents wrap legacy applications
• Meta Agents apply domain knowledge to help coordinate other agents
[Diagram labels: Facilitator Agent, Registry, Interagent Communication Language, User Interface Agent, NL to ICL Agent, Application Agent, Meta Agent, Modality Agents, API, Application]

Interagent Communication Language (ICL)
• ICL: unified means of expressing all agent functionalities
• Using ICL, agents:
  - register capability specifications
  - request services of the community: perform queries, execute actions, exchange information, set triggers, manipulate data
• ICL defines both a conversation layer of requests & a logic-based content layer
• ICL delegation: description of request + advice & constraints
• ICL is platform-independent
• Support for programming languages: C, C++, Visual Basic, Java, Delphi, Prolog, Lisp

Delegation through ICL
Task Management
  oaa_Solve(TaskExpr, ParamList)
Expressions: logic-based (cf. Prolog)
Parameters: provide advice & constraints
• High-level task types: query, action, inform, ...
• Low-level: solution_limit(N), time_limit(T), parallel_ok(TF), priority(P), address(Agt), reply(Mode), block(TF), collect(Mode), ...
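The task-management call above can be illustrated with a small, self-contained sketch. This is not the OAA library: `ToyFacilitator`, `register`, `solve`, the tuple encoding of goals, and the agent names `office_db` and `mobile_db` are all hypothetical, and only two of the low-level parameters (`solution_limit` and `address`) are modelled. The point is the delegation idea: the requester states what it wants and a facilitator decides which registered agents answer.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A goal is a tuple: (functor, arg1, arg2, ...). None marks an unbound argument.
Goal = Tuple
Handler = Callable[[Goal], List[Goal]]


class ToyFacilitator:
    """Hypothetical stand-in for a Facilitator; not the real OAA API."""

    def __init__(self) -> None:
        # functor name -> list of (agent name, handler) in registration order
        self.registry: Dict[str, List[Tuple[str, Handler]]] = {}

    def register(self, agent: str, functor: str, handler: Handler) -> None:
        """Record that `agent` can solve goals whose functor is `functor`."""
        self.registry.setdefault(functor, []).append((agent, handler))

    def solve(self, goal: Goal, params: Optional[Dict[str, object]] = None) -> List[Goal]:
        """Delegate `goal` to registered providers and merge their answers.

        Supported parameters (a small subset of those on the slide):
          solution_limit: stop after this many solutions
          address:        only ask the named agent
        """
        params = params or {}
        limit = params.get("solution_limit")
        only_agent = params.get("address")
        solutions: List[Goal] = []
        for agent, handler in self.registry.get(goal[0], []):
            if only_agent is not None and agent != only_agent:
                continue
            for answer in handler(goal):
                solutions.append(answer)
                if limit is not None and len(solutions) >= limit:
                    return solutions
        return solutions


def make_directory(entries: Dict[str, List[str]]) -> Handler:
    """Build a handler that answers phone_number(Person, Number) goals."""
    def handler(goal: Goal) -> List[Goal]:
        _, person, _ = goal
        return [("phone_number", person, number) for number in entries.get(person, [])]
    return handler


facilitator = ToyFacilitator()
facilitator.register("office_db", "phone_number", make_directory({"John Bear": ["555-0100"]}))
facilitator.register("mobile_db", "phone_number", make_directory({"John Bear": ["555-0199"]}))

query = ("phone_number", "John Bear", None)
all_numbers = facilitator.solve(query)
first_only = facilitator.solve(query, {"solution_limit": 1})
mobile_only = facilitator.solve(query, {"address": "mobile_db"})
```

The requester never names a provider unless it passes `address`; adding a third directory agent changes the answers without changing the calling code, which is the flexibility the slide attributes to delegation.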
Data & Trigger Management
  oaa_AddData(DataExpr, ParamList)
  oaa_AddTrigger(Typ, Cond, Action, Ps)
Example
  oaa_Solve((manager('John Bear', M), phone_number(M, P)), [query(var(P))])

Multimodal User Interfaces
• User is a special member of the agent community
• User interfaces to distributed services, using distributed services
• Natural language translation to and from ICL
  - Multiple NL agents for different qualities (fast, robust) and languages (English, French)
• Multiagent cooperation for ambiguity resolution
  - Pen: gesture or handwriting?
  - Reference resolution: "photo of the hotel"
    - NL Agent: hotel in language context
    - Gesture Agent: hotel being pointed at
    - UI Agent: only one hotel visible
    - Database Agent: "hotel on Smith Street"
    - Discourse Agent: "the other hotel"
    - Human User: if still ambiguous, can clarify
  - Cross-modality ambiguities: Arrow + "scroll map" vs. Arrow + "show hotel"

OAA Triggers
Purpose: OAA agents can dynamically register interest in any data change, communication event, or real-world occurrence accessible by any agent.
Adding a trigger:
  oaa_AddTrigger(Type, Cond, Action, Params)
Trigger types:
  comm: on_send, on_receive message
  time: "in ten minutes", "every day at 5pm"
  data: on_change, on_remove, on_add
  task: "when mail arrives about..."
Actions: the action of a trigger may be any ICL expression solvable by the community of agents.

A Sample Text-to-Speech Agent in C

    /* Include libraries */
    #include <string.h>      /* strcmp */
    #include <libcom_tcp.h>
    #include <liboaa.h>

    /* List capabilities (set in main: C forbids a function call in a file-scope initializer) */
    ICLTerm capabilities;

    /* Define capabilities */
    ICLTerm oaa_AppDoEvent(ICLTerm Event, ICLTerm Params) {
        if (strcmp(icl_Str(Event), "play") == 0) {
            return playTTS(icl_ArgumentAsStr(Event, 2));
        } else {
            return NULL;
        }
    }

    /* Agent startup */
    int main(void) {
        capabilities = icl_TermFromStr("[play(tts, Msg)]");
        com_Connect("parent", connectionInfo);
        oaa_Register("parent", "tts", capabilities);
        oaa_MainLoop(True);
        return 0;
    }

A Sample Text-to-Speech Agent in Prolog

    % Include libraries
    :- [libcom_tcp].
    :- [liboaa].

    % List capabilities
    capabilities([solvable(play(tts, Msg), [type(procedure), callback(tts_events)], [])]).

    % Define capabilities
    tts_events(play(tts, Msg), _Params) :-
        tts_api(Msg).

    % Agent startup
    start :-
        capabilities(C),
        com_Connect(parent, ConnectionInfo),
        oaa_Register(parent, tts, C),
        oaa_MainLoop(true).

Applications built with OAA:
 1. Automated Office
 2. Unified Messaging
 3. Multimodal Maps
 4. CommandTalk
 5. ATIS-Web
 6. Spoken Dialog Summarization
 7. Agent Development Tools
 8. InfoBroker
 9. Rental Finder
10. InfoWiz Kiosk
11. Multi-Robot Control
12. MVIEWS Video Tools
13. MARVEL
14. SOLVIT
15. Surgical Training
16. Instant Collaboration
17. Crisis Response
18. WebGrader
19. Speech Translation
20-25+ ...
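As a rough illustration of the data trigger types described in the implementation slides (on_add, on_change, on_remove), here is a minimal sketch. It assumes a hypothetical in-memory `ToyDataStore` rather than the OAA library, and the trigger action is a plain Python callable, whereas in OAA the action may be any ICL expression solvable by the agent community.

```python
from typing import Any, Callable, Dict, List, Tuple

Action = Callable[[str, Any], None]


class ToyDataStore:
    """Hypothetical data store with data triggers; not the real OAA API."""

    KINDS = ("on_add", "on_change", "on_remove")

    def __init__(self) -> None:
        self.facts: Dict[str, Any] = {}
        # Registered triggers as (kind, key, action) triples.
        self.triggers: List[Tuple[str, str, Action]] = []

    def add_trigger(self, kind: str, key: str, action: Action) -> None:
        """Register interest in one kind of change to one key."""
        if kind not in self.KINDS:
            raise ValueError(f"unknown trigger kind: {kind}")
        self.triggers.append((kind, key, action))

    def _fire(self, kind: str, key: str, value: Any) -> None:
        """Run every action registered for this kind of event on this key."""
        for t_kind, t_key, action in self.triggers:
            if t_kind == kind and t_key == key:
                action(key, value)

    def set(self, key: str, value: Any) -> None:
        """Store a value, firing on_add for new keys and on_change for new values."""
        if key not in self.facts:
            kind = "on_add"
        elif self.facts[key] != value:
            kind = "on_change"
        else:
            kind = None  # value unchanged: no event
        self.facts[key] = value
        if kind is not None:
            self._fire(kind, key, value)

    def remove(self, key: str) -> None:
        """Delete a key if present, firing on_remove with its last value."""
        if key in self.facts:
            value = self.facts.pop(key)
            self._fire("on_remove", key, value)


log: List[Tuple[str, str, Any]] = []
store = ToyDataStore()
store.add_trigger("on_add", "mail_count", lambda k, v: log.append(("added", k, v)))
store.add_trigger("on_change", "mail_count", lambda k, v: log.append(("changed", k, v)))
store.add_trigger("on_remove", "mail_count", lambda k, v: log.append(("removed", k, v)))

store.set("mail_count", 1)   # fires on_add
store.set("mail_count", 1)   # same value, nothing fires
store.set("mail_count", 2)   # fires on_change
store.remove("mail_count")   # fires on_remove
```

Communication, time, and task triggers would follow the same register-then-fire pattern with a different event source; they are left out here to keep the sketch short.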
OAA-based Applications

Automated Office Application
Main Points
• Mobile access to distributed services
• Legacy applications interacting with AI technologies
• High-level tasking of agents through NL and speech
• Flexible interactions among components
• Delegated triggers

Multimodal Maps Application
Main Points
• Natural interface to distributed (web) data
• Synergistic combination of handwriting, drawing, speech, direct manipulation
• Parallel cooperation and competition among many agents
• Human & agent collaboration

Unified Messaging
Main Points
• Mobile, adaptable access to distributed services
• Integrated messaging: web, email, voice, fax
• Distributed reference resolution and media format translation
• Flexible interactions among components
• Delegated triggers

MVIEWS Application
Main Points
• Multimodal annotation of video using speech & pen
• Automated detection, tracking, and geolocation of moving objects
• Search and replay of videos indexed by multimodal and auxiliary data
• Applications: multi-sensor surveillance, Predator UAV, Olympic bombing
[Figure labels: Live and Archived Video; Interactive Map; Video browser with multimedia timeline]

InfoWiz Application
Main Points
• An information kiosk with an animated wizard who answers questions, gives tours, and helps navigate the information space
• OAA integrates SRI's speech recognition, NL, and knowledge representation with Microsoft Agent graphics and Netscape's web browser
• Soon in SRI's lobby

CommandTalk Application
A spoken language
interface to the LeatherNet military simulation and training system
Main Points
• Spoken language interface adapts to dynamic changes in the simulated world
• Advantages of speech:
  - More realistic training
  - Faster, more natural interface
• Supports Army, Navy, Marine Corps and Air Force versions of the ModSAF simulator

Agent Development Tools
• Tools are themselves implemented in OAA
• Guide the user through the process of creating an agent:
  - Definition of capabilities
  - Documentation management (publication on Web)
  - Code generation of agent template
  - Definition of NL vocabulary
  - Update of NL & speech recognition systems
  - Assembly of multiagent projects
• Runtime tool for launching and monitoring agent communities

Related Work
Distributed objects (CORBA, DCOM)
  + Object-based integration of heterogeneous components
  + Network services (e.g. security, transactions)
  + Commercial implementations exist (e.g. Iona, Visigenic)
  - Interactions primarily hard-coded (method calls)
Agent Communication Languages (KQML, FIPA)
  + Asynchronous message-passing communication, richer than the object model; facilitates parallelism
  +/- Communication acts separate from content (KIF, SL)
  - Interactions primarily hard-coded (peer-to-peer msgs)
OAA focuses on providing delegation services for flexible interactions on tasks, triggers and data mgmt
  + Research applicable to both DOBJ and ACL models
  + Bridges can be built from and to other models
  + OAA concepts could be layered on top of other models

OAA vs.
Distributed Objects (CORBA, DCOM)
Distributed objects:
• Distributed, heterogeneous
• Retrieve obj, call obj
  - interface: C++-like
  - hardcoded interactions
OAA:
• Distributed, heterogeneous
• Ask Facilitator to call service
  + interface: declarative specs
  + delegated goal & advice
• Parallel, compound goals, backtracking, constraints
• Data & trigger management

OAA vs. Agent Communication Languages (KQML, FIPA)
Agent communication languages:
• Distributed, heterogeneous
• Ask Agent Name Server or Service Broker for Addr, send msg, handle reply
  - hardcoded interactions
  +/- conversation policies
• Logic-based content (KIF, SL)
OAA:
• Distributed, heterogeneous
• Ask Facilitator to distribute and coordinate complex requests
  + parallel, compound goals, backtracking, constraints
  + tasks, triggers, data mgmt
• Logic-based content (ICL)

OAA and Scalability
Limitations:
• Facilitator is a single point of failure
• Facilitator is a bottleneck for communication
Solutions?
• Multi-Facilitator topologies
• Distribution of planning & execution functions of Facilitator + peer-to-peer communication
[Diagram labels: Facilitator (three instances), Replicated, Plan + Exe, Registry & Planner, Agent E]

OAA Characteristics
• Open: agents can be created in many languages and interface with existing systems
• Extensible: agents can be added or replaced dynamically
• Distributed: agents are spread across many computers
• Parallel: parallel execution of subtasks
• Mobile: lightweight interfaces on phone and/or PDA
• High-level: hides software and hardware dependencies
• Multimodal: handwriting, speech, gestures, and direct manipulation can be combined
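The scalability slide above only names "multi-Facilitator topologies" as a possible direction and does not describe a design. The sketch below is one possible reading, under stated assumptions: `PeerFacilitator`, `connect`, and the forwarding rule (answer locally if possible, otherwise ask peers, using a visited set to stop cycles) are all hypothetical and are not taken from OAA.

```python
from typing import Callable, Dict, List, Optional, Set

Handler = Callable[[str], List[str]]


class PeerFacilitator:
    """Hypothetical facilitator that forwards unsolved requests to its peers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.local: Dict[str, Handler] = {}       # service name -> local handler
        self.peers: List["PeerFacilitator"] = []  # directly connected facilitators

    def register(self, service: str, handler: Handler) -> None:
        """Register a service provided by an agent attached to this facilitator."""
        self.local[service] = handler

    def connect(self, other: "PeerFacilitator") -> None:
        """Link two facilitators in both directions."""
        if other not in self.peers:
            self.peers.append(other)
        if self not in other.peers:
            other.peers.append(self)

    def solve(self, service: str, arg: str, visited: Optional[Set[str]] = None) -> List[str]:
        """Answer locally when possible, otherwise forward to unvisited peers."""
        visited = visited if visited is not None else set()
        if self.name in visited:
            return []  # already asked on this request: avoid looping forever
        visited.add(self.name)
        if service in self.local:
            return self.local[service](arg)
        for peer in self.peers:
            answers = peer.solve(service, arg, visited)
            if answers:
                return answers
        return []


# Three facilitators in a ring; only the third has a provider for "shout".
fac_a, fac_b, fac_c = PeerFacilitator("A"), PeerFacilitator("B"), PeerFacilitator("C")
fac_a.connect(fac_b)
fac_b.connect(fac_c)
fac_c.connect(fac_a)
fac_c.register("shout", lambda text: [text.upper()])

forwarded = fac_a.solve("shout", "hello")
unavailable = fac_a.solve("whisper", "hello")
```

With this arrangement no single facilitator handles every request, though each one is still a point of failure for the agents attached to it; replicating the registry, as the diagram labels suggest, would be a separate step not shown here.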