RavenClaw: Dialog Management Using Hierarchical
Task Decomposition and an Expectation Agenda
Dan Bohus
Alex Rudnicky
School of Computer Science,
Carnegie Mellon University,
Pittsburgh, PA, 15213
0. Abstract
RavenClaw is a new dialog management framework developed as the successor to the Agenda architecture used in the CMU
Communicator. RavenClaw introduces a clear separation between the specification of task and discourse behaviors, and
allows rapid development of dialog management components for spoken dialog systems operating in complex,
task-oriented domains. For a new system, development effort is focused entirely on the specification of the dialog task,
while a rich set of domain-independent conversational behaviors is transparently generated by the dialog engine.
To date, RavenClaw has been applied to five different domains, allowing us to draw some preliminary conclusions about
the generality of the approach. We briefly describe our experience in developing these systems.
1. Overall design
Goals
RavenClaw = framework aimed at the rapid development of dialog managers for complex, task-oriented dialog domains
• Handle a variety of complex domains
• Easy to develop and maintain systems
• Developer focuses only on specifying the dialog task
• Dialog engine handles the rest automatically
• Architecture supports:
  • Learning (both task and discourse levels)
  • Dynamic generation of dialog tasks
  • Grounding mechanisms
RavenClaw is a 2-tier architecture (see below)
Dialog Task Specification Layer
• Captures all the domain-specific dialog (task) logic
• The system development effort is entirely focused here
Domain-independent Dialog Engine
• Manages dialog by executing the Dialog Task Specification
• Provides domain-independent conversational strategies
Figure: Key architectural details
Dialog Task Specification (sample)
    // Login agency (non-terminal node) with its subagents and concepts
    DEFINE_AGENCY(CLogin,
        IS_MAIN_TOPIC()
        DEFINE_SUBAGENTS(
            SUBAGENT(Welcome, CWelcome)
            SUBAGENT(AskRegistered, CAskRegistered)
            SUBAGENT(AskName, CAskName)
            SUBAGENT(GreetUser, CGreetUser)
        )
        DEFINE_CONCEPTS(
            STRING_USER_CONCEPT(user_name)
            BOOL_USER_CONCEPT(registered)
        )
        SUCCEEDS_WHEN(COMPLETED(GreetUser))
        PROMPT_ESTABLISH_CONTEXT("establish_context login")
    )

    // Inform agent: sends an output
    DEFINE_INFORM_AGENT(CWelcome,
        PROMPT(":non-interruptable inform welcome")
    )

    // Request agent: requests and listens for information
    DEFINE_REQUEST_AGENT(CAskRegistered,
        REQUEST_CONCEPT(registered)
        GRAMMAR_MAPPING("[Yes]>true, [No]>false")
    )

    // Request agent that is only planned once `registered` is true
    DEFINE_REQUEST_AGENT(CAskName,
        PRECONDITION(IS_TRUE(registered))
        REQUEST_CONCEPT(user_name)
        MAX_ATTEMPTS(2)
        GRAMMAR_MAPPING("[UserName]")
    )
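The sample only declares the task; executing it is the Dialog Engine's job. As a rough illustration, here is a minimal Python sketch (not RavenClaw's own macro layer) of how a stack could run the Login subtree. Everything in it is hypothetical: the `Agent` class, the `run` function and the scripted `answers` are invented for this example. It also simplifies agency completion to "no plannable child is left" rather than evaluating `SUCCEEDS_WHEN`, and it does not model `MAX_ATTEMPTS`.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Concepts = Dict[str, object]


def always(concepts: Concepts) -> bool:
    """Default precondition: the agent may always be planned."""
    return True


@dataclass
class Agent:
    """Hypothetical agent: a leaf if it has no children, otherwise an agency."""
    name: str
    children: List["Agent"] = field(default_factory=list)
    precondition: Callable[[Concepts], bool] = always
    completed: bool = False


def run(root: Agent, concepts: Concepts,
        answers: Dict[str, Concepts]) -> List[List[str]]:
    """Execute the tree with a stack; return one snapshot (top first) per step."""
    stack = [root]
    snapshots = []
    while stack:
        snapshots.append([agent.name for agent in reversed(stack)])
        top = stack[-1]
        if top.children:
            # Agency: plan the first pending child whose precondition holds.
            pending = [child for child in top.children
                       if not child.completed and child.precondition(concepts)]
            if pending:
                stack.append(pending[0])
                continue
        elif top.name in answers:
            # Leaf: scripted answers stand in for a real Input Phase.
            concepts.update(answers[top.name])
        # Completed agents are removed from the stack.
        top.completed = True
        stack.pop()
    return snapshots


def build_tree() -> Agent:
    """Assumed tree: the Login agency from the sample under a RoomLine root."""
    login = Agent("Login", children=[
        Agent("Welcome"),
        Agent("AskRegistered"),
        Agent("AskName", precondition=lambda c: c.get("registered") is True),
        Agent("GreetUser"),
    ])
    return Agent("RoomLine", children=[login])


answers = {"AskRegistered": {"registered": True},
           "AskName": {"user_name": "john doe"}}
concepts: Concepts = {}
snapshots = run(build_tree(), concepts, answers)
for step, stack in enumerate(snapshots, start=1):
    print(step, stack)
```

With these scripted answers the stack grows to `AskRegistered, Login, RoomLine` at step 5, and `AskName` is only planned after `registered` becomes true. If the scripted answer sets `registered` to false, `AskName` is never pushed.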
Dialog task tree (node labels recovered from the figure): RoomLine, Login, Welcome, AskRegistered, AskName, GreetUser, ..., GetQuery, DateTime, Location, Properties, Network, Projector, Whiteboard, GetResults, DiscussResults, Suspend
Concept labels on the tree: query, results
Rich concept representation
• Set of confidence / value pairs (e.g. John Doe / 0.46, Joe Down / 0.33)
• History of previous values
• Flags indicating grounding, availability, conveyance status, etc.
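To make the three properties above concrete, here is a minimal Python sketch of a concept that holds competing hypotheses, a history and status flags. It is an illustration only: the `Concept` class, its method names and the flag names `grounded`, `available` and `conveyed` are assumptions made for this example, not RavenClaw's actual data structures.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Hypothesis = Tuple[str, float]  # (value, confidence)


def default_flags() -> Dict[str, bool]:
    """Assumed flag names, loosely following the list above."""
    return {"grounded": False, "available": False, "conveyed": False}


@dataclass
class Concept:
    """Hypothetical concept with competing hypotheses, history and flags."""
    name: str
    hypotheses: List[Hypothesis] = field(default_factory=list)
    history: List[List[Hypothesis]] = field(default_factory=list)
    flags: Dict[str, bool] = field(default_factory=default_flags)

    def update(self, hypotheses: List[Hypothesis]) -> None:
        """Archive the current hypotheses, then store the new ones, best first."""
        if self.hypotheses:
            self.history.append(self.hypotheses)
        self.hypotheses = sorted(hypotheses, key=lambda h: h[1], reverse=True)
        self.flags["available"] = bool(self.hypotheses)
        self.flags["grounded"] = False  # a new value has not been confirmed yet

    def top(self) -> Optional[Hypothesis]:
        """Return the most confident hypothesis, or None if there is none."""
        return self.hypotheses[0] if self.hypotheses else None


user_name = Concept("user_name")
user_name.update([("Joe Down", 0.33), ("John Doe", 0.46)])
best = user_name.top()
print(best)
```

Keeping every hypothesis, rather than only the best one, is what would let grounding behaviors such as disambiguation or explicit verification work from the concept alone.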
Dialog Engine (executing the Dialog Task Specification)
Dialog Stack / Agents Execution (snapshots 1 to 5, top of stack first)
    1: RoomLine
    2: Login, RoomLine
    3: Welcome, Login, RoomLine
    4: Login, RoomLine
    5: AskRegistered, Login, RoomLine
System: Are you a registered user?
Expectation Agenda (three levels)
    Level 1: registered: [No] → false, [Yes] → true
    Level 2: registered: [No] → false, [Yes] → true
             user_name: [UserName]
    Level 3: registered: [No] → false, [Yes] → true
             user_name: [UserName]
             query.date_time: [DateTime]
             query.location: [Location]
             query.network: [Network]
             query.projector: [Projector]
             query.whiteboard: [Whiteboard]
User Input:
    User: Yes, this is John Doe
    Parse: [Yes](yes / 0.87) [UserName](john doe / 0.46)
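The figure's example can be traced in a few lines of code. Below is a minimal Python sketch of binding the parse to the three agenda levels. The data layout and the `bind` function are hypothetical. The policy used here (a slot binds at the first level that expects it) is an assumption of the sketch. The poster itself only says that inputs are matched to system expectations.

```python
from typing import Dict, List, Optional, Tuple

# grammar slot -> (concept, mapped value); None means "keep the parsed text".
Level = Dict[str, Tuple[str, Optional[object]]]

registered: Level = {"[Yes]": ("registered", True),
                     "[No]": ("registered", False)}
user_name: Level = {"[UserName]": ("user_name", None)}
query: Level = {
    "[DateTime]": ("query.date_time", None),
    "[Location]": ("query.location", None),
    "[Network]": ("query.network", None),
    "[Projector]": ("query.projector", None),
    "[Whiteboard]": ("query.whiteboard", None),
}

# The three agenda levels from the figure, level 1 first.
agenda: List[Level] = [
    {**registered},
    {**registered, **user_name},
    {**registered, **user_name, **query},
]


def bind(agenda: List[Level],
         parse: List[Tuple[str, str, float]]) -> Dict[str, tuple]:
    """Bind each parsed slot at the first agenda level that expects it."""
    bindings = {}
    for slot, text, confidence in parse:
        for depth, level in enumerate(agenda, start=1):
            if slot in level:
                concept, mapped = level[slot]
                value = text if mapped is None else mapped
                bindings[concept] = (value, confidence, depth)
                break  # slots expected at no level are simply dropped
    return bindings


# Parse of "Yes, this is John Doe" as shown in the figure.
parse = [("[Yes]", "yes", 0.87), ("[UserName]", "john doe", 0.46)]
bindings = bind(agenda, parse)
print(bindings)
```

In this sketch `registered` binds at level 1 and `user_name` at level 2. The recorded level is the kind of information a focus-shift analysis could draw on, although the sketch does not attempt that step.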
2. The Dialog Task Specification
Generics
The Dialog Task Specification = tree of dialog agents, with each agent handling the corresponding part of the dialog task
Advantages of hierarchical representation:
• Dialog task structure naturally lends itself to hierarchical description
• Ease of maintenance and design; good scalability
• Implicitly captures context in dialog
Dialog Task Agents
Fundamental Dialog Agents (on leaves)
• Inform – sends an output
• Request – requests and listens for information
• Expect – expects (listens for) information
• DomainOperation – performs domain operations (i.e. back-end calls, etc.)
Dialog Agencies (non-terminal nodes)
• Control the execution of the subsumed agents
Agent properties / functionalities:
• Execute routine
• Preconditions and triggers
• Completion criteria (successful / unsuccessful)
• Effects
• Hold concepts
3. The Dialog Engine
Domain-independent component that executes the Dialog Task Specification
Dialog flow is generated by alternating Execution Phases and Input Phases
Execution Phase
The dialog agents in the task tree are executed and generate the system’s behavior.
Dialog engine uses a stack structure to execute the agents in the task tree:
• Repeatedly execute agent on top of the stack
• When agencies execute, they plan one of their subsumed agents for execution (according to preconditions and policies)
• Completed agents are removed from the stack
• Request-type fundamental agents can interrupt an Execution Phase and solicit an Input Phase
(3-Stage) Input Phase
1. Assemble an Expectation Agenda
   Expectation Agenda models the system’s input expectation at that point in time
2. Bind values from input to concepts
   Inputs are matched to system expectations
3. Analyze focus shifts
   Establish if the focus of the conversation should be shifted in light of the recent input
… then, continue with another Execution Phase.
Conversational behaviors
The Dialog Engine automatically provides a basic set of domain-independent conversational behaviors
• Generic dialog mechanisms
  • Help, Repeat, Suspend, Start over, etc.
• Turn-taking behavior
• Grounding behaviors
  • Explicit and implicit verifications, disambiguations, context reestablishment, etc.
4. RavenClaw-based systems
LARRI [Symphony Project, CMU]
A multi-modal conversational agent that provides support for F/A-18 aircraft mechanics performing maintenance tasks:
• Guidance & information browsing domain
• Tree-based decomposition very well suited in this domain; portions of the dialog task tree are generated dynamically based on the task to be performed
Intelligent Procedure Assistant [NASA Ames]
Multi-modal system that provides assistance to astronauts on the International Space Station in the execution of procedural tasks and checklists:
• Guidance & information browsing domain
• RavenClaw interfaced in Open Agent Architecture (with Gemini inputs / output)
BusLine [Let’s Go! Project, CMU]
Information search interface to Pittsburgh bus schedules:
• Information exploration domain
• Static dialog task tree
RoomLine [CMU]
Assistance for conference room reservation and scheduling within the School of Computer Science at CMU:
• Information management domain
• Static dialog task tree
TeamTalk [11-741, CMU]
Spoken command and control for a team of robots:
• Command and control domain
• Challenges: multi-way conversations, (complex) asynchronous behaviors
• Static dialog task tree
5. Conclusions
• RavenClaw = Dialog Management framework which focuses system development effort on creating a description of the underlying dialog task
• Dialog Engine drives the dialog towards its goals, and uses generic conversational strategies to maintain dialog flow and coherence
• 5 systems built to date spanning various domains and task complexities
• RavenClaw adapted easily, indicating high versatility and good scalability properties
School of Computer Science,
Carnegie Mellon University, 2003,
Pittsburgh, PA, 15213.