RavenClaw
An improved dialog management architecture for task-oriented
spoken dialog systems
Presented by: Dan Bohus (dbohus@cs.cmu.edu)
Work by: Dan Bohus, Alex Rudnicky, Andrew Hoskins
Carnegie Mellon University, 2002
New DM Architecture: Goals
- Able to handle complex, goal-directed dialogs
  - Go beyond (information access systems and) the slot-filling paradigm
- Easy to develop and maintain systems
  - Developer focuses only on dialog task
  - Automatically ensure a minimum set of task-independent, conversational skills
- Open to learning (hopefully both at task and discourse levels)
- Open to dynamic SDS generation
- More careful, more structured code, logs, etc.: provide a robust basis for future research.
05-22-2002
RavenClaw: a new DM architecture
2
A View from far, far away
[Figure: architecture diagram with four components: Backend, Dialog Task Specification, Conversational Skills, Core. Sample labels: "SELECT * WHERE …"; "Try opening that hatch"; "Since that failed, I need you to push button B"; "Can you repeat that, please?"; "Suspend… Resume…"; "What did you just say?"]
- Let the developer focus only on the dialog task spec.:
  - Don't worry about misunderstandings, the accuracy of concepts, repeats, focus shifts, barge-ins, etc.; merely describe (program) the task, assuming perfect knowledge of the world
  - Automatically generate the conversational mechanisms
Outline
- Goals
- A view from far away
- Main ideas
  - Dialog Task Specification / Execution
  - Conversational skills
- In more detail
  - Dialog Task Specification / Execution
  - Conversational skills
Dialog Task Spec & Execution
[Figure: example dialog task tree (DTS). Root: Communicator. Children: Welcome, Login, Travel, Locals, Bye. Under Login: AskRegistered, AskName, GreetUser, GetProfile. Under Travel: Leg1, with DepartLocation and ArriveLocation.]
- Dialog Task implemented by a hierarchy of agents
  - Handle and operate based on concepts
- Execution with interleaved Input Passes
  - Execute the agents by top-down "planning"
  - Do input passes when information is required
- REMEMBER: This is just the dialog task
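The execution scheme above can be sketched roughly as a depth-first walk over the agent tree that pauses for an input pass whenever a REQUEST agent needs information. This is only an illustration: the `Agent` class, the `run` function, and the scripted `answers` are assumptions made for the example, not the actual RavenClaw classes.

```python
from typing import Callable, Dict, List, Optional, Tuple


class Agent:
    def __init__(self, name: str, kind: str = "AGENCY",
                 concept: Optional[str] = None,
                 children: Optional[List["Agent"]] = None):
        self.name = name
        self.kind = kind            # AGENCY, REQUEST or INFORM in this sketch
        self.concept = concept      # concept a REQUEST agent needs filled
        self.children = children or []


def run(root: Agent,
        input_pass: Callable[[str], str]) -> Tuple[List[str], Dict[str, str]]:
    """Execute agents top-down; pause for an input pass at each REQUEST."""
    concepts: Dict[str, str] = {}
    trace: List[str] = []
    stack = [root]
    while stack:
        agent = stack.pop()
        trace.append(agent.name)
        if agent.kind == "REQUEST" and agent.concept not in concepts:
            # Information is required here, so an input pass is interleaved.
            concepts[agent.concept] = input_pass(agent.concept)
        # Push children reversed so the leftmost child is executed first.
        stack.extend(reversed(agent.children))
    return trace, concepts


login = Agent("Login", children=[
    Agent("AskRegistered", "REQUEST", "registered"),
    Agent("AskName", "REQUEST", "name"),
    Agent("GreetUser", "INFORM"),
])
root = Agent("Communicator", children=[Agent("Welcome", "INFORM"), login])

# A scripted "user" stands in for recognition and parsing (hypothetical values).
answers = {"registered": "no", "name": "Dan"}
trace, concepts = run(root, lambda concept: answers[concept])
```

Preconditions and focus shifts are deliberately left out here; the sketch only shows how input passes interleave with top-down execution.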
Handling inputs
[Figure: the Communicator dialog task tree (Welcome, Login, Travel, Locals, Bye; AskRegistered, AskName, GreetUser, GetProfile; Leg1, DepartLocation, ArriveLocation)]
- Input Pass
  - Assemble an agenda of expectations (open concepts)
  - Bind values from the input to the concepts
  - Process non-understandings (if any), analyze need for focus shifts
  - Continue execution
Conversational Skills / Mechanisms
- A lot of problems in SDS are caused by a lack of conversational skills. "It's all in the little details!"
  - Dealing with misunderstandings
  - Generic channel/dialog mechanisms: repeats, focus shift, context establishment, help, start over, etc.
  - Timing
- Even when these mechanisms are in, they lack uniformity & consistency.
- Development and maintenance are time consuming.
Conversational Skills / Mechanisms
- The core takes care of these by dynamically inserting appropriate agencies in the task tree
- A list of (more or less) task-independent mechanisms:
  - Implicit/Explicit Confirmations, Clarifications, Disambiguation = the whole Misunderstandings problem
  - Context reestablishment
  - Timeout and Barge-in control
  - Back-channel absorption
  - Generic dialog mechanisms: Repeat, Suspend… Resume, Help, Start over, Summarize, Undo, Querying the system's belief
Outline
- Goals
- A view from far away
- Main ideas
  - Dialog Task Specification / Execution
  - Conversational skills
- In more detail
  - Dialog Task Specification / Execution
  - Conversational skills
Dialog Task Specification
- Goal: able to handle complex domains, beyond information access, frame-based, slot-filling systems, e.g.:
  - Symphony, Intelligent checklists, Navigation, Route planning
- We need a powerful enough formalism to describe all these tasks:
  - C++ code?
  - Declarative would be nice … but is it powerful enough?
  - Templatized C++ code …?
Dialog Task Specification
- Tree of predefined agent types:
  - Inform, Request, Expect, Execute
- Each agent has:
  - A set of concepts
  - Preconditions
  - Success Criteria
  - Effects
  - Focus Criteria (triggers)
- Concepts
  - Data, Type (basic, struct, array)
  - Confidence/Value, Availability, Ambiguousness, Groundedness, System/User, TurnAcquired, TurnConveyed, etc…
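As a rough illustration of the concept attributes listed above, a concept could be modeled as a set of value hypotheses with confidences plus some bookkeeping fields. The class and field names below are assumptions chosen to mirror the slide; they are not RavenClaw's actual C++ classes.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Concept:
    """Hypothetical concept record; field names mirror the slide, not real code."""
    name: str
    # Competing value hypotheses mapped to confidence scores in [0, 1].
    hypotheses: Dict[str, float] = field(default_factory=dict)
    grounded: bool = False               # Groundedness
    source: Optional[str] = None         # System/User: who supplied the value
    turn_acquired: Optional[int] = None  # TurnAcquired
    turn_conveyed: Optional[int] = None  # TurnConveyed

    def is_available(self) -> bool:
        """Availability: at least one value hypothesis exists."""
        return bool(self.hypotheses)

    def is_ambiguous(self) -> bool:
        """Ambiguousness: more than one competing hypothesis."""
        return len(self.hypotheses) > 1

    def top_value(self) -> Optional[str]:
        """Most confident value, or None when nothing has been bound."""
        if not self.hypotheses:
            return None
        return max(self.hypotheses, key=self.hypotheses.get)


# Example values borrowed from the DepartureCity example later in the deck.
departure = Concept("DepartureCity", {"Seattle": 0.71, "SF": 0.29},
                    source="user", turn_acquired=3)
```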
An example DTS
UserLogin: AGENCY
    concepts: registered(BOOL), name(STRING), id(STRING),
              profile(PROFILE), profile_found(BOOL)
    achieves_when: profile || InformProfileNotFound

    AskRegistered: REQUEST(registered)
        grammar: {[yes]->true, [no]->false, [guest]->false}
    AskName: REQUEST(name)
        precond: registered==no
        grammar: [user_name]
        max_attempts: 2
    InformGreetUser: INFORM
        precond: name
    AskID: REQUEST(id)
        precond: registered==yes
        mapping: [user_id]
    DoProfileRetrieval: EXECUTE
        precond: name || id
        call: ABEProfile.Call >name, >id, <profile, <profile_found
    InformProfileNotFound: INFORM
        precond: !profile_found
Given that the baseline is 259 lines of C++ code, this is pretty good.
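To make the preconditions in this specification concrete, here is one possible transcription into plain data, with a helper that lists which agents are eligible for a given concept state. The `Agent` record and `runnable` helper are illustrative assumptions; in particular, `registered==no` is read as the boolean `False`, and `!profile_found` is read as "profile_found is known to be false".

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Concepts = Dict[str, object]   # concept name -> value; missing key = unavailable


@dataclass
class Agent:
    name: str
    kind: str                                   # REQUEST, INFORM or EXECUTE
    concept: Optional[str] = None               # concept a REQUEST fills
    precond: Callable[[Concepts], bool] = lambda c: True


# Transcription of the UserLogin agency from the slide.
USER_LOGIN: List[Agent] = [
    Agent("AskRegistered", "REQUEST", "registered"),
    Agent("AskName", "REQUEST", "name",
          precond=lambda c: c.get("registered") is False),
    Agent("InformGreetUser", "INFORM",
          precond=lambda c: "name" in c),
    Agent("AskID", "REQUEST", "id",
          precond=lambda c: c.get("registered") is True),
    Agent("DoProfileRetrieval", "EXECUTE",
          precond=lambda c: "name" in c or "id" in c),
    Agent("InformProfileNotFound", "INFORM",
          precond=lambda c: c.get("profile_found") is False),
]


def runnable(agents: List[Agent], concepts: Concepts) -> List[str]:
    """Names of agents whose preconditions hold (completion is not tracked)."""
    return [a.name for a in agents if a.precond(concepts)]
```

For a guest user, `runnable(USER_LOGIN, {"registered": False})` offers `AskRegistered` and `AskName`, while `AskID` stays blocked.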
Can a formalism cut it?
- People have repeatedly tried formalizing dialog … and failed ☹
  - We're focusing only on the task (like in robotics/execution)
- Actually, these agents are all C++ classes, so we can back off to code; the hope is that most of the behaviors can be easily expressed as above.
DTS execution
- Agency.Execute() decides which subagent is executed next, based on preconditions
- Various simple policies can be implemented
  - Left-to-right (open/closed), choice, etc.
  - But free to do more sophisticated things (MDPs, etc.) ~ learning at the task level
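A minimal sketch of this idea treats the selection policy as a pluggable function, so that a simple left-to-right rule could later be swapped for something learned. The class names, the `left_to_right` function, and the precondition placed on `Travel` are assumptions for illustration only.

```python
from typing import Callable, Dict, List, Optional

Concepts = Dict[str, object]


class SubAgent:
    def __init__(self, name: str,
                 precond: Callable[[Concepts], bool] = lambda c: True):
        self.name = name
        self.precond = precond
        self.completed = False


def left_to_right(subagents: List[SubAgent],
                  concepts: Concepts) -> Optional[SubAgent]:
    """Pick the first not-yet-completed subagent whose precondition holds."""
    for agent in subagents:
        if not agent.completed and agent.precond(concepts):
            return agent
    return None


class Agency:
    def __init__(self, subagents: List[SubAgent], policy=left_to_right):
        self.subagents = subagents
        self.policy = policy   # swappable: a learned policy could go here

    def execute(self, concepts: Concepts) -> Optional[str]:
        """Choose the next subagent, mark it done, and return its name."""
        chosen = self.policy(self.subagents, concepts)
        if chosen is None:
            return None
        chosen.completed = True
        return chosen.name


# Hypothetical precondition: Travel only becomes eligible once a profile exists.
root = Agency([
    SubAgent("Welcome"),
    SubAgent("Login"),
    SubAgent("Travel", precond=lambda c: "profile" in c),
    SubAgent("Bye"),
])
state: Concepts = {}
first = root.execute(state)
second = root.execute(state)
state["profile"] = "guest"
third = root.execute(state)
```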
Libraries of DTS agencies?
- Provide a library of "common task" and "common discourse" agencies
  - Frame agency
  - List browse agency
  - Choose agency
  - Disambiguate agency, Ground agency, …
  - Etc.
Input Pass
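1. Construct an agenda of expectations
  - (Partially?) ordered list of concepts expected by the system
[Figure: the dialog task tree (abbreviated node labels: Co, Welcome, Login, Travel, Locals, Bye, Regist., Nam, Greet, Prof., Leg1, Dep, Arr) with the focused agent marked "Focused", next to the resulting agenda: [DepartureCity], [ArrivalCity], [Name], [Registered], [Hotel], [Bye]]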
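One way to picture agenda construction is to start at the focused agent and widen the scope one level at a time up to the root, so that expectations closest to the focus come first. The sketch below assumes this ordering rule, and the assignment of concepts to agents (for example `Hotel` under `Locals`) is a guess made for the example; the slides do not spell out either.

```python
from typing import List, Optional


class Node:
    def __init__(self, name: str, expects: Optional[List[str]] = None,
                 children: Optional[List["Node"]] = None):
        self.name = name
        self.expects = expects or []
        self.children = children or []
        self.parent: Optional["Node"] = None
        for child in self.children:
            child.parent = self


def subtree_expectations(node: Node) -> List[str]:
    """Depth-first list of concepts expected by node and its descendants."""
    found = list(node.expects)
    for child in node.children:
        found.extend(subtree_expectations(child))
    return found


def build_agenda(focused: Node) -> List[str]:
    """Walk from the focused agent to the root, widening the scope each step."""
    agenda: List[str] = []
    node: Optional[Node] = focused
    while node is not None:
        for concept in subtree_expectations(node):
            if concept not in agenda:
                agenda.append(concept)
        node = node.parent
    return agenda


depart = Node("DepartLocation", ["DepartureCity"])
arrive = Node("ArriveLocation", ["ArrivalCity"])
root = Node("Communicator", children=[
    Node("Welcome"),
    Node("Login", children=[Node("AskRegistered", ["Registered"]),
                            Node("AskName", ["Name"])]),
    Node("Travel", children=[Node("Leg1", children=[depart, arrive])]),
    Node("Locals", ["Hotel"]),
    Node("Bye", ["Bye"]),
])
agenda = build_agenda(depart)
```

With the focus on `DepartLocation`, the departure and arrival cities lead the agenda and the remaining concepts follow in tree order.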
Input Pass (continued)
2. Bind values/confidences to concepts
  - The System <> Mixed Initiative spectrum can be expressed in terms of the way the agenda is constructed and the binding policies, independent of the task
[Figure: the user utterance "I'm flying to San Francisco and I need a hotel there." bound against the agenda [DepartureCity], [ArrivalCity], [Name], [Registered], [Hotel], [Bye]]
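The initiative spectrum mentioned above can be illustrated with a binding policy that only looks at the first few expectations on the agenda: a depth of one behaves like system initiative, while the full agenda behaves like mixed initiative. The `max_depth` parameter, the parsed slots, and their confidence values are all hypothetical.

```python
from typing import Dict, List, Tuple

Slot = Tuple[str, float]   # (value, confidence)


def bind(agenda: List[str], parsed: Dict[str, Slot],
         max_depth: int) -> Dict[str, Slot]:
    """Bind parsed slots to the first max_depth expectations on the agenda."""
    open_concepts = agenda[:max_depth]
    return {name: parsed[name] for name in open_concepts if name in parsed}


agenda = ["DepartureCity", "ArrivalCity", "Name", "Registered", "Hotel", "Bye"]

# Hypothetical parse of "I'm flying to San Francisco and I need a hotel there."
parsed = {"ArrivalCity": ("San Francisco", 0.8), "Hotel": ("yes", 0.7)}

system_initiative = bind(agenda, parsed, max_depth=1)
mixed_initiative = bind(agenda, parsed, max_depth=len(agenda))
```

Under the strict policy nothing binds, because the system only expected a departure city; under the open policy both the arrival city and the hotel request are picked up.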
Input Pass (continued)
3. Process non-understandings (if any): try to detect the source and inform the user:
  - Channel (SNR, clipping)
  - Decoding (confidence score, prosody)
  - Parsing (parsing scores)
  - Dialog level (parse OK, but no expectation match)
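A simple way to picture this diagnosis is a cascade that checks each stage in pipeline order and blames the first one that looks bad. The evidence fields and every threshold below are illustrative placeholders, not values from the actual system.

```python
from dataclasses import dataclass


@dataclass
class InputEvidence:
    snr_db: float            # channel: signal-to-noise ratio
    clipped: bool            # channel: audio clipping detected
    asr_confidence: float    # decoding: recognizer confidence in [0, 1]
    parse_score: float       # parsing: parser score in [0, 1]
    matched_agenda: bool     # dialog level: did any slot match an expectation?


def diagnose(ev: InputEvidence) -> str:
    """Return the earliest pipeline stage that looks responsible."""
    # All thresholds below are illustrative placeholders, not tuned values.
    if ev.clipped or ev.snr_db < 10.0:
        return "channel"
    if ev.asr_confidence < 0.4:
        return "decoding"
    if ev.parse_score < 0.5:
        return "parsing"
    if not ev.matched_agenda:
        return "dialog"
    return "understood"


noisy = InputEvidence(snr_db=4.0, clipped=False, asr_confidence=0.9,
                      parse_score=0.9, matched_agenda=True)
off_topic = InputEvidence(snr_db=25.0, clipped=False, asr_confidence=0.9,
                          parse_score=0.9, matched_agenda=False)
```

The returned label could then select a prompt, for instance asking the user to speak closer to the microphone for a channel problem.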
Input Pass
4. Focus shifts
  - Focus shifts seem to be task dependent. The decision to shift focus is taken by the task (DTS)
  - But they also have a task-independent (TI) side (sub-dialog size, context reestablishment). Context reestablishment is handled automatically, in the Core (see later)
Outline
- Goals
- A view from far away
- Main ideas
  - Dialog Task Specification / Execution
  - Conversational skills
- In more detail
  - Dialog Task Specification / Execution
  - Conversational skills
Task-Independent, Conversational Mechanisms
- Should be transparently handled by the core
  - However, the developer should be able to write his own customized mechanisms if needed
- Most cases handled by inserting extra "discourse" agents on the fly in the dialog task tree
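Inserting a discourse agent on the fly amounts to a small tree edit at run time. The sketch below shows one possible form of it; the `TaskNode` class, the `insert_before` helper, and the `ConfirmDepartLocation` agent name are assumptions for illustration.

```python
from typing import List, Optional


class TaskNode:
    def __init__(self, name: str,
                 children: Optional[List["TaskNode"]] = None):
        self.name = name
        self.children = children or []


def insert_before(parent: TaskNode, target: str, extra: TaskNode) -> bool:
    """Insert extra as a sibling just before the child named target."""
    for index, child in enumerate(parent.children):
        if child.name == target:
            parent.children.insert(index, extra)
            return True
    return False


leg1 = TaskNode("Leg1", [TaskNode("DepartLocation"),
                         TaskNode("ArriveLocation")])

# The core decides the departure city needs confirming before moving on.
inserted = insert_before(leg1, "ArriveLocation",
                         TaskNode("ConfirmDepartLocation"))
names = [child.name for child in leg1.children]
```

The task specification itself never mentions the confirmation agent, which is the point: the developer's tree stays task-only.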
Conversational Skills: A List
- The grounding / misunderstandings problems
- Universal dialog mechanisms:
  - Repeat, Suspend… Resume, Help, Start over, Summarize, Undo, Querying the system's belief
  - Timing and Barge-in control
  - Focus Shifts, Context Establishment
  - Back-channel absorption
- Q: To what extent can we abstract these away from the Dialog Task?
UDM: Repeat
- Repeat (simple)
  - The dialog task tree (DTT) is adorned with a "Repeat" agency automatically at start-up
  - Which calls upon the OutputManager
  - Not all outputs are "repeatable" (e.g. implicit confirms, GUI)… which ones exactly…?
- Repeat (with referents)
  - only 3%, they are mostly [summarize]
  - User-defined custom repeat agency
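A simple Repeat agency could be sketched as a lookup over the output history that skips anything flagged as non-repeatable. The `OutputManager` below is a hypothetical stand-in written for this example; its methods and the `repeatable` flag are assumptions, not the real component's interface.

```python
from typing import List, Optional, Tuple


class OutputManager:
    """Hypothetical stand-in: remembers each output and whether it is repeatable."""

    def __init__(self) -> None:
        self.history: List[Tuple[str, bool]] = []

    def say(self, prompt: str, repeatable: bool = True) -> str:
        self.history.append((prompt, repeatable))
        return prompt

    def last_repeatable(self) -> Optional[str]:
        """Most recent output that may be repeated, or None."""
        for prompt, repeatable in reversed(self.history):
            if repeatable:
                return prompt
        return None


def repeat(outputs: OutputManager) -> Optional[str]:
    """What a simple Repeat agency might return on a user repeat request."""
    return outputs.last_repeatable()


outputs = OutputManager()
outputs.say("When do you leave from Seattle?")
outputs.say("[GUI] map updated", repeatable=False)   # GUI output is skipped
repeated = repeat(outputs)
```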
UDM: Help
- DTT adorned at start-up with a help agency
- Can capture and issue:
  - Local help (obtained from focused agent)
  - ExplainMore help (obtained from focused agent)
  - What can I say?
  - Contextual help (obtained from main topic)
  - Generic help (give_me_tips)
- Obtains help prompts from the focused agent and the main topic (defaults provided)
- Default help agency can be overridden by the user
UDM: Suspend … Resume
- DTT adorned with a SuspendResume agency.
- Context reestablishment
  - Automatically when focusing back after a sub-dialog
  - Construct a model for that (given size of sub-dialog, time issues, etc.)
  - Prompts problem shifted to the NLG
UDM: Start over, Summarize
- Start over:
  - DTT adorned with a Start-Over agency
- Summarize:
  - DTT adorned with a Summarize agency
  - prompt generated automatically
  - problem shifted to NLG …
Timing & barge-in control
- Knowledge of barge-in location
- Information on what got conveyed is fed back to the DM
  - Special agencies can take special action based on that (e.g. List Browsing)
  - Can we determine which utterances are non-barge-in-able in a task-independent manner?
Confirmation, Clarif., Disamb., Misunderstandings, Grounding…
- Largely unsolved: this is next!
- 2 components:
  - Confidence scores/computation on concepts
    - Obtaining them
    - Updating them
  - Taking the "right" decision based on those scores:
    - Insert appropriate agencies on the fly in the dialog task tree: opportunity for learning
    - What's the set of decisions / agencies?
    - How do you decide?
Confidence scores
- Obtaining conf. scores: from annotator
- Updating them, from different sources:
  - (Un)Attacked implicit/explicit confirms
  - Correction detector
  - Elapsed time?
  - Domain knowledge
  - Priors?
- But how do you integrate all these in a principled way?
Mechanisms
- DepartureCity = <Seattle,0.71><SF,0.29>
- Implicit / Explicit confirmations
  - When do you leave from Seattle?
  - So you're leaving from Seattle… When?
  - Did you say you were leaving from Seattle?
- Clarifications
- Disambiguation
  - I'm sorry, was that Seattle or San Francisco?
- How do you decide which?
  - Learning?
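The slide leaves the decision policy open, possibly to be learned. Purely as a hand-written baseline for comparison, a threshold rule over the confidence distribution might look like the sketch below; the function name, the action labels, and all three thresholds are assumptions, not part of the presented system.

```python
from typing import Dict


def choose_mechanism(hypotheses: Dict[str, float],
                     accept: float = 0.9,
                     implicit: float = 0.6,
                     margin: float = 0.2) -> str:
    """Map a concept's confidence distribution to a grounding action."""
    ranked = sorted(hypotheses.values(), reverse=True)
    if not ranked:
        return "request"              # nothing bound yet: ask for the value
    top = ranked[0]
    runner_up = ranked[1] if len(ranked) > 1 else 0.0
    if top >= accept:
        return "accept"
    if top - runner_up < margin:
        return "disambiguate"         # "was that Seattle or San Francisco?"
    if top >= implicit:
        return "implicit_confirm"     # "So you're leaving from Seattle… When?"
    return "explicit_confirm"         # "Did you say you were leaving from Seattle?"


decision = choose_mechanism({"Seattle": 0.71, "SF": 0.29})
```

With the slide's example distribution this rule picks an implicit confirmation; a closer split such as 0.55 versus 0.45 would trigger disambiguation instead.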
Software Engineering
- Provide a robust basis for future research.
- Modularity
  - Separability between task and discourse
  - Separability of concepts and confidence computations
- Portability
- Multiple servers
- Aggressive, structured, timed logging
Conclusion
- New DM framework
  - separation of dialog task from conversational mechanisms
  - developer can focus only on dialog task
  - conversational mechanisms generated automatically
  - easier development/maintenance
  - robust platform for future research
- Most of the implementation completed
- Symphony/LARRI reimplemented
- Next: back to misunderstandings!