RavenClaw
An improved dialog management architecture for task-oriented spoken dialog systems
Presented by: Dan Bohus (dbohus@cs.cmu.edu)
Work by: Dan Bohus, Alex Rudnicky, Andrew Hoskins
Carnegie Mellon University, 2002

New DM Architecture: Goals
- Able to handle complex, goal-directed dialogs
  - Go beyond (information access systems and) the slot-filling paradigm
- Easy to develop and maintain systems
  - Developer focuses only on dialog task
  - Automatically ensure a minimum set of task-independent, conversational skills
- Open to learning (hopefully both at task and discourse levels)
- Open to dynamic SDS generation
- More careful, more structured code, logs, etc.: provide a robust basis for future research.

05-22-2002 RavenClaw: a new DM architecture

A View from far, far away
[Figure: layered view of the system. Labels: Backend, Dialog Task Specification, Conversational Skills, Core. Example utterances shown: "SELECT * WHERE …", "Try opening that hatch", "Since that failed, I need you to push button B", "Can you repeat that, please?", "Suspend… Resume…", "What did you just say?"]
- Let the developer focus only on the dialog task spec.:
  - Don't worry about misunderstandings, the accuracy of concepts, repeats, focus shifts, barge-ins, etc.
  - Merely describe (program) the task, assuming perfect knowledge of the world
- Automatically generate the conversational mechanisms

Outline
- Goals
- A view from far away
- Main ideas
  - Dialog Task Specification / Execution
  - Conversational skills
- In more detail
  - Dialog Task Specification / Execution
  - Conversational skills

Dialog Task Spec & Execution
[Figure: example dialog task tree (DTS). Communicator -> Welcome, Login, Travel, Locals, Bye; Login -> AskRegistered, AskName, GreetUser, GetProfile; Travel -> Leg1 -> DepartLocation, ArriveLocation]
- Dialog Task implemented by a hierarchy of agents
- Handle and operate based on concepts
- Execution with interleaved Input Passes
  - Execute the agents by top-down "planning"
  - Do input passes when information is required
- REMEMBER: This is just the dialog task

Handling inputs
[Figure: same dialog task tree as on the previous slide]
- Input Pass
  - Assemble an agenda of expectations (open concepts)
  - Bind values from the input to the concepts
  - Process non-understanding (if any), analyze need for focus shifts
  - Continue execution

Conversational Skills / Mechanisms
- A lot of problems in SDS are generated by a lack of conversational skills. "It's all in the little details!"
  - Dealing with misunderstandings
  - Generic channel/dialog mechanisms: repeats, focus shift, context establishment, help, start over, etc.
  - Timing
- Even when these mechanisms are in, they lack uniformity & consistency. Development and maintenance are time consuming.

Conversational Skills / Mechanisms
- The core takes care of these by dynamically inserting appropriate agencies in the task tree
- A list of (more or less) task-independent mechanisms:
  - Implicit/Explicit Confirmations, Clarifications, Disambiguation = the whole Misunderstandings problem
  - Context reestablishment
  - Timeout and Barge-in control
  - Back-channel absorption
  - Generic dialog mechanisms: Repeat, Suspend… Resume, Help, Start over, Summarize, Undo, Querying the system's belief

Outline
- Goals
- A view from far away
- Main ideas
  - Dialog Task Specification / Execution
  - Conversational skills
- In more detail
  - Dialog Task Specification / Execution
  - Conversational skills

Dialog Task Specification
- Goal: able to handle complex domains, beyond information access, frame-based, slot-filling systems, e.g.:
  - Symphony, Intelligent checklists, Navigation, Route planning
- We need a powerful enough formalism to describe all these tasks:
  - C++ code?
  - Declarative would be nice … but is it powerful enough?
  - Templatized C++ code …?

Dialog Task Specification
- Tree of predefined agent types:
  - Inform, Request, Expect, Execute
- Each agent has:
  - A set of concepts
  - Preconditions
  - Success Criteria
  - Effects
  - Focus Criteria (triggers)
- Concepts
  - Data, Type (basic, struct, array)
  - Confidence/Value, Availability, Ambiguousness, Groundedness, System/User, TurnAcquired, TurnConveyed, etc.

An example DTS

    UserLogin: AGENCY
      concepts: registered(BOOL), name(STRING), id(STRING),
                profile(PROFILE), profile_found(BOOL)
      achieves_when: profile || InformProfileNotFound

      AskRegistered: REQUEST(registered)
        grammar: {[yes]->true, [no]->false, [guest]->false}
      AskName: REQUEST(name)
        precond: registered==no
        grammar: [user_name]
        max_attemps: 2
      InformGreetUser: INFORM
        precond: name
      AskID: REQUEST(id)
        precond: registered==yes
        mapping: [user_id]
      DoProfileRetrieval: EXECUTE
        precond: name || id
        call: ABEProfile.Call >name, >id, <profile, <profile_found
      InformProfileNotFound: INFORM
        precond: !profile_found

Given that the baseline is 259 lines of C++ code, this is pretty good.

Can a formalism cut it?
- People have repeatedly tried formalizing dialog … and failed
  - We're focusing only on the task (like in robotics/execution)
  - Actually, these agents are all C++ classes, so we can back off to code; the hope is that most of the behaviors can be easily expressed as above.
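To make the UserLogin example more concrete, here is a minimal sketch of how such an agent tree might execute. It is written in Python for brevity, although the slides state that the actual agents are C++ classes. Everything named here (Agent, Agency, build_user_login, the answers dictionary that simulates user input and the backend) is hypothetical and is not RavenClaw's API. The sketch assumes a simple left-to-right policy in which an agency runs the first unfinished subagent whose precondition holds, and it leaves out input passes, grammars, and max_attemps.

```python
from typing import Callable, Dict, List, Optional

Concepts = Dict[str, object]           # concept name -> value; absent means unavailable
Condition = Callable[[Concepts], bool]


class Agent:
    """Leaf agent; stands in for the REQUEST / INFORM / EXECUTE agent types."""

    def __init__(self, name: str,
                 precond: Condition = lambda c: True,
                 action: Callable[[Concepts], None] = lambda c: None):
        self.name = name
        self.precond = precond
        self.action = action
        self.done = False

    def execute(self, concepts: Concepts, trace: List[str]) -> None:
        self.action(concepts)
        self.done = True
        trace.append(self.name)


class Agency(Agent):
    """Composite agent that chooses which subagent to run next."""

    def __init__(self, name: str, subagents: List[Agent], achieved: Condition):
        super().__init__(name)
        self.subagents = subagents
        self.achieved = achieved

    def next_subagent(self, concepts: Concepts) -> Optional[Agent]:
        # Assumed policy: left-to-right, first unfinished agent whose precondition holds.
        for agent in self.subagents:
            if not agent.done and agent.precond(concepts):
                return agent
        return None

    def execute(self, concepts: Concepts, trace: List[str]) -> None:
        while not self.achieved(concepts):
            agent = self.next_subagent(concepts)
            if agent is None:
                break
            agent.execute(concepts, trace)
        self.done = True


def build_user_login(answers: Dict[str, object]) -> Agency:
    """Build the UserLogin agency; `answers` fakes the user and the profile backend."""

    def request(concept: str) -> Callable[[Concepts], None]:
        def ask(c: Concepts) -> None:
            c[concept] = answers[concept]
        return ask

    def retrieve_profile(c: Concepts) -> None:
        profile = answers.get("profile")
        c["profile_found"] = profile is not None
        if profile is not None:
            c["profile"] = profile

    not_found = Agent("InformProfileNotFound",
                      precond=lambda c: c.get("profile_found") is False)
    subagents = [
        Agent("AskRegistered", action=request("registered")),
        Agent("AskName", precond=lambda c: c.get("registered") is False,
              action=request("name")),
        Agent("InformGreetUser", precond=lambda c: "name" in c),
        Agent("AskID", precond=lambda c: c.get("registered") is True,
              action=request("id")),
        Agent("DoProfileRetrieval", precond=lambda c: "name" in c or "id" in c,
              action=retrieve_profile),
        not_found,
    ]
    # achieves_when: profile || InformProfileNotFound
    return Agency("UserLogin", subagents,
                  achieved=lambda c: "profile" in c or not_found.done)


if __name__ == "__main__":
    trace: List[str] = []
    build_user_login({"registered": False, "name": "Ann"}).execute({}, trace)
    print(trace)
```

Keeping preconditions as plain predicates over the concept store means the task description stays declarative, while a more sophisticated selection policy could replace next_subagent without touching the agents themselves.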

DTS execution
- Agency.Execute() decides which subagent is executed next, based on preconditions
- Various simple policies can be implemented
  - Left-to-right (open/closed), choice, etc.
  - But free to do more sophisticated things (MDPs, etc.) ~ learning at the task level

Libraries of DTS agencies?
- Provide a library of "common task" and "common discourse" agencies
  - Frame agency
  - List browse agency
  - Choose agency
  - Disambiguate agency, Ground agency, …
  - Etc.

Input Pass
1. Construct an agenda of expectations
  - (Partially?) ordered list of concepts expected by the system
[Figure: dialog task tree with the focused agent marked, next to the resulting agenda: [DepartureCity] [ArrivalCity] [Name] [Registered] [Hotel] [Bye]]

Input Pass (continued)
2. Bind values/confidences to concepts
  - The System <> Mixed Initiative spectrum can be expressed in terms of the way the agenda is constructed and binding policies, independent of task
[Figure: the user input "I'm flying to San Francisco and I need a hotel there." matched against the agenda [DepartureCity] [ArrivalCity] [Name] [Registered] [Hotel] [Bye]]

Input Pass (continued)
3. Process non-understandings (if any): try to detect the source and inform the user:
  - Channel (SNR, clipping)
  - Decoding (confidence score, prosody)
  - Parsing (parsing scores)
  - Dialog level (parse ok, but no expectation match)

Input Pass
4. Focus shifts
  - Focus shifts seem to be task dependent. Decision to shift focus is taken by the task (DTS)
  - But they also have a task-independent (TI) side (sub-dialog size, context reestablishment). Context reestablishment is handled automatically, in the Core (see later)

Outline
- Goals
- A view from far away
- Main ideas
  - Dialog Task Specification / Execution
  - Conversational skills
- In more detail
  - Dialog Task Specification / Execution
  - Conversational skills

Task-Independent, Conversational Mechanisms
- Should be transparently handled by the core
  - However, the developer should be able to write their own customized mechanisms if needed
- Most cases handled by inserting extra "discourse" agents on the fly in the dialog task tree

Conversational Skills: A List
- The grounding / misunderstandings problems
- Universal dialog mechanisms (UDMs):
  - Repeat, Suspend… Resume, Help, Start over, Summarize, Undo, Querying the system's belief
- Timing and Barge-in control
- Focus Shifts, Context Establishment
- Back-channel absorption
- Q: To what extent can we abstract these away from the Dialog Task?

UDM: Repeat
- Repeat (simple)
  - The dialog task tree (DTT) is adorned with a "Repeat" agency automatically at start-up
  - Which calls upon the OutputManager
  - Not all outputs are "repeatable" (e.g. implicit confirms, GUI, …) … which ones exactly…?
- Repeat (with referents)
  - only 3%, they are mostly [summarize]
- User-defined custom repeat agency

UDM: Help
- DTT adorned at start-up with a help agency
- Can capture and issue:
  - Local help (obtained from focused agent)
  - ExplainMore help (obtained from focused agent)
  - What can I say?
  - Contextual help (obtained from main topic)
  - Generic help (give_me_tips)
- Obtains Help prompts from the focused agent and the main topic (defaults provided)
- Default help agency can be overwritten by user

UDM: Suspend … Resume
- DTT adorned with a SuspendResume agency
- Context reestablishment
  - Automatically when focusing back after a subdialog
  - Construct a model for that (given size of subdialog, time issues, etc.)
  - Prompts problem shifted to the NLG

UDM: Start over, Summarize
- Start over:
  - DTT adorned with a Start-Over agency
- Summarize:
  - DTT adorned with a Summarize agency
  - prompt generated automatically
  - problem shifted to NLG …

Timing & barge-in control
- Knowledge of barge-in location
- Information on what got conveyed is fed back to the DM
  - Special agencies can take special action based on that (e.g. List Browsing)
- Can we determine what are non-barge-in-able utterances in a task-independent manner?

Confirmation, Clarif., Disamb., Misunderstandings, Grounding…
- Largely unsolved: this is next!
- 2 components:
  - Confidence scores/computation on concepts
    - Obtaining them
    - Updating them
  - Taking the "right" decision based on those scores:
    - Insert appropriate agencies on the fly in the dialog task tree: opportunity for learning
    - What's the set of decisions / agencies?
    - How do you decide?

Confidence scores
- Obtaining conf. scores: from annotator
- Updating them, from different sources:
  - (Un)Attacked implicit/explicit confirms
  - Correction detector
  - Elapsed time?
  - Domain knowledge
  - Priors?
- But how do you integrate all these in a principled way?

Mechanisms
- Example belief: DepartureCity = <Seattle,0.71><SF,0.29>
- Mechanism types: Implicit / Explicit confirmations, Clarifications, Disambiguation
- Example prompts:
  - Did you say you were leaving from Seattle?
  - When do you leave from Seattle?
  - So you're leaving from Seattle… When?
  - I'm sorry, was that Seattle or San Francisco?
- How do you decide which? Learning?
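The slides leave open how to choose among these mechanisms. Purely as an illustration of the kind of decision involved, the sketch below applies fixed confidence thresholds to a concept's hypotheses, using the DepartureCity example. The thresholds, the function names, and the pairing of each prompt with a mechanism are assumptions made for this sketch, not the approach proposed in the talk, which suggests the decision may instead be learned.

```python
from typing import List, Optional, Tuple

Hypothesis = Tuple[str, float]  # (value, confidence)

# Hypothetical thresholds, chosen only for illustration.
ACCEPT = 0.9            # at or above: take the value as grounded
IMPLICIT = 0.6          # at or above: confirm implicitly while moving on
REJECT = 0.3            # below: too unreliable, ask again
AMBIGUITY_MARGIN = 0.2  # top two hypotheses closer than this: disambiguate

# Prompt templates adapted from the slide; the action labels are this sketch's reading.
PROMPTS = {
    "implicit_confirm": "So you're leaving from {0}... When?",
    "explicit_confirm": "Did you say you were leaving from {0}?",
    "disambiguate": "I'm sorry, was that {0} or {1}?",
}


def choose_grounding_action(hyps: List[Hypothesis]) -> Tuple[str, List[str]]:
    """Pick a grounding action and the values it refers to."""
    if not hyps:
        return ("request", [])
    ranked = sorted(hyps, key=lambda h: h[1], reverse=True)
    top_value, top_conf = ranked[0]
    # Two competing values with similar confidence: ask the user to choose.
    if len(ranked) > 1 and top_conf - ranked[1][1] < AMBIGUITY_MARGIN:
        return ("disambiguate", [top_value, ranked[1][0]])
    if top_conf >= ACCEPT:
        return ("accept", [top_value])
    if top_conf >= IMPLICIT:
        return ("implicit_confirm", [top_value])
    if top_conf >= REJECT:
        return ("explicit_confirm", [top_value])
    return ("request", [])


def render_prompt(action: str, values: List[str]) -> Optional[str]:
    """Return the prompt for an action, or None if the action needs no prompt here."""
    template = PROMPTS.get(action)
    return template.format(*values) if template else None


if __name__ == "__main__":
    belief = [("Seattle", 0.71), ("SF", 0.29)]
    action, values = choose_grounding_action(belief)
    print(action, render_prompt(action, values))
```

In the architecture described here, the chosen action would presumably correspond to inserting a matching agency into the dialog task tree rather than returning a string; a learned policy could replace the threshold function without changing that interface.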

Software Engineering
- Provide a robust basis for future research.
- Modularity
  - Separability between task and discourse
  - Separability of concepts and confidence computations
- Portability
- Multiple servers
- Aggressive, structured, timed logging

Conclusion
- New DM framework
  - separation of dialog task from conversational mechanisms
    - developer can focus only on dialog task
    - conversational mechanisms generated automatically
  - easier development/maintenance
  - robust platform for future research
- Most of the implementation completed
- Symphony/LARRI reimplemented
- Next: back to misunderstandings!
