Learning the Structure of Task-Oriented
Conversations from the Corpus
Ananlada Chotimongkol
Language Technologies Institute
School of Computer Science
Carnegie Mellon University
Dialog Reading Group
December 3rd, 2004
Outline
• Introduction
• Form-based dialog structure
  • Task structure
  • Dialog mechanisms
• Dialog structure learning
  • Concept identification and clustering
  • Form identification
  • Operation classification

Building a new dialog system
[Figure: dialog system architecture. The user says “I would like to fly to Seattle tomorrow.” and the system replies “When would you like to leave?”. Components shown: Speech Recognizer, Natural Language Understanding, Dialog Manager, Domain Knowledge, Natural Language Generator, Speech Synthesizer.]

Domain knowledge
• Steps in the task
  • Specify the desired flight
  • Search for flights that match the criteria
  • Negotiate the flights
  • Make a reservation
• Important information, keywords
  • Destination, date, time, airlines, etc.
• Domain language: how do people talk

What is the problem?
[Figure: the dialog system architecture again, with the components Speech Recognizer, Natural Language Understanding, Dialog Manager, Domain Knowledge, Natural Language Generator, Speech Synthesizer.]
• Can’t reuse
• Time consuming
• May need an expert

Research goal
• Reduce the human effort of acquiring domain knowledge when creating a dialog system in a new domain
  • By learning the domain knowledge from data

Observations
• Task-oriented conversations have a clear structure
  • Reflects domain information, e.g. a task is divided into sub-tasks
  • Has recurring patterns that are observable through the language

The solutions
• To learn domain knowledge from data
1. Specify the structure of task-oriented conversations
  • Capture sufficient domain knowledge
  • Domain-independent
  • Learnable
2. Learn the structure from a corpus of human-human conversations

Dialogue structure
• Task structure (data representation)
  • Necessary information for achieving a task goal
    • Steps in the task
    • Domain keywords
• Dialog mechanism (operations)
  • The ways that the participants communicate and perform the task

Existing dialog structures: Theoretical-oriented
• Examples:
  • Theory of Discourse Structure (Grosz and Sidner, 1986)
  • Discourse Representation Theory (DRT) (Kamp and Reyle, 1993)
• Focus on developing a theory that helps interpret discourse meaning
• Might be too complex to be implemented in a dialog system
• Use hand-written rules to recognize the structure

Existing dialog structures: Engineering-oriented
• Examples:
  • Plan-based theory (Allen and Perrault, 1980)
  • The theory of Conversation Acts (Traum and Hinkelman, 1992)
• Focus on practical issues:
  • Predictability of each dialog component
  • The implementation of the structure in a dialog system

What is missing?
• Existing structures don’t describe the key domain information that the participants communicate in a dialog
  • e.g. the role of city names in a travel domain
• It is not clear how to apply the structure in a dialog system
  • The relations between dialog structure components and dialog system components
  • How a dialog manager should treat each component

Form-based dialog structure
• Describe a dialog structure with an existing dialog manager framework
  • Has a concrete mapping between dialog structure components and dialog system components
  • A form-based architecture has been used successfully in many dialog systems
• A form-based structure consists of:
  • A task structure (forms and slots)
  • Dialogue mechanisms (form operators) that advance the dialog

Task Structure
Three levels of organization
1. Task: a subset of conversations that has a specific goal
2. Sub-task: a step in a task that contributes toward a task goal
   => form
3. Concept: key information
   => slot

Task Structure: Bus schedule enquiry domain
1. Task (multiple tasks):
  • Which bus runs between A and B?
  • When will the bus X arrive?
2. Sub-tasks: no further decomposition
3. Concepts:
  • Bus Number={61C, 28X, …}
  • Location={CMU, airport, …}

Departure time query form
F: Query_Departure_Time
Depart_Location: carnegie_mellon
Arrive_Location: the airport
Arrive_Time: Hour: four Minute: thirty
Bus_Number: 28X
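
The form on this slide can be written down as a small data structure. The following is only a minimal sketch, not the representation used in the talk: the `Form` class and its `unfilled` helper are hypothetical names introduced for illustration, while the form name, slot names and values are copied from the slide.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Form:
    """A form (one sub-task) holding named slots (concepts). Hypothetical class."""
    name: str
    slots: Dict[str, Any] = field(default_factory=dict)

    def unfilled(self) -> List[str]:
        """Return the names of slots that do not have a value yet."""
        return [slot for slot, value in self.slots.items() if value is None]


# The departure-time query form exactly as shown on the slide.
query = Form(
    name="Query_Departure_Time",
    slots={
        "Depart_Location": "carnegie_mellon",
        "Arrive_Location": "the airport",
        "Arrive_Time": {"Hour": "four", "Minute": "thirty"},
        "Bus_Number": "28X",
    },
)

print(query.name, query.unfilled())
```

A slot set to `None` would be reported by `unfilled()`, which is one simple way a dialog manager could decide what to ask for next.
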
Task Structure: Travel planning domain
1. Task: create travel itinerary
2. Sub-tasks:
  • Flight reservation
  • Hotel reservation
  • Car rental reservation
3. Concepts:
  • airlines={Continental, US-Airways, …}
  • hotel={Hilton, Marriott, …}

Task Structure: Map reading domain
• Task: draw a line (a route)
• Sub-tasks:
  • Draw a segment of a line
• Concepts:
  • Landmark = {white_mountain, Machete, …}
  • Orientation = {down, left, …}
  • Distance = {a couple of centimeters, an inch, …}

Dialogue mechanisms
• Operations that the participants use to advance the dialog toward the goal
• Task-oriented operations
  • Manipulate a form (data structure)
  • Examples: init_form, fill_form
• Discourse-oriented operations
  • Manage the flow of a conversation
  • Examples: acknowledgement, greeting

Dialogue mechanisms (2)
• Each operation has a unique consequence on the state of the conversation
  • init_form causes a system to create a new form
• Domain-independent; only the operation parameters differ
  • Fill city_name in flight_information form
  • Fill bus_number in bus_information form

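
To illustrate the claim that the operations stay the same across domains while only their parameters change, here is a minimal sketch. The operation names `init_form` and `fill_form` come from the slides; their signatures, the `FORM_SLOTS` table and the `date` and `location` slots are assumptions made up for this example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical slot inventories. This table is the only domain-specific part.
FORM_SLOTS: Dict[str, List[str]] = {
    "flight_information": ["city_name", "date"],
    "bus_information": ["bus_number", "location"],
}

# The dialog state is modeled as a list of (form type, slot values) pairs.
State = List[Tuple[str, Dict[str, Optional[str]]]]


def init_form(state: State, form_type: str) -> None:
    """Create a new, empty form of the given type and make it the current form."""
    state.append((form_type, {slot: None for slot in FORM_SLOTS[form_type]}))


def fill_form(state: State, slot: str, value: str) -> None:
    """Fill one slot of the current (most recently created) form."""
    form_type, form = state[-1]
    if slot not in form:
        raise KeyError(f"{form_type} has no slot named {slot}")
    form[slot] = value


state: State = []
init_form(state, "flight_information")
fill_form(state, "city_name", "Pittsburgh")
init_form(state, "bus_information")
fill_form(state, "bus_number", "28X")
print(state)
```

The same two functions serve both domains; moving to a new domain only means supplying a different slot table.
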
Air travel-planning domain
PT8: request_form_info: WHAT TIME WOULD YOU LIKE TO DEPART DepLoc:[PITTSBURGH ]
X9: fill_form_info: /UM/ EARLY DepT:[MORNING ]NOT BEFORE DepT:[H:[SEVEN ]]
PT10: acknowledge: OKAY
access_DB
inform_result: U.S. AIRWAYS HAS A NON-STOP …
1st leg Form
Dept_Loc: City: PITTSBURGH
Dept_Date: Month: FEBRUARY Date: TWENTIETH
Dept_Time: EARLY TimeP: MORNING NOT BEFORE Hour: SEVEN
Flight_ref:
Arr_Loc: City: HOUSTON State: TEXAS Airport: INTERCONTINENTAL
Arr_Date:
Arr_Time:
Airline_company:
Bus schedule enquiry domain
U2: fill_form_info: i wanted to take the 28X bus from /um/ DepLoc:[forbes avenue] to ArLoc:[the airport]
F: Query_Departure_Time
Depart_Location: forbes avenue
Arrive_Location: the airport
Arrive_Time:
Bus_Number: 28X
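
The bus example above shows a fill_form_info utterance whose annotated concepts end up in form slots. A rough sketch of that step follows. It assumes the flat `Label:[value]` notation seen in the bus utterance, does not handle the nested annotations of the air travel example, and uses a hypothetical label-to-slot mapping. It also leaves Bus_Number empty because the utterance carries no annotation for it.

```python
import re

# Matches a flat "Label:[value]" annotation such as DepLoc:[forbes avenue].
ANNOTATION = re.compile(r"(\w+):\[([^\[\]]*)\]")

# Hypothetical mapping from annotation labels to slot names of the bus form.
LABEL_TO_SLOT = {"DepLoc": "Depart_Location", "ArLoc": "Arrive_Location"}


def fill_from_utterance(form, utterance):
    """Copy every annotated concept in the utterance into its slot, if the form has one."""
    for label, value in ANNOTATION.findall(utterance):
        slot = LABEL_TO_SLOT.get(label)
        if slot in form:
            form[slot] = value.strip()
    return form


form = {
    "Depart_Location": None,
    "Arrive_Location": None,
    "Arrive_Time": None,
    "Bus_Number": None,
}
utterance = (
    "i wanted to take the 28X bus from /um/ DepLoc:[forbes avenue] "
    "to ArLoc:[the airport]"
)
fill_from_utterance(form, utterance)
print(form)
```
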
Learning framework
• Goal: minimize human effort
  • Use unsupervised learning when possible
  • Incorporate information from existing knowledge sources
• If additional knowledge from a human is required
  • Train an initial model with a small amount of annotated data
  • Use unsupervised learning or active learning to selectively explore un-annotated data
  • A human can correct a mistake

Dialog structure components
• Domain-dependent -> have to learn in every domain
  • Task structure (forms, slots)
  • Expressions for task-oriented operations
• Domain-independent -> infrastructure or have to learn only once
  • List of operations
  • Expressions for discourse-oriented operations

Concept identification and clustering
• Goal: identify concept members and cluster together the ones that belong to the same concept
  • City={Pittsburgh, Boston, Austin, …}
• Assumption:
  • Word boundaries, including compound word boundaries, are given

Concept identification steps
1. Identify potential concept members
  • Filter out noise, function words
2. Cluster similar words together
  • Statistical clustering: mutual information-based and Kullback-Leibler-based
  • Knowledge-based clustering: WordNet
3. Select clusters that represent domain concepts
  • Use the same criteria as (1), but work on a cluster level

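
As a rough illustration of the Kullback-Leibler-based option in step 2, the sketch below measures how similar two words are by comparing the distributions of their neighbouring words. Everything here is an assumption made for the example: the slides do not say which distributions are compared, and the toy sentences, the add-alpha smoothing and the function names are invented.

```python
import math
from collections import Counter, defaultdict


def context_distributions(sentences):
    """Map each word to counts of its left and right neighbour words."""
    contexts = defaultdict(Counter)
    for sentence in sentences:
        padded = ["<s>"] + sentence + ["</s>"]
        for i in range(1, len(padded) - 1):
            contexts[padded[i]][("L", padded[i - 1])] += 1
            contexts[padded[i]][("R", padded[i + 1])] += 1
    return contexts


def kl_divergence(p_counts, q_counts, events, alpha=0.1):
    """KL(P || Q) over a shared event set, with add-alpha smoothing."""
    p_total = sum(p_counts.values()) + alpha * len(events)
    q_total = sum(q_counts.values()) + alpha * len(events)
    total = 0.0
    for event in events:
        p = (p_counts[event] + alpha) / p_total
        q = (q_counts[event] + alpha) / q_total
        total += p * math.log(p / q)
    return total


def distance(word_a, word_b, contexts):
    """Symmetric KL distance between the context distributions of two words."""
    events = set()
    for counts in contexts.values():
        events.update(counts)
    p, q = contexts[word_a], contexts[word_b]
    return kl_divergence(p, q, events) + kl_divergence(q, p, events)


sentences = [
    "i want to fly to pittsburgh tomorrow".split(),
    "i want to fly to austin tomorrow".split(),
    "i want to take the bus tomorrow".split(),
]
contexts = context_distributions(sentences)
d_cities = distance("pittsburgh", "austin", contexts)
d_mixed = distance("pittsburgh", "want", contexts)
print(d_cities, d_mixed)
```

Words that appear in the same contexts, such as the two city names, get a small distance, so a clustering algorithm run on these distances would tend to group them into one concept.
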
Form Identification
• Goal: determine different types of forms that occur in the domain
• Assumption:
  • A dialog may be annotated with concept labels

Approach
• Segment a dialog into a sequence of sub-tasks (form boundary identification)
  • Train a classifier on lexical cohesion (Hearst, 1994) and prosodic features
• Group together the sub-tasks that belong to the same form type
  • Use unsupervised clustering based on cosine similarity
• Identify the set of slots that are associated with each form type
  • Analyze a cluster of similar form instances

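
The grouping and slot identification steps can be sketched as follows. This is only one possible reading: it assumes each sub-task segment is represented by the concept labels it contains, uses a simple greedy one-pass clustering with an arbitrary threshold, and takes the slots of a form type to be the labels seen in its cluster. The segments and label names are invented for the example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_segments(segments, threshold=0.5):
    """Greedy clustering: join the most similar cluster or start a new one."""
    clusters = []
    for index, segment in enumerate(segments):
        vector = Counter(segment)
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vector, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"centroid": Counter(vector), "members": [index]})
        else:
            best["centroid"].update(vector)
            best["members"].append(index)
    return clusters


# Hypothetical sub-task segments, each reduced to its concept labels.
segments = [
    ["DepLoc", "ArLoc", "DepTime", "Airline"],
    ["HotelName", "City", "CheckIn"],
    ["DepLoc", "ArLoc", "Airline"],
    ["HotelName", "CheckIn", "Nights"],
]
clusters = cluster_segments(segments)
for cluster in clusters:
    # Slots of a form type are read off as the labels seen in its cluster.
    print(cluster["members"], sorted(cluster["centroid"]))
```
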
Operation Classification
• Goal: learn the expressions that are associated with each operation
  • By classifying an utterance into a pre-defined set of operations
• Assumptions:
  • A dialog may be annotated with concept labels
  • A list of operation types is given
  • Operation boundaries are known

Supervised classification
• Use a Markov model (Woszczyna and Waibel, 1994)
  • States = operation types
  • Transition probability = dependency between operation types
  • Emission probability = P(W|operation_type)
• Enhanced models
  • Use domain concepts as word classes to reduce the data sparseness problem
  • Add prosodic features

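
A small sketch of such a model is given below, under several assumptions not stated on the slide: P(W|operation_type) is approximated by a product of add-one smoothed unigram probabilities, the best operation sequence is found with Viterbi decoding, and the training dialogs are a tiny invented sample.

```python
import math
from collections import Counter, defaultdict


def train(dialogs):
    """Count transitions and word emissions from dialogs of (operation, words) pairs."""
    trans, emit, vocab = defaultdict(Counter), defaultdict(Counter), set()
    for dialog in dialogs:
        prev = "<start>"
        for op, words in dialog:
            trans[prev][op] += 1
            emit[op].update(words)
            vocab.update(words)
            prev = op
    return trans, emit, sorted(emit), vocab


def log_p(counter, key, n_outcomes):
    """Add-one smoothed log probability of key under the counts in counter."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))


def viterbi(utterances, trans, emit, ops, vocab):
    """Return the most likely operation type for each utterance in a dialog."""
    n_words = len(vocab) + 1  # one extra outcome for unseen words

    def emission(op, words):
        return sum(log_p(emit[op], w, n_words) for w in words)

    scores = {op: log_p(trans["<start>"], op, len(ops)) + emission(op, utterances[0])
              for op in ops}
    back = []
    for words in utterances[1:]:
        new_scores, pointers = {}, {}
        for op in ops:
            prev = max(ops, key=lambda p: scores[p] + log_p(trans[p], op, len(ops)))
            new_scores[op] = (scores[prev] + log_p(trans[prev], op, len(ops))
                              + emission(op, words))
            pointers[op] = prev
        scores = new_scores
        back.append(pointers)
    path = [max(ops, key=scores.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Invented training sample using operation names from the slides.
dialogs = [
    [("request_form_info", "what time would you like to depart".split()),
     ("fill_form_info", "early morning not before seven".split()),
     ("acknowledge", ["okay"])],
    [("request_form_info", "where would you like to go".split()),
     ("fill_form_info", "to houston texas".split()),
     ("acknowledge", ["okay"])],
]
trans, emit, ops, vocab = train(dialogs)
decoded = viterbi(
    ["what time would you like to leave".split(), "not before seven".split(), ["okay"]],
    trans, emit, ops, vocab,
)
print(decoded)
```

Replacing words with their concept labels before counting would be one way to realize the "domain concepts as word classes" enhancement, since it shrinks the emission vocabulary.
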
Unsupervised learning and active learning
1. Train an initial classifier from human-labeled data
2. Apply the current classifier to an unlabeled operation
  • (Unsupervised learning) if the confidence is high, add this instance and the predicted label to the training set
  • (Active learning) if the confidence is low, ask a human to label this instance and then add it to the training set
3. Train a new classifier on all labeled data (both machine-labeled and human-labeled)
• Steps 2-3 can be iterated

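
The loop above can be sketched as follows. The classifier here is a tiny naive Bayes model standing in for whatever classifier is actually used, the confidence is simply the top posterior probability, and the threshold, the seed data and the `ask_human` callback are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train_nb(examples):
    """Train an add-one smoothed naive Bayes model from (words, label) pairs."""
    word_counts, label_counts, vocab = defaultdict(Counter), Counter(), set()
    for words, label in examples:
        label_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return word_counts, label_counts, vocab


def predict_nb(model, words):
    """Return a dict mapping each label to its posterior probability."""
    word_counts, label_counts, vocab = model
    n_words = len(vocab) + 1  # one extra outcome for unseen words
    logs = {}
    for label, count in label_counts.items():
        total = sum(word_counts[label].values())
        logs[label] = math.log(count) + sum(
            math.log((word_counts[label][w] + 1) / (total + n_words)) for w in words)
    top = max(logs.values())
    unnorm = {label: math.exp(score - top) for label, score in logs.items()}
    z = sum(unnorm.values())
    return {label: value / z for label, value in unnorm.items()}


def bootstrap(labeled, unlabeled, ask_human, threshold=0.7):
    """Self-training plus active learning over a stream of unlabeled utterances."""
    labeled = list(labeled)
    model = train_nb(labeled)                # step 1
    asked = []
    for words in unlabeled:
        probs = predict_nb(model, words)     # step 2
        label = max(probs, key=probs.get)
        if probs[label] < threshold:         # low confidence: ask a human
            label = ask_human(words)
            asked.append(words)
        labeled.append((words, label))
        model = train_nb(labeled)            # step 3
    return model, asked


seed = [(["okay"], "acknowledge"), (["to", "boston"], "fill_form_info")]
pool = [["okay"], ["to", "austin"], ["to", "boston"]]
# Stand-in for the human annotator in this toy run.
model, asked = bootstrap(seed, pool, ask_human=lambda words: "fill_form_info")
print(asked)
```

Retraining after every instance keeps the sketch close to the numbered steps; in practice one would likely retrain in batches.
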
Classifier confidence score
1. Difference in probability between the first rank and the second rank
2. The entropy of the classifier output
   $H(T) = \sum_j p(T_j \mid U_i) \log \frac{1}{p(T_j \mid U_i)}$
  • High entropy = low confidence

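
Both confidence scores are easy to compute from the classifier's output distribution over operation types. The sketch below uses invented probability vectors and hypothetical function names.

```python
import math


def margin(probs):
    """Difference between the highest and the second-highest probability."""
    ranked = sorted(probs, reverse=True)
    return ranked[0] - ranked[1]


def entropy(probs):
    """H = sum_j p_j * log(1 / p_j); zero-probability terms contribute nothing."""
    return sum(p * math.log(1.0 / p) for p in probs if p > 0)


confident = [0.9, 0.05, 0.05]   # one operation type clearly wins
uncertain = [0.4, 0.35, 0.25]   # the classifier is torn between types

print(margin(confident), entropy(confident))
print(margin(uncertain), entropy(uncertain))
```

A large margin and a low entropy both indicate a confident prediction, so the two scores order these two examples the same way.
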
Suggestions?