Learning the Structure of Task-Oriented Conversations
from the Corpus of In-Domain Dialogs
Ph.D. Thesis Defense
Ananlada Chotimongkol
Carnegie Mellon University, 18th December 2007
Thesis Committee:
Alexander Rudnicky (Chair)
William Cohen
Carolyn Penstein Rosé
Gokhan Tur (SRI International)
Outline

- Introduction
- Structure of task-oriented conversations
- Machine learning approaches
- Conclusion
A spoken dialog system

[Figure: architecture of a spoken dialog system. A client utterance such as "I would like to fly to Seattle tomorrow." passes through the Speech Recognizer and Natural Language Understanding to the Dialog Manager, which consults Domain Knowledge (tasks, steps, domain keywords) and replies, e.g. "When would you like to leave?", via the Natural Language Generator and Speech Synthesizer.]
Problems in acquiring domain knowledge

Domain knowledge (tasks, steps, domain keywords) is traditionally hand-crafted from example dialogs. Problems:

- Requires domain expertise
- Subjective
- May miss some cases
- Time consuming

(Yankelovich, 1997; Bangalore et al., 2006)
Task-oriented dialog

- Observable structure
- Reflects domain information
- Observable -> learnable?

Client: I'D LIKE TO FLY TO HOUSTON TEXAS
Agent:  AND DEPARTING PITTSBURGH ON WHAT DATE ?
Client: DEPARTING ON FEBRUARY TWENTIETH
...                                  [step 1: reserve a flight]
Agent:  DO YOU NEED A CAR ?
Client: YEAH
Agent:  THE LEAST EXPENSIVE RATE I HAVE WOULD BE WITH THRIFTY RENTAL CAR FOR TWENTY THREE NINETY A DAY
Client: OKAY
Agent:  WOULD YOU LIKE ME TO BOOK THAT CAR FOR YOU ?
Client: YES
...                                  [step 2: reserve a car]
Agent:  OKAY AND WOULD YOU NEED A HOTEL WHILE YOU'RE IN HOUSTON ?
Client: YES
Agent:  AND WHERE AT IN HOUSTON ?
Client: /UM/ DOWNTOWN
Agent:  OKAY
Agent:  DID YOU HAVE A HOTEL PREFERENCE ?
...                                  [step 3: reserve a hotel]
Proposed solution

[Figure: example dialogs are fed to a learning system, which infers Domain Knowledge (tasks, steps, domain keywords); a human revises the learned knowledge before it is used to build the dialog system.]
Learning system output

Input: air travel dialogs

Domain Knowledge learned:
- task = create a travel itinerary
- steps = reserve a flight, reserve a hotel, reserve a car
- keywords = airline, city name, date
Thesis statement

Investigate how to infer domain-specific information required to build a task-oriented dialog system from a corpus of in-domain conversations through an unsupervised learning approach.
Thesis scope (1)

Investigate how to infer domain-specific information required to build a task-oriented dialog system from a corpus of in-domain conversations through an unsupervised learning approach.

- What to learn: domain-specific information in a task-oriented dialog
  - A list of tasks and their decompositions (travel reservation: flight, car, hotel)
  - Domain keywords (airline, city name, date)
Thesis scope (2)

Investigate how to infer domain-specific information required to build a task-oriented dialog system from a corpus of in-domain conversations through an unsupervised learning approach.

- Resources: a corpus of in-domain conversations
  - Recorded human-human conversations are already available
Thesis scope (3)

Investigate how to infer domain-specific information required to build a task-oriented dialog system from a corpus of in-domain conversations through an unsupervised learning approach.

- Learning approach: unsupervised learning
  - No training data is available for a new domain
  - Annotating data is time consuming
Proposed approach

Investigate how to infer domain-specific information required to build a task-oriented dialog system from a corpus of in-domain conversations through an unsupervised learning approach.

Two research problems:
1. Specify a suitable domain-specific information representation
2. Develop a learning approach that infers the domain information captured by this representation from human-human dialogs
Outline

- Introduction
- Structure of task-oriented conversations
  - Properties of a suitable dialog structure
  - Form-based dialog structure representation
  - Evaluation
- Machine learning approaches
- Conclusion
Properties of a desired dialog structure

- Sufficiency
  - Captures all domain-specific information required to build a task-oriented dialog system
- Generality (domain-independent)
  - Able to describe task-oriented dialogs in dissimilar domains and types
- Learnability
  - Can be identified by an unsupervised machine learning algorithm
Domain-specific information in task-oriented dialogs

- A list of tasks and their decompositions
  - Ex: travel reservation = flight + car + hotel
  - The compositional structure of a dialog, based on the characteristics of a task
- Domain keywords
  - Ex: airline, city name, date
  - The actual content of a dialog
Existing discourse structures

| Discourse structure | Sufficiency | Generality | Learnability |
|---|---|---|---|
| Segmented Discourse Representation Theory (Asher, 1993) | focuses on meaning, not actual entities | ? | ? |
| Grosz and Sidner's theory (Grosz and Sidner, 1986) | doesn't model domain keywords | ✓ | unsupervised? |
| DAMSL extension (Hardy et al., 2003) | doesn't model a compositional structure | ? | unsupervised? |
| A plan-based model (Cohen and Perrault, 1979) | ✓ | ✓ | unsupervised? |
Form-based dialog structure representation

- Based on the notion of a form (Ferrieux and Sadek, 1994)
  - A data representation used in the form-based dialog system architecture
- Focuses only on concrete information
  - Can be observed directly from in-domain conversations
Form-based representation components

Consists of 3 components:
1. Task
2. Sub-task
3. Concept
Form-based representation components: 1. Task

- A subset of a dialog that has a specific goal
- Ex: the example dialog above, taken as a whole, forms a single task: make a travel reservation
Form-based representation components: 2. Sub-task

- A step in a task that contributes toward the goal
- Contains sufficient information to execute a domain action
- Ex: the example dialog divides into three sub-tasks: reserve a flight, reserve a car, and reserve a hotel
Form-based representation components: 3. Concept (domain keywords)

- A piece of information required to perform an action
- Ex: in the example dialog, HOUSTON, TEXAS, PITTSBURGH, and FEBRUARY TWENTIETH are concept values
Data representation

- Represented by a form
  - A repository of related pieces of information necessary for performing an action
Data representation

- Form = a repository of related pieces of information
- A sub-task contains sufficient information to execute a domain action -> a form
- Ex: the "reserve a flight" sub-task corresponds to a flight query form
Data representation

- Form = a repository of related pieces of information
- A task is a subset of a dialog that has a specific goal -> a set of forms
- Ex: the travel reservation task corresponds to the set {flight query, car query, hotel query}
Data representation

- Form = a repository of related pieces of information
- A concept is a piece of information required to perform an action -> a slot
- Ex: the flight query form filled from the example dialog:

Form: flight query
  DepartCity: Pittsburgh
  ArriveCity: Houston
  ArriveState: Texas
  DepartDate: February twentieth

A minimal code sketch of this representation follows.
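To make the data representation concrete, here is a minimal sketch in Python (not code from the thesis; class and field names are illustrative):

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Form:
    """A repository of related pieces of information (one per sub-task)."""
    form_type: str                                        # e.g. "flight query"
    slots: Dict[str, str] = field(default_factory=dict)   # concept -> value

@dataclass
class Task:
    """A subset of a dialog with a specific goal, realized as a set of forms."""
    goal: str                                  # e.g. "make a travel reservation"
    forms: List[Form] = field(default_factory=list)

# The flight query form filled from the example dialog:
trip = Task(goal="make a travel reservation", forms=[
    Form("flight query", {"DepartCity": "Pittsburgh",
                          "ArriveCity": "Houston",
                          "ArriveState": "Texas",
                          "DepartDate": "February twentieth"}),
])
```

A concept maps to a slot, a sub-task to a form, and a task to a set of forms, mirroring the three components above.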
Form-based representation properties

- Sufficiency
  - The form is already used in form-based dialog systems:
    - Philips train timetable system (Aust et al., 1995)
    - CMU Communicator system (Rudnicky et al., 1999)
- Generality (domain-independent)
  - A broader interpretation of the form is provided
  - Verified by the analysis of six dissimilar domains
- Learnability
  - Components are observable directly from a dialog
  - (by human) annotation scheme reliability
  - (by machine) the accuracy of the domain information learned by the proposed approaches
Outline

- Introduction
- Structure of task-oriented conversations
  - Properties of a suitable dialog structure
  - Form-based dialog structure representation
  - Evaluation
    - Dialog structure analysis (generality)
    - Annotation experiment (human learnability)
- Machine learning approaches
- Conclusion
Dialog structure analysis

- Goal: to verify that the form-based representation can be applied to dissimilar domains
- Approach: analyze 6 task-oriented domains
  - Air travel planning (information-accessing task)
  - Bus schedule inquiry (information-accessing task)
  - Map reading (problem-solving task)
  - UAV flight simulation (command-and-control task)
  - Meeting (personnel resource management)
  - Tutoring (physics essay revising)
Map reading domain

[Figure: a route giver and a route follower each hold a map; the giver describes a route for the follower to draw.]
Map reading domain (problem-solving task)

- Task: draw a route on a map
  - Sub-task: draw a segment of a route
    - Concepts:
      StartLocation = {White_Mountain, Machete, ...}
      Direction = {down, left, ...}
      Distance = {a couple of centimeters, an inch, ...}
  - Sub-task: ground a landmark
    - Concepts:
      LandmarkName = {White_Mountain, Machete, ...}
      Location = {below the start, ...}
Dialog structure analysis (map reading domain)

GIVER     1: okay ... ehm ... right, you have the start?
FOLLOWER  2: yeah. (action: (implicit) define_a_landmark)
GIVER     3: right, below the start do you have ... er like a missionary camp?
FOLLOWER  4: yeah. (action: define_a_landmark)
GIVER     5: okay, well ... if you take it from the start just run ... horizontally.
FOLLOWER  6: uh-huh.
GIVER     7: eh to the left for about an inch.
FOLLOWER  8: right. (action: draw_a_segment)
GIVER     9: and then go down along the side of the missionary camp.
FOLLOWER 10: uh-huh.
GIVER    11: 'til you're about an inch ... above the bottom of the map.
FOLLOWER 12: right.
GIVER    13: then you need to go straight along for about 'til about ...

Form: grounding
  LandmarkName: missionary camp
  Location: below the start

Form: segment description
  StartLocation: start
  Direction: left
  Distance: an inch
  Path:
  EndLocation:
UAV flight simulation domain (command-and-control task)

- Task: take photos of the targets
  - Sub-task: take a photo of each target
    - Sub-subtask: control a plane
      - Concepts:
        Altitude = {2700, 3300, ...}
        Speed = {50 knots, 200 knots, ...}
        Destination = {H-area, SSTE, ...}
    - Sub-subtask: ground a landmark
      - Concepts:
        LandmarkName = {H-area, SSTE, ...}
        LandmarkType = {target, waypoint}
Meeting domain

- Task: manage resources for a new employee
  - Sub-task: get a computer
    - Concepts:
      Type = {desktop, laptop, ...}
      Brand = {IBM, Dell, ...}
  - Sub-task: get office space
  - Sub-task: create an action item
    - Concepts:
      Description = {have a space, ...}
      Person = {Hardware Expert, Building Expert, ...}
      StartDate = {today, ...}
      EndDate = {the fourteenth of december, ...}
Characteristics of form-based representation

- Focuses only on concrete information that is observable directly from in-domain conversations
- Describes a dialog with a simple model
- Pros:
  - Possible to learn with an unsupervised learning approach
- Cons:
  - Can't capture information that is not clearly expressed in a dialog (e.g., omitted concept values)
    - Nevertheless, 93% of dialog content can be accounted for
  - Can't model a complex dialog that has a dynamic structure (e.g., the tutoring domain)
    - But it is good enough for many real-world applications
Form-based representation properties (revisited)

- Sufficiency
  - The form is already used in form-based dialog systems
  - Can account for 93% of dialog content
- Generality (domain-independent)
  - A broader interpretation of the form representation is provided
  - Can represent 5 out of 6 disparate domains
- Learnability
  - Components are observable directly from a dialog
  - (by human) annotation scheme reliability
  - (by machine) the accuracy of the domain information learned by the proposed approaches
Annotation experiment

- Goal: to verify that the form-based representation can be understood and applied by other annotators
- Approach: conduct an annotation experiment with non-expert annotators
- Evaluation:
  - Similarity between annotations
  - Accuracy of annotations
Challenges in annotation comparison

- Different tagsets may be used, since annotators have to design their own tagsets

| Annotator 1 | Annotator 2 |
|---|---|
| <NoOfStop> | - |
| <DestinationCity> | <DestinationLocation><City> |
| <Date> | <DepartureDate> and <ArrivalDate> |

- Some differences are acceptable if they conform to the guideline
  - Different dialog structure designs can generate dialog systems with the same functionality
Cross-annotator correction

- Each annotator creates his or her own tagset and then annotates the dialogs
- Each annotator critiques and corrects the other annotator's work
- Compare the original annotation with the corrected one

[Figure: Annotator 1 annotates dialog A with tagset 1 to produce an original annotation, which Annotator 2 then corrects; the original and cross-annotator corrected annotations are compared. Symmetrically, Annotator 2 annotates dialog A with tagset 2 and Annotator 1 corrects it. The two original annotations are also compared directly.]
Annotation experiment

- 2 domains
  - Air travel planning domain (information-accessing task)
  - Map reading domain (problem-solving task)
- 4 subjects in each domain
  - People who are likely to use the form-based representation in the future
- Each subject has to
  - Design a tagset and annotate the structure of dialogs
  - Critique other subjects' annotations of the same set of dialogs
Evaluation metrics

- Annotation similarity
  - Acceptability: the degree to which an original annotation is acceptable to a corrector
- Annotation accuracy
  - Accuracy: the degree to which a subject's annotation is acceptable to an expert
Annotation results

Concept annotation:

| Metric | Air Travel | Map Reading |
|---|---|---|
| Acceptability | 0.96 | 0.95 |
| Accuracy | 0.97 | 0.89 |

Task/sub-task annotation:

| Metric | Air Travel | Map Reading |
|---|---|---|
| Acceptability | 0.81 | 0.84 |
| Accuracy | 0.90 | 0.65 |

- High acceptability and accuracy
  - Except task/sub-task accuracy in the map reading domain
- Concepts can be annotated more reliably than tasks and sub-tasks
  - They are smaller units
  - They have to be communicated clearly
Form-based representation properties (revisited)

- Sufficiency
  - The form is already used in form-based dialog systems
  - Can account for 93% of dialog content
- Generality (domain-independent)
  - A broader interpretation of the form representation is provided
  - Can represent 5 out of 6 disparate domains
- Learnability
  - Components are observable directly from a dialog
  - Can be applied reliably by other annotators in most cases
  - (by machine) the accuracy of the domain information learned by the proposed approaches
Outline

- Introduction
- Structure of task-oriented conversations
- Machine learning approaches
- Conclusion
Overview of learning approaches

- Divide into 2 sub-problems:
  1. Concept identification
     - What are the concepts?
     - What are their members?
  2. Form identification
     - What are the forms?
     - What are the slots (concepts) in each form?
- Use unsupervised learning approaches
  - An acquisition (not recognition) problem
Learning example

From the example dialog, the system should learn the following forms:

Form: flight query
  DepartCity: Pittsburgh
  ArriveCity: Houston
  ArriveState: Texas
  ArriveAirport: Intercontinental

Form: car query
  PickupLocation: Houston
  PickupTime:
  ReturnTime:

Form: hotel query
  City: Houston
  Area: Downtown
  HotelName:
Outline

- Introduction
- Structure of task-oriented conversations
- Machine learning approaches
  - Concept identification
  - Form identification
- Conclusion
Concept identification

- Goal: identify domain concepts and their members
  - City = {Pittsburgh, Boston, Austin, ...}
  - Month = {January, February, March, ...}
- Approach: a word clustering algorithm
  - Identify concept words and group similar ones into the same cluster
Word clustering algorithms

- Use word co-occurrence statistics
  - Mutual information (MI-based)
  - Kullback-Leibler distance (KL-based)
- Iterative algorithms need a stopping criterion
  - Use information that is available during the clustering process:
    - Mutual information (MI-based)
    - Distance between clusters (KL-based)
    - Number of clusters

A minimal sketch of the KL-based variant appears below.
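As a rough illustration of the KL-based variant (a sketch under assumptions, not the thesis implementation): greedily merge the pair of clusters whose smoothed context distributions are closest in symmetrised Kullback-Leibler distance, and stop when the smallest distance exceeds a threshold. The context definition, smoothing constant, and threshold are all assumptions.

```python
import math
from collections import Counter

def context_dist(word, corpus, vocab, alpha=0.1):
    """Smoothed distribution over the immediate left/right neighbors of `word`."""
    counts = Counter()
    for sent in corpus:
        for i, w in enumerate(sent):
            if w == word:
                if i > 0:
                    counts[sent[i - 1]] += 1
                if i < len(sent) - 1:
                    counts[sent[i + 1]] += 1
    total = sum(counts.values()) + alpha * len(vocab)
    return {v: (counts[v] + alpha) / total for v in vocab}

def sym_kl(p, q):
    """Symmetrised Kullback-Leibler distance between two distributions."""
    return sum(p[v] * math.log(p[v] / q[v]) + q[v] * math.log(q[v] / p[v])
               for v in p)

def kl_cluster(words, corpus, stop_distance=1.0):
    """Greedily merge the closest pair of clusters; stop automatically."""
    vocab = {w for sent in corpus for w in sent}
    clusters = {w: [w] for w in words}
    dists = {w: context_dist(w, corpus, vocab) for w in words}
    while len(clusters) > 1:
        (a, b), d = min((((x, y), sym_kl(dists[x], dists[y]))
                         for x in clusters for y in clusters if x < y),
                        key=lambda pair: pair[1])
        if d > stop_distance:              # automatic stopping criterion
            break
        clusters[a] += clusters.pop(b)     # merge cluster b into a
        dists[a] = {v: (dists[a][v] + dists[b][v]) / 2 for v in vocab}
        del dists[b]
    return list(clusters.values())
```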
Clustering evaluation

- Allow more than one cluster to represent a concept
  - To discover as many concept words as possible
  - However, a clustering result that doesn't contain split concepts is preferred
- Quality score (QS) = harmonic mean of:
  - Precision (purity)
  - Recall (completeness)
  - Singularity score (SS), where

    SS(concept_j) = 1 / (number of clusters labeled as concept_j)

A small worked computation of QS follows.
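A hedged sketch of the scoring, assuming each cluster is labeled with its majority reference concept and that the overall SS averages the per-concept scores (both assumptions):

```python
from collections import Counter

def quality_score(clusters, reference):
    """clusters: list of word lists; reference: word -> true concept label."""
    labels, correct = [], 0
    for cluster in clusters:
        concepts = Counter(reference[w] for w in cluster if w in reference)
        if not concepts:
            continue
        label, count = concepts.most_common(1)[0]
        labels.append(label)               # cluster labeled by majority concept
        correct += count
    precision = correct / sum(len(c) for c in clusters)    # purity
    recall = correct / len(reference)                      # completeness
    per_concept = Counter(labels)          # SS_j = 1 / #clusters labeled j
    ss = sum(1 / n for n in per_concept.values()) / len(per_concept)
    return 3 / (1 / precision + 1 / recall + 1 / ss)       # harmonic mean

# Two clusters both labeled "City" would halve that concept's SS.
```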
Concept clustering results

| Algorithm | Precision | Recall | SS | QS | Max QS |
|---|---|---|---|---|---|
| MI-based | 0.78 | 0.43 | 0.77 | 0.61 | 0.68 |
| KL-based | 0.86 | 0.60 | 0.70 | 0.70 | 0.71 |

- Domain concepts can be identified with acceptable accuracy
  - Example clusters:
    - {GATWICK, CINCINNATI, PHILADELPHIA, L.A., ATLANTA}
    - {HERTZ, BUDGET, THRIFTY}
  - Low recall for infrequent concepts
- An automatic stopping criterion yields close-to-optimal results
Outline

- Introduction
- Structure of task-oriented conversations
- Machine learning approaches
  - Concept identification
  - Form identification
- Conclusion
Form identification

- Goal: determine the different types of forms and their associated slots
- Approach:
  1. Segment a dialog into a sequence of sub-tasks (dialog segmentation)
  2. Group the sub-tasks associated with the same form type into a cluster (sub-task clustering)
  3. Identify the set of slots associated with each form type (slot extraction)
Step 1: dialog segmentation

- Goal: segment a dialog into a sequence of sub-tasks
  - Equivalent to identifying sub-task boundaries
- Approach:
  - TextTiling algorithm (Hearst, 1997)
    - Based on the lexical cohesion assumption (local context)
  - HMM-based segmentation algorithm
    - Based on recurring patterns (global context)
    - HMM states = topics (sub-tasks)
    - Transition probabilities = probabilities of topic shifts
    - Emission probabilities = a state-specific language model

A sketch of the TextTiling idea appears below.
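A hedged sketch of the lexical-cohesion idea behind TextTiling, not Hearst's exact algorithm: compare the word blocks on either side of each candidate boundary and place boundaries at sufficiently deep similarity valleys. Block size and depth threshold are assumptions.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity of two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def texttile(utterances, block=6, depth_threshold=0.1):
    """utterances: list of token lists; returns gap indices of boundaries,
    where gap i separates utterances[i-1] and utterances[i]."""
    sims = []
    for gap in range(1, len(utterances)):
        left = Counter(w for u in utterances[max(0, gap - block):gap] for w in u)
        right = Counter(w for u in utterances[gap:gap + block] for w in u)
        sims.append(cosine(left, right))                   # sims[gap - 1]
    boundaries = []
    for i in range(1, len(sims) - 1):
        # depth: how far this similarity valley sits below its neighbors
        depth = (sims[i - 1] - sims[i]) + (sims[i + 1] - sims[i])
        if depth > depth_threshold and sims[i] <= min(sims[i - 1], sims[i + 1]):
            boundaries.append(i + 1)                       # gap index
    return boundaries
```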
Modeling HMM states

- HMM states = topics (sub-tasks)
  - Induced by clustering reference topics (Tür et al., 2001)
    - Needs annotated data
  - Utterance-based HMM (Barzilay and Lee, 2004)
    - Some utterances are very short
  - Instead: induced by clustering predicted segments from TextTiling
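On the HMM side, a minimal sketch, assuming unigram state language models and a uniform topic-shift probability (both assumptions; the thesis induces its states from clustered TextTiling segments): Viterbi-decode one state per utterance and read sub-task boundaries off state changes.

```python
import math

def viterbi_segment(utterances, state_lms, switch_prob=0.1):
    """utterances: list of token lists; state_lms: list of word -> prob dicts
    (one unigram LM per induced topic). Returns sub-task boundary indices."""
    n = len(state_lms)                     # assumes n >= 2
    emit = lambda s, utt: sum(math.log(state_lms[s].get(w, 1e-6)) for w in utt)
    stay = math.log(1 - switch_prob)
    switch = math.log(switch_prob / (n - 1))
    scores = [[emit(s, utterances[0]) for s in range(n)]]
    back = []
    for utt in utterances[1:]:
        prev, row, ptr = scores[-1], [], []
        for s in range(n):
            best = max(range(n),
                       key=lambda r: prev[r] + (stay if r == s else switch))
            row.append(prev[best] + (stay if best == s else switch) + emit(s, utt))
            ptr.append(best)
        scores.append(row)
        back.append(ptr)
    state = max(range(n), key=lambda s: scores[-1][s])
    path = [state]
    for ptr in reversed(back):             # trace the best state sequence back
        state = ptr[state]
        path.append(state)
    path.reverse()
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]
```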
Modifications for fine-grained segments in spoken dialogs

- Average segment length is much shorter than in text
  - Air travel domain = 84 words
  - Map reading domain = 55 words
  - (WSJ = 428, Broadcast News = 996)
- Modifications include:
  - A data-driven stop word list
    - Reflects the characteristics of spoken dialogs
  - A distance weight
    - Higher weight for context closer to the candidate boundary

One way to realize the distance weight is sketched below.
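One plausible realization of the distance weight (the exact weighting function is an assumption), usable in place of the plain block counts in the TextTiling sketch above:

```python
from collections import Counter

def weighted_block(utterances, start, end, boundary, stop_words=frozenset()):
    """Bag of words for utterances[start:end] in which tokens closer to the
    candidate boundary contribute more, with stop words dropped."""
    counts = Counter()
    for i in range(start, end):
        weight = 1.0 / (1 + abs(i - boundary))    # decays with distance
        for w in utterances[i]:
            if w not in stop_words:
                counts[w] += weight
    return counts
```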
Dialog segmentation experiment

- Evaluation metrics
  - Pk (Beeferman et al., 1999)
    - A probabilistic error metric
    - Sensitive to the value of k
  - Concept-based F-measure (C. F-1)
    - F-measure (F-1) is the harmonic mean of precision and recall
    - Counts a near miss as a match if there is no concept in between
- Incorporate concept information into the word token representation
  - A concept label + its value -> [Airline]:northwest
  - A concept label alone -> [Airline]

A sketch of the Pk computation follows.
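For reference, a hedged sketch of Pk: slide a window of width k over the tokens and count how often the reference and hypothesis disagree about whether the window's two ends fall in the same segment. Setting k to half the mean reference segment length is the usual convention, assumed here.

```python
def pk(ref_bounds, hyp_bounds, n, k=None):
    """ref_bounds, hyp_bounds: sets of boundary positions over n tokens
    (a boundary at position b separates tokens b-1 and b)."""
    if k is None:
        # convention: half the average reference segment length
        k = max(1, round(n / (2 * (len(ref_bounds) + 1))))
    def same_segment(bounds, i, j):
        return not any(i < b <= j for b in bounds)
    disagreements = sum(
        same_segment(ref_bounds, i, i + k) != same_segment(hyp_bounds, i, i + k)
        for i in range(n - k))
    return disagreements / (n - k)

# pk({10, 25}, {12, 25}, n=40) penalizes the near miss at 10 vs. 12 only lightly.
```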
TextTiling results

| Algorithm | Air Travel Pk | Air Travel C. F-1 | Map Reading Pk | Map Reading C. F-1 |
|---|---|---|---|---|
| TextTiling (baseline) | 0.387 | 0.621 | 0.412 | 0.396 |
| TextTiling (augmented) | 0.371 | 0.712 | 0.384 | 0.464 |

- Augmented TextTiling is significantly better than the baseline
HMM-based segmentation results

| Algorithm | Air Travel Pk | Air Travel C. F-1 | Map Reading Pk | Map Reading C. F-1 |
|---|---|---|---|---|
| HMM-based (utterance) | 0.398 | 0.624 | 0.392 | 0.436 |
| HMM-based (segment) | 0.385 | 0.698 | 0.355 | 0.507 |
| HMM-based (segment + label) | 0.386 | 0.706 | 0.250 | 0.686 |
| TextTiling (augmented) | 0.371 | 0.712 | 0.384 | 0.464 |

- Inducing HMM states from predicted segments is better than inducing them from utterances
- An abstract concept representation yields better results
  - Especially in the map reading domain
- HMM-based segmentation is significantly better than TextTiling in the map reading domain
Segmentation error analysis

- The TextTiling algorithm performs better on consecutive sub-tasks of the same type
- The HMM-based algorithm performs better on very fine-grained segments (only 2-3 utterances long)
  - E.g., in the map reading domain
Step 2: sub-task clustering

- Approach
  - Bisecting K-means clustering algorithm
  - Incorporate concept information into the word token representation
- Evaluation metrics
  - Similar to concept clustering

A sketch of bisecting K-means is given below.
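A hedged sketch of bisecting K-means over bag-of-words segment vectors; cosine similarity, the 2-means inner loop, and always splitting the largest cluster are standard options assumed here (k must not exceed the number of segments).

```python
import math
import random
from collections import Counter

def cosine(a, b):
    """Cosine similarity of two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def mean_counter(vectors):
    """Centroid of a list of Counters."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return Counter({w: c / len(vectors) for w, c in total.items()})

def two_means(vectors, iters=10):
    """Split a list of Counters into two clusters with cosine 2-means."""
    centers = random.sample(vectors, 2)
    groups = [vectors, []]
    for _ in range(iters):
        new_groups = [[], []]
        for v in vectors:
            side = 0 if cosine(v, centers[0]) >= cosine(v, centers[1]) else 1
            new_groups[side].append(v)
        if not new_groups[0] or not new_groups[1]:
            break
        groups = new_groups
        centers = [mean_counter(g) for g in groups]
    return groups

def bisecting_kmeans(segments, k):
    """segments: list of bag-of-words Counters; returns k clusters."""
    clusters = [list(segments)]
    while len(clusters) < k:
        clusters.sort(key=len)
        biggest = clusters.pop()           # always split the largest cluster
        clusters.extend(two_means(biggest))
    return clusters
```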
Sub-task clustering results

| Concept word representation | Air Travel | Map Reading |
|---|---|---|
| concept label + value (oracle segments) | 0.738 | 0.791 |
| concept label + value | 0.577 | 0.675 |
| concept label | 0.601 | 0.823 |

- Inaccurate segment boundaries hurt clustering performance
  - But don't affect frequent sub-tasks much
  - Missing boundaries are more problematic than false alarms
- An abstract concept representation yields better results
  - More improvement in the map reading domain
  - Even better than using reference segments
- An appropriate feature representation matters more than accurate segment boundaries
Step 3: slot extraction

- Goal: identify the set of slots associated with each form type
- Approach: analyze the concepts contained in each cluster, as sketched below
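Slot extraction then reduces to counting concepts per cluster; a minimal sketch, assuming segments arrive as lists of concept labels and that rare concepts are filtered by a count threshold (an assumption):

```python
from collections import Counter

def extract_slots(cluster, min_count=2):
    """cluster: list of segments, each a list of concept labels.

    Returns the cluster's slots sorted by frequency."""
    counts = Counter(label for segment in cluster for label in segment)
    return [(label, n) for label, n in counts.most_common() if n >= min_count]

# Ex: extract_slots([["Airline", "DepartCity"], ["Airline"]])
#     -> [("Airline", 2)]   (DepartCity dropped by the min_count threshold)
```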
Slot extraction results

Concepts are sorted by frequency (counts in parentheses).

Form: flight query
  Airline (79), ArriveTimeMin (46), DepartTimeHour (40), DepartTimeMin (39),
  ArriveTimeHour (36), ArriveCity (27), FlightNumber (15), ArriveAirport (13),
  DepartCity (13), DepartTimePeriod (11)

Form: flight fare query
  Fare (257), ArriveCity, AirlineCompany

Form: car query
  CarRentalCompany, car_type, city, state

Form: hotel query
  Fare, City, HotelName, Area, ArriveDateMonth
Outline

- Introduction
- Structure of task-oriented conversations
- Machine learning approaches
  - Concept identification and clustering
  - Form identification
- Conclusion
Form-based dialog structure representation

Forms are a suitable domain-specific information representation according to these criteria:

- Sufficiency
  - Can account for 93% of dialog content
- Generality (domain-independent)
  - A broader interpretation of the form representation is provided
  - Can represent 5 out of 6 disparate domains
- Learnability
  - (human) Can be applied reliably by other annotators in most cases
  - (machine) Can be identified with acceptable accuracy using unsupervised machine learning approaches
Unsupervised learning approaches for inferring domain information

- Require some modifications in order to learn the structure of a spoken dialog
- Can identify the components of the form-based representation with acceptable accuracy
  - Concept accuracy: QS = 0.70
  - Sub-task boundary accuracy: F-1 = 0.71 (air travel), 0.69 (map reading)
  - Form type accuracy: QS = 0.60 (air travel), 0.82 (map reading)
- Can learn from inaccurate information if the number of errors is moderate
  - Propagated errors don't affect frequent components much
  - Dialog structure acquisition doesn't require high learning accuracy
Conclusion

- To represent a dialog for learning purposes, we based our representation on an observable structure
- This observable representation
  - Can be generalized to various types of task-oriented dialogs
  - Can be understood and applied by different annotators
  - Can be learned by an unsupervised learning approach
- The results of this investigation can be applied to
  - Acquiring domain knowledge for a new task
  - Exploring the structure of a dialog
  - Potentially reducing human effort when developing a new dialog system

Thank you
Questions & Comments
References (1)

- N. Asher. 1993. Reference to Abstract Objects in Discourse. Dordrecht, the Netherlands: Kluwer Academic Publishers.
- H. Aust, M. Oerder, F. Seide, and V. Steinbiss. 1995. The Philips automatic train timetable information system. Speech Communication, 17(3-4):249-262.
- S. Bangalore, G. D. Fabbrizio, and A. Stent. 2006. Learning the Structure of Task-Driven Human-Human Dialogs. In Proceedings of COLING/ACL 2006. Sydney, Australia.
- R. Barzilay and L. Lee. 2004. Catching the Drift: Probabilistic Content Models, with Applications to Generation and Summarization. In HLT-NAACL 2004: Proceedings of the Main Conference, pp. 113-120. Boston, MA.
- D. Beeferman, A. Berger, and J. Lafferty. 1999. Statistical Models for Text Segmentation. Machine Learning, 34(1-3):177-210.
- P. R. Cohen and C. R. Perrault. 1979. Elements of a plan-based theory of speech acts. Cognitive Science, 3:177-212.
- A. Ferrieux and M. D. Sadek. 1994. An Efficient Data-Driven Model for Cooperative Spoken Dialogue. In Proceedings of ICSLP 1994. Yokohama, Japan.
- B. J. Grosz and C. L. Sidner. 1986. Attention, intentions, and the structure of discourse. Computational Linguistics, 12(3):175-204.
References (2)

- H. Hardy, K. Baker, H. Bonneau-Maynard, L. Devillers, S. Rosset, and T. Strzalkowski. 2003. Semantic and Dialogic Annotation for Automated Multilingual Customer Service. In Proceedings of Eurospeech 2003. Geneva, Switzerland.
- M. A. Hearst. 1997. TextTiling: segmenting text into multi-paragraph subtopic passages. Computational Linguistics, 23(1):33-64.
- W. C. Mann and S. A. Thompson. 1988. Rhetorical Structure Theory: Toward a functional theory of text organization. Text, 8(3):243-281.
- L. Polanyi. 1996. The Linguistic Structure of Discourse. Technical Report CSLI-96-200. Stanford, CA: Center for the Study of Language and Information, Stanford University.
- A. I. Rudnicky, E. Thayer, P. Constantinides, C. Tchou, R. Shern, K. Lenzo, W. Xu, and A. Oh. 1999. Creating natural dialogs in the Carnegie Mellon Communicator system. In Proceedings of Eurospeech 1999. Budapest, Hungary.
- J. M. Sinclair and M. Coulthard. 1975. Towards an Analysis of Discourse: The English Used by Teachers and Pupils. Oxford University Press.
- G. Tür, A. Stolcke, D. Hakkani-Tür, and E. Shriberg. 2001. Integrating prosodic and lexical cues for automatic topic segmentation. Computational Linguistics, 27(1):31-57.
- N. Yankelovich. 1997. Using Natural Dialogs as the Basis for Speech Interface Design. In Susann Luperfoy (Ed.), Automated Spoken Dialog Systems. Cambridge, MA: MIT Press.