Learning the Structure of Task-Oriented Conversations from the Corpus of In-Domain Dialogs
Ph.D. Thesis Defense
Ananlada Chotimongkol
Carnegie Mellon University, 18th December 2007
Thesis Committee: Alexander Rudnicky (Chair), William Cohen, Carolyn Penstein Rosé, Gokhan Tur (SRI International)

Outline
- Introduction
- Structure of task-oriented conversations
- Machine learning approaches
- Conclusion

A spoken dialog system
User: "I would like to fly to Seattle tomorrow." -> Speech Recognizer -> Natural Language Understanding -> Dialog Manager -> Natural Language Generator -> Speech Synthesizer -> System: "When would you like to leave?"
The Dialog Manager draws on Domain Knowledge (tasks, steps, domain keywords).

Problems in acquiring domain knowledge
Specifying Domain Knowledge (tasks, steps, domain keywords) by hand:
- Requires domain expertise
- Subjective
- May miss some cases (Yankelovich, 1997)
- Time consuming (Bangalore et al., 2006)
An alternative resource: example dialogs.

A task-oriented dialog
- Observable structure
- Reflects domain information
- Observable -> learnable?

Example (air travel):
step 1: reserve a flight
  Client: I'D LIKE TO FLY TO HOUSTON TEXAS
  Agent:  AND DEPARTING PITTSBURGH ON WHAT DATE?
  Client: DEPARTING ON FEBRUARY TWENTIETH
  ...
step 2: reserve a car
  Agent:  DO YOU NEED A CAR?
  Client: YEAH
  Agent:  THE LEAST EXPENSIVE RATE I HAVE WOULD BE WITH THRIFTY RENTAL CAR FOR TWENTY THREE NINETY A DAY
  Client: OKAY
  Agent:  WOULD YOU LIKE ME TO BOOK THAT CAR FOR YOU?
  Client: YES
  ...
step 3: reserve a hotel
  Agent:  OKAY AND WOULD YOU NEED A HOTEL WHILE YOU'RE IN HOUSTON?
  Client: YES
  Agent:  AND WHERE AT IN HOUSTON?
  Client: /UM/ DOWNTOWN
  Agent:  OKAY DID YOU HAVE A HOTEL PREFERENCE?
  ...
Proposed solution
Learn the Domain Knowledge (tasks, steps, domain keywords) from example dialogs; a human revises the learned knowledge, which is then used to build the dialog system.

Learning system output
Input: air travel dialogs
Output (Domain Knowledge):
- task = create a travel itinerary
- steps = reserve a flight, reserve a hotel, reserve a car
- keywords = airline, city name, date

Thesis statement
Investigate how to infer domain-specific information required to build a task-oriented dialog system from a corpus of in-domain conversations through an unsupervised learning approach.

Thesis scope (1): what to learn
Domain-specific information in a task-oriented dialog:
- A list of tasks and their decompositions (travel reservation: flight, car, hotel)
- Domain keywords (airline, city name, date)

Thesis scope (2): resources
A corpus of in-domain conversations; recorded human-human conversations are already available.

Thesis scope (3): learning approach
Unsupervised learning, because:
- No training data is available for a new domain
- Annotating data is time consuming

Proposed approach
Two research problems:
1. Specify a suitable domain-specific information representation
2. Develop a learning approach that infers the domain information captured by this representation from human-human dialogs
Outline
- Introduction
- Structure of task-oriented conversations
  - Properties of a suitable dialog structure
  - Form-based dialog structure representation
  - Evaluation
- Machine learning approaches
- Conclusion

Properties of a desired dialog structure
- Sufficiency: captures all domain-specific information required to build a task-oriented dialog system
- Generality (domain-independence): able to describe task-oriented dialogs in dissimilar domains and of dissimilar types
- Learnability: can be identified by an unsupervised machine learning algorithm

Domain-specific information in task-oriented dialogs
- A list of tasks and their decompositions (e.g., travel reservation = flight + car + hotel): a compositional structure of a dialog based on the characteristics of a task
- Domain keywords (e.g., airline, city name, date): the actual content of a dialog

Existing discourse structures, evaluated against the three properties:
- Segmented Discourse Representation Theory (Asher, 1993): focuses on meaning, not on actual entities; generality and unsupervised learnability unclear
- Grosz and Sidner's theory (Grosz and Sidner, 1986): doesn't model domain keywords; unsupervised learnability unclear
- DAMSL extension (Hardy et al., 2003): doesn't model a compositional structure; unsupervised learnability unclear
- A plan-based model (Cohen and Perrault, 1979): unsupervised learnability unclear
None of these structures clearly satisfies all three properties.

Form-based dialog structure representation
- Based on the notion of a form (Ferrieux and Sadek, 1994), the data representation used in the form-based dialog system architecture
- Focuses only on concrete information, which can be observed directly from in-domain conversations

Form-based representation components
Consists of 3 components:
1. Task
2. Sub-task
3. Concept

Component 1: Task
A subset of a dialog that has a specific goal.
In the example dialog from the introduction, the entire conversation constitutes one task: make a travel reservation.

Component 2: Sub-task
A step in a task that contributes toward the goal; contains sufficient information to execute a domain action.
In the example dialog, the task decomposes into three sub-tasks: reserve a flight, reserve a car, reserve a hotel.
Component 3: Concept (domain keywords)
A piece of information required to perform an action.
In the example dialog: Houston, Texas, Pittsburgh, February twentieth.

Data representation
Each component is represented in terms of a form, a repository of related pieces of information necessary for performing an action:
- A sub-task (sufficient information to execute a domain action) corresponds to a form, e.g., Form: flight query
- A task (a subset of a dialog with a specific goal) corresponds to a set of forms, e.g., Form: flight query, Form: car query, Form: hotel query
- A concept (a piece of information required to perform an action) corresponds to a slot in a form

Example form filled from the dialog:
Form: flight query
  DepartCity: Pittsburgh
  ArriveCity: Houston
  ArriveState: Texas
  DepartDate: February twentieth
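To make the task/form/slot correspondence concrete, here is a minimal sketch of the form-based representation as data structures. The class and field names (Form, Task) are illustrative choices for this example, not part of the thesis; they simply mirror the mapping described above.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Form:
    """A sub-task's form: a repository of related pieces of information
    (slots) needed to execute one domain action, e.g., a flight query."""
    name: str                                             # form type
    slots: Dict[str, str] = field(default_factory=dict)  # concept -> value

@dataclass
class Task:
    """A task is a subset of a dialog with a specific goal, represented
    as the set of forms filled while pursuing that goal."""
    goal: str
    forms: List[Form] = field(default_factory=list)

# The worked example from the slides: one task, three forms, with the
# flight-query slots filled from the dialog.
itinerary = Task(
    goal="make a travel reservation",
    forms=[
        Form("flight query", {
            "DepartCity": "Pittsburgh",
            "ArriveCity": "Houston",
            "ArriveState": "Texas",
            "DepartDate": "February twentieth",
        }),
        Form("car query"),
        Form("hotel query"),
    ],
)
```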
Form-based representation properties
- Sufficiency: the form is already used in form-based dialog systems, e.g., the Philips train timetable system (Aust et al., 1995) and the CMU Communicator system (Rudnicky et al., 1999)
- Generality (domain-independence): a broader interpretation of the form is provided; verified through the analysis of six dissimilar domains
- Learnability: components are observable directly from a dialog; verified (by human) through annotation scheme reliability and (by machine) through the accuracy of the domain information learned by the proposed approaches

Outline
- Introduction
- Structure of task-oriented conversations
  - Properties of a suitable dialog structure
  - Form-based dialog structure representation
  - Evaluation
    - Dialog structure analysis (generality)
    - Annotation experiment (human learnability)
- Machine learning approaches
- Conclusion

Dialog structure analysis
Goal: verify that the form-based representation can be applied to dissimilar domains.
Approach: analyze 6 task-oriented domains:
- Air travel planning (information-accessing task)
- Bus schedule inquiry (information-accessing task)
- Map reading (problem-solving task)
- UAV flight simulation (command-and-control task)
- Meeting (personnel resource management)
- Tutoring (physics essay revising)

Map reading domain
[Figure: the route giver's map and the route follower's map]

Map reading domain (problem-solving task)
Task: draw a route on a map
- Sub-task: draw a segment of a route
  Concepts: StartLocation = {White_Mountain, Machete, …}; Direction = {down, left, …}; Distance = {a couple of centimeters, an inch, …}
- Sub-task: ground a landmark
  Concepts: LandmarkName = {White_Mountain, Machete, …}; Location = {below the start, …}

Dialog structure analysis (map reading domain)
GIVER     1: okay ... ehm ... right, you have the start?
FOLLOWER  2: yeah.   (action: (implicit) define_a_landmark)
GIVER     3: right, below the start do you have ... er like a missionary camp?
FOLLOWER  4: yeah.   (action: define_a_landmark)
GIVER     5: okay, well ... if you take it from the start just run ... horizontally.
FOLLOWER  6: uh-huh.
GIVER     7: eh to the left for about an inch.
FOLLOWER  8: right.  (action: draw_a_segment)
GIVER     9: and then go down along the side of the missionary camp.
FOLLOWER 10: uh-huh.
GIVER    11: 'til you're about an inch ... above the bottom of the map.
FOLLOWER 12: right.
GIVER    13: then you need to go straight along for about 'til about ...

Form: grounding
  LandmarkName: missionary camp
  Location: below the start

Form: segment description
  StartLocation: the start
  Direction: left
  Distance: an inch
  Path: ...
  EndLocation: ...

UAV flight simulation domain (command-and-control task)
Task: take photos of the targets
- Sub-task: take a photo of each target
  - Sub-subtask: control a plane
    Concepts: Altitude = {2700, 3300, …}; Speed = {50 knots, 200 knots, …}; Destination = {H-area, SSTE, …}
  - Sub-subtask: ground a landmark
    Concepts: LandmarkName = {H-area, SSTE, …}; LandmarkType = {target, waypoint}

Meeting domain (personnel resource management)
Task: manage resources for a new employee
- Sub-task: get a computer
  Concepts: Type = {desktop, laptop, …}; Brand = {IBM, Dell, …}
- Sub-task: get office space
- Sub-task: create an action item
  Concepts: Description = {have a space, …}; Person = {Hardware Expert, Building Expert, …}; StartDate = {today, …}; EndDate = {the fourteenth of december, …}

Characteristics of the form-based representation
- Focuses only on concrete information
  Pro: observable directly from in-domain conversations
  Con: can't capture information that is not clearly expressed in a dialog (e.g., omitted concept values); nevertheless, 93% of dialog content can be accounted for
- Describes a dialog with a simple model
  Pro: possible to learn with an unsupervised learning approach
  Con: can't model a complex dialog that has a dynamic structure (e.g., the tutoring domain), but it is good enough for many real-world applications

Form-based representation properties (revisited)
- Sufficiency: the form is already used in form-based dialog systems; can account for 93% of dialog content
- Generality (domain-independence): a broader interpretation of the form representation is provided; can represent 5 out of 6 disparate domains
- Learnability: components are observable directly from a dialog; verified (by human) through annotation scheme reliability and (by machine) through the accuracy of the domain information learned by the proposed approaches

Annotation experiment
Goal: verify that the form-based representation can be understood and applied by other annotators.
Approach: conduct an annotation experiment with non-expert annotators.
Evaluation: similarity between annotations; accuracy of annotations.

Challenges in annotation comparison
Different tagsets may be used, since annotators have to design their own tagsets:
  Annotator 1        | Annotator 2
  <NoOfStop>         | -
  <DestinationCity>  | <DestinationLocation><City>
  <Date>             | <DepartureDate> and <ArrivalDate>
Some differences are acceptable if they conform to the guideline: different dialog structure designs can generate dialog systems with the same functionalities.

Cross-annotator correction
- Each annotator creates his/her own tagset and then annotates the dialogs
- Each annotator critiques and corrects another annotator's work
- Compare the original annotation with the corrected one
For dialog A: Annotator 1 annotates it (original annotation 1) and Annotator 2 corrects it; symmetrically, Annotator 2 annotates it (original annotation 2) and Annotator 1 corrects it. Each original annotation is compared against its cross-annotator correction; the two original annotations can also be compared directly.

Annotation experiment setup
- 2 domains: air travel planning (information-accessing task) and map reading (problem-solving task)
- 4 subjects in each domain: people who are likely to use the form-based representation in the future
- Each subject designs a tagset and annotates the structure of the dialogs, then critiques the other subjects' annotations of the same set of dialogs

Evaluation metrics
- Annotation similarity: acceptability is the degree to which an original annotation is acceptable to a corrector
- Annotation accuracy: accuracy is the degree to which a subject's annotation is acceptable to an expert
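As a rough illustration of how such a metric can be computed, the sketch below scores acceptability as the fraction of originally annotated units that the corrector left unchanged. This is one plausible reading of the slide's definition, not the thesis's exact formula, and the function and variable names are invented for the example.

```python
def acceptability(original, corrected):
    """Fraction of originally annotated units (e.g., spans mapped to tags)
    that the corrector kept unchanged. This simplified reading treats any
    altered tag as a rejection of the original annotation unit."""
    if not original:
        return 0.0
    accepted = sum(1 for unit, tag in original.items()
                   if corrected.get(unit) == tag)
    return accepted / len(original)

# Toy example: the corrector changed one of three concept annotations.
original = {"span1": "DestinationCity", "span2": "Date", "span3": "NoOfStop"}
corrected = {"span1": "DestinationCity", "span2": "DepartureDate", "span3": "NoOfStop"}
print(acceptability(original, corrected))  # 0.666...
```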
Annotation results
Concept annotation       | Air Travel | Map Reading
  acceptability          | 0.96       | 0.95
  accuracy               | 0.97       | 0.89

Task/sub-task annotation | Air Travel | Map Reading
  acceptability          | 0.81       | 0.84
  accuracy               | 0.90       | 0.65

- High acceptability and accuracy, except task/sub-task accuracy in the map reading domain
- Concepts can be annotated more reliably than tasks and sub-tasks: they are smaller units, and the task/sub-task structure has to be communicated clearly

Form-based representation properties (revisited)
- Sufficiency: the form is already used in form-based dialog systems; can account for 93% of dialog content
- Generality (domain-independence): a broader interpretation of the form representation is provided; can represent 5 out of 6 disparate domains
- Learnability: components are observable directly from a dialog; (by human) can be applied reliably by other annotators in most cases; (by machine) verified through the accuracy of the domain information learned by the proposed approaches

Outline
- Introduction
- Structure of task-oriented conversations
- Machine learning approaches
- Conclusion

Overview of learning approaches
Divide into 2 sub-problems:
1. Concept identification: what are the concepts? what are their members?
2. Form identification: what are the forms? what are the slots (concepts) in each form?
Use unsupervised learning approaches; this is an acquisition (not recognition) problem.

Learning example
From the air travel dialog in the introduction, the system should learn:
Form: flight query
  DepartCity: Pittsburgh
  ArriveCity: Houston
  ArriveState: Texas
  ArriveAirport: Intercontinental
Form: car query
  Pick up location: Houston
  Pickup Time:
  Return Time:
Form: hotel query
  City: Houston
  Area: Downtown
  HotelName:
Outline
- Introduction
- Structure of task-oriented conversations
- Machine learning approaches
  - Concept identification
  - Form identification
- Conclusion

Concept identification
Goal: identify domain concepts and their members, e.g.,
  City = {Pittsburgh, Boston, Austin, …}
  Month = {January, February, March, …}
Approach: a word clustering algorithm that identifies concept words and groups similar ones into the same cluster.

Word clustering algorithms
- Use word co-occurrence statistics: mutual information (MI-based) or Kullback-Leibler distance (KL-based)
- These iterative algorithms need a stopping criterion; use information that is available during the clustering process: mutual information (MI-based), the distance between clusters (KL-based), or the number of clusters

Clustering evaluation
- Allow more than one cluster to represent a concept, to discover as many concept words as possible
- However, a clustering result that doesn't contain split concepts is preferred
- Quality score (QS) = harmonic mean of precision (purity), recall (completeness), and singularity score (SS), where SS of concept_j = 1 / (number of clusters labeled as concept_j)

Concept clustering results
Algorithm | Precision | Recall | SS   | QS   | MaxQS
MI-based  | 0.78      | 0.43   | 0.77 | 0.61 | 0.68
KL-based  | 0.86      | 0.60   | 0.70 | 0.70 | 0.71

- Domain concepts can be identified with acceptable accuracy
- Example clusters: {GATWICK, CINCINNATI, PHILADELPHIA, L.A., ATLANTA}, {HERTZ, BUDGET, THRIFTY}
- Low recall for infrequent concepts
- An automatic stopping criterion yields close-to-optimal results
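The quality score can be made concrete with a short sketch. It assumes QS is the plain three-way harmonic mean of precision, recall, and singularity score, with SS averaged over concepts; that reading does reproduce the QS column of the results table above, but the helper names are illustrative.

```python
from statistics import harmonic_mean

def singularity_score(cluster_concept_labels):
    """Average over concepts of 1/k, where k is the number of clusters
    labeled as that concept; a concept split across clusters is penalized."""
    counts = {}
    for label in cluster_concept_labels:
        counts[label] = counts.get(label, 0) + 1
    return sum(1.0 / k for k in counts.values()) / len(counts)

def quality_score(precision, recall, ss):
    """QS = harmonic mean of precision (purity), recall (completeness),
    and singularity score."""
    return harmonic_mean([precision, recall, ss])

# The City concept split across two clusters: SS = (1/2 + 1) / 2 = 0.75
print(singularity_score(["City", "City", "Month"]))

# Plugging in the KL-based row of the results table (P=0.86, R=0.60,
# SS=0.70) reproduces its reported QS of 0.70.
print(round(quality_score(0.86, 0.60, 0.70), 2))  # 0.7
```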
Outline
- Introduction
- Structure of task-oriented conversations
- Machine learning approaches
  - Concept identification
  - Form identification
- Conclusion

Form identification
Goal: determine the different types of forms and their associated slots.
Approach:
1. Segment a dialog into a sequence of sub-tasks (dialog segmentation)
2. Group the sub-tasks associated with the same form type into clusters (sub-task clustering)
3. Identify the set of slots associated with each form type (slot extraction)

Step 1: dialog segmentation
Goal: segment a dialog into a sequence of sub-tasks; equivalent to identifying sub-task boundaries.
Approach:
- TextTiling algorithm (Hearst, 1997): based on the lexical cohesion assumption (local context)
- HMM-based segmentation algorithm: based on recurring patterns (global context); HMM states = topics (sub-tasks), transition probabilities = probabilities of topic shifts, emission probabilities = a state-specific language model

Modeling HMM states
HMM states = topics (sub-tasks), which can be induced by:
- Clustering reference topics (Tür et al., 2001): needs annotated data
- Clustering utterances (utterance-based HMM; Barzilay and Lee, 2004): but some utterances are very short
- Clustering predicted segments from TextTiling (proposed)

Modifications for fine-grained segments in spoken dialogs
Average segment length: air travel domain = 84 words, map reading domain = 55 words (cf. WSJ = 428, Broadcast News = 996).
Modifications include:
- A data-driven stop word list, reflecting the characteristics of spoken dialogs
- A distance weight: higher weight for context closer to the candidate boundary

Dialog segmentation experiment
Evaluation metrics:
- Pk (Beeferman et al., 1999): a probabilistic error metric; sensitive to the value of k
- Concept-based F-measure (C. F-1): F-measure (F-1) is the harmonic mean of precision and recall; a near miss counts as a match if there is no concept in between
Concept information is incorporated in the word token representation:
- A concept label plus its value -> [Airline]:northwest
- A concept label alone -> [Airline]

TextTiling results
Algorithm              | Air Travel Pk | Air Travel C. F-1 | Map Reading Pk | Map Reading C. F-1
TextTiling (baseline)  | 0.387         | 0.621             | 0.412          | 0.396
TextTiling (augmented) | 0.371         | 0.712             | 0.384          | 0.464

Augmented TextTiling is significantly better than the baseline.
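Since Pk drives the comparisons above and below, here is a minimal sketch of it following Beeferman et al. (1999): slide a window of width k over the text and count the positions where the reference and hypothesis disagree about whether the window's endpoints lie in the same segment. The boundary encoding and function names are illustrative choices.

```python
def p_k(reference, hypothesis, k=None):
    """Pk segmentation error (Beeferman et al., 1999).

    reference, hypothesis: lists of 0/1 flags per token position, where 1
    marks a segment boundary immediately after that position. k defaults
    to half the mean reference segment length, as is conventional.
    """
    n = len(reference)
    if k is None:
        num_segments = sum(reference) + 1
        k = max(1, round(n / num_segments / 2))

    def same_segment(flags, i, j):
        # True if no boundary falls between positions i and j.
        return sum(flags[i:j]) == 0

    errors = sum(
        same_segment(reference, i, i + k) != same_segment(hypothesis, i, i + k)
        for i in range(n - k)
    )
    return errors / (n - k)

# Toy example: the hypothesis misses one boundary and adds a spurious one.
ref = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
hyp = [0, 0, 1, 0, 1, 0, 0, 0, 0, 0]
print(p_k(ref, hyp))  # 0.5
```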
HMM-based segmentation results
Algorithm                   | Air Travel Pk | Air Travel C. F-1 | Map Reading Pk | Map Reading C. F-1
HMM-based (utterance)       | 0.398         | 0.624             | 0.392          | 0.436
HMM-based (segment)         | 0.385         | 0.698             | 0.355          | 0.507
HMM-based (segment + label) | 0.386         | 0.706             | 0.250          | 0.686
TextTiling (augmented)      | 0.371         | 0.712             | 0.384          | 0.464

- Inducing HMM states from predicted segments is better than inducing them from utterances
- An abstract concept representation yields better results, especially in the map reading domain
- HMM-based segmentation is significantly better than TextTiling in the map reading domain

Segmentation error analysis
- The TextTiling algorithm performs better on consecutive sub-tasks of the same type
- The HMM-based algorithm performs better on very fine-grained segments (only 2-3 utterances long), as in the map reading domain

Step 2: sub-task clustering
Approach:
- Bisecting K-means clustering algorithm
- Incorporate concept information in the word token representation
Evaluation metrics: similar to concept clustering.

Sub-task clustering results (QS)
Concept word representation            | Air Travel | Map Reading
concept label + value (oracle segment) | 0.738      | 0.791
concept label + value                  | 0.577      | 0.675
concept label                          | 0.601      | 0.823

- Inaccurate segment boundaries affect clustering performance, but don't affect frequent sub-tasks much; missing boundaries are more problematic than false alarms
- An abstract concept representation yields better results, with more improvement in the map reading domain, where it even beats using reference segments: an appropriate feature representation matters more than accurate segment boundaries

Step 3: slot extraction
Goal: identify the set of slots associated with each form type.
Approach: analyze the concepts contained in each cluster.

Slot extraction results (concepts sorted by frequency)
Form: flight query
  Airline (79), ArriveTimeMin (46), DepartTimeHour (40), DepartTimeMin (39), ArriveTimeHour (36), ArriveCity (27), FlightNumber (15), ArriveAirport (13), DepartCity (13), DepartTimePeriod (11)
Form: flight fare query
  Fare (257), and other concepts including ArriveCity and AirlineCompany
Form: car query
  Concepts include CarRentalCompany, car_type, city, state
Form: hotel query
  Concepts include City, HotelName, Area, ArriveDateMonth
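A minimal sketch of Step 2 under stated assumptions: each predicted segment is a bag of tokens in which concept mentions have already been abstracted to labels such as [City] (the "concept label" representation from the slides), and clusters are grown by repeatedly bisecting the largest cluster with 2-means. It uses scikit-learn and TF-IDF features for illustration; the thesis's exact features and distance measure may differ.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical sub-task segments with concept mentions abstracted.
segments = [
    "i'd like to fly to [City] on [Date]",
    "what flights leave [City] on [Date]",
    "do you need a car in [City] with [CarRentalCompany]",
    "i need a rental from [CarRentalCompany]",
    "any hotel in the [Area] of [City]",
]

def bisecting_kmeans(texts, n_clusters, seed=0):
    # token_pattern keeps bracketed concept labels as single tokens.
    X = TfidfVectorizer(token_pattern=r"\S+").fit_transform(texts)
    clusters = [list(range(len(texts)))]   # start with one cluster
    while len(clusters) < n_clusters:
        big = max(clusters, key=len)       # always split the largest cluster
        clusters.remove(big)
        labels = KMeans(n_clusters=2, n_init=10,
                        random_state=seed).fit_predict(X[big])
        clusters.append([i for i, l in zip(big, labels) if l == 0])
        clusters.append([i for i, l in zip(big, labels) if l == 1])
    return clusters

for cluster in bisecting_kmeans(segments, n_clusters=3):
    print([segments[i] for i in cluster])  # ideally one form type per cluster
```

Step 3 then reduces to counting the concept labels within each resulting cluster and ranking them by frequency, which yields per-form slot lists like the ones above.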
Outline
- Introduction
- Structure of task-oriented conversations
- Machine learning approaches
  - Concept identification and clustering
  - Form identification
- Conclusion

Form-based dialog structure representation (conclusions)
Forms are a suitable domain-specific information representation according to these criteria:
- Sufficiency: can account for 93% of dialog content
- Generality (domain-independence): a broader interpretation of the form representation is provided; can represent 5 out of 6 disparate domains
- Learnability: (human) can be applied reliably by other annotators in most cases; (machine) can be identified with acceptable accuracy using unsupervised machine learning approaches

Unsupervised learning approaches for inferring domain information (conclusions)
- Require some modifications in order to learn the structure of a spoken dialog
- Can identify the components of the form-based representation with acceptable accuracy:
  Concept accuracy: QS = 0.70
  Sub-task boundary accuracy: F-1 = 0.71 (air travel), 0.69 (map reading)
  Form type accuracy: QS = 0.60 (air travel), 0.82 (map reading)
- Can learn from inaccurate information if the number of errors is moderate; propagated errors don't affect frequent components much; dialog structure acquisition doesn't require high learning accuracy

Conclusion
To represent a dialog for learning purposes, we based our representation on an observable structure. This observable representation:
- Can be generalized to various types of task-oriented dialogs
- Can be understood and applied by different annotators
- Can be learned by unsupervised learning approaches
The results of this investigation can be applied to acquiring domain knowledge in a new task and to exploring the structure of a dialog, and could potentially reduce the human effort needed to develop a new dialog system.

Thank you. Questions and comments.

References
N. Asher. 1993. Reference to Abstract Objects in Discourse. Dordrecht, the Netherlands: Kluwer Academic Publishers.
H. Aust, M. Oerder, F. Seide, and V. Steinbiss. 1995. The Philips automatic train timetable information system. Speech Communication, 17(3-4):249-262.
S. Bangalore, G. D. Fabbrizio, and A. Stent. 2006. Learning the Structure of Task-Driven Human-Human Dialogs. In Proceedings of COLING/ACL 2006. Sydney, Australia.
R. Barzilay and L. Lee. 2004. Catching the Drift: Probabilistic Content Models, with Applications to Generation and Summarization. In HLT-NAACL 2004: Proceedings of the Main Conference, pp. 113-120. Boston, MA.
D. Beeferman, A. Berger, and J. Lafferty. 1999. Statistical Models for Text Segmentation. Machine Learning, 34(1-3):177-210.
P. R. Cohen and C. R. Perrault. 1979. Elements of a plan-based theory of speech acts. Cognitive Science, 3:177-212.
A. Ferrieux and M. D. Sadek. 1994. An Efficient Data-Driven Model for Cooperative Spoken Dialogue. In Proceedings of ICSLP 1994. Yokohama, Japan.
B. J. Grosz and C. L. Sidner. 1986. Attention, intentions, and the structure of discourse. Computational Linguistics, 12(3):175-204.
H. Hardy, K. Baker, H. Bonneau-Maynard, L. Devillers, S. Rosset, and T. Strzalkowski. 2003. Semantic and Dialogic Annotation for Automated Multilingual Customer Service. In Proceedings of Eurospeech 2003. Geneva, Switzerland.
M. A. Hearst. 1997. TextTiling: segmenting text into multi-paragraph subtopic passages. Computational Linguistics, 23(1):33-64.
W. C. Mann and S. A. Thompson. 1988. Rhetorical Structure Theory: Toward a functional theory of text organization. Text, 8(3):243-281.
L. Polanyi. 1996. The Linguistic Structure of Discourse. Technical Report CSLI-96-200. Stanford, CA: Center for the Study of Language and Information, Stanford University.
A. I. Rudnicky, E. Thayer, P. Constantinides, C. Tchou, R. Shern, K. Lenzo, W. Xu, and A. Oh. 1999. Creating natural dialogs in the Carnegie Mellon Communicator system. In Proceedings of Eurospeech 1999. Budapest, Hungary.
J. M. Sinclair and M. Coulthard. 1975. Towards an Analysis of Discourse: The English Used by Teachers and Pupils. Oxford University Press.
G. Tür, A. Stolcke, D. Hakkani-Tür, and E. Shriberg. 2001. Integrating prosodic and lexical cues for automatic topic segmentation. Computational Linguistics, 27(1):31-57.
N. Yankelovich. 1997. Using Natural Dialogs as the Basis for Speech Interface Design. In Susann Luperfoy (Ed.), Automated Spoken Dialog Systems. Cambridge, MA: MIT Press.