Linguistic issues in building dialogue systems
Radhika Mamidi
IIIT-H

Outline
• Linguistic issues in NLP including Pragmatics
• Computational Pragmatics
• Pragmatics
• Discourse Analysis
• Conversation Analysis
• Spoken Dialogue Systems
• Types, models, domains
• Comparing human-human vs human-system dialogues
• Speech Act interpretation

Why is Natural Language Processing so difficult?
• Human language is complex and ambiguous.
• We use language creatively.
• We don’t mean what we say!
Language understanding needs contextual and general knowledge apart from linguistic knowledge.
To know what we mean, shared knowledge is necessary.
Representing all this knowledge computationally is THE challenge.

Let’s analyze this spoken sentence:
“I made her duck”
How many meanings/interpretations?

Human language is complex and ambiguous
• When shot at, the dove dove into the bushes.
• The insurance was invalid for the invalid.
• They were too close to the door to close it.
• The buck does funny things when the does are present.
• There was a row among the oarsmen about how to row.
• Upon seeing the tear in the painting I shed a tear.

Language understanding: Parsing problem!
• Gene Autry is better after being kicked by a horse.
• The women included their husbands and their children in their potluck suppers.
• Two cars were reported stolen by the Groveton police yesterday.
(Steven Pinker. 1994. The language instinct. Morrow. 102.)

We use language creatively…
Example recommendations:
• A man like him is hard to find.
• He's an unbelievable worker.
• You would indeed be fortunate to get this person to work for you.
• There is nothing you can teach a man like him.
• I can assure you that no person would be better for the job.

What we say and what we mean
A man like him is hard to find.
[For a chronically absent employee]
He's an unbelievable worker.
[For a dishonest employee]
You would indeed be fortunate to get this person to work for you.
[For a lazy employee]
There is nothing you can teach a man like him.
[For a stupid employee]

Cooperative model: various types of knowledge
E.g.: The building blocks… (Greene, 1986)

Pragmatics
• Study of how utterances have meanings in situations. (Leech, 1983)
• Study of how more gets communicated than is said. (Yule, 1996)
• How people comprehend and produce a communicative act or speech act in a concrete speech situation.
• It distinguishes two intents or meanings in each utterance or communicative act of verbal communication (Sperber and Wilson, 1995):
    Informative intent = the sentence meaning
    Communicative intent = speaker meaning

Pragmatic competence
• The ability to comprehend and produce a communicative act.
• Includes one's knowledge of the social distance and social status between the speakers involved, cultural knowledge such as politeness, and linguistic knowledge, both explicit and implicit.

Topics in Pragmatics
Pragmatics deals with relations between linguistic aspects and aspects of context.
• Conversational implicature: A: Coffee? B: It will keep me awake.
• Presupposition: “I bought this book in Italy last summer”
• Speech acts: “Why don’t you call Mary?”
• Deixis: “I’d like you to leave that over there and come here now”

Discourse Analysis
• Anaphora resolution
    John and Mary bought new cars. They are good friends.
    John and Mary bought new cars. They are 2008 models.
• Rhetorical relations
    John fell. Jack pushed.
    John went to work. He works at IBM.
    John went to work. He took a taxi.
• Ellipsis
    Mary bought a new car. So did Susan.
    Mary bought a new dress. So did Susan.

Conversation Analysis
• Turn Constructional Component
• Turn Allocational Component
• Sequence Organization
• Adjacency pairs: greeting-greeting, question-answer pairs
• Pre-sequences
• Preference Organisation: agreement and acceptance are promoted over their alternatives
• Repair: who initiates repair (self or other) and who resolves the problem (self or other)
Sacks, H., Schegloff, E. A., & Jefferson, G.
(1974)

Computational Pragmatics
“Computational pragmatics studies, from an explicitly computational point of view, how relations between linguistic phenomena and their context of use govern speakers’ abilities to interpret and generate utterances in conversation”
How to compute these relations in terms of explicit representations…
• given a linguistic expression, how to compute the relevant contextual properties
• given a particular context, how to compute the relevant linguistic expression
(Bunt & Black, 2000)

Application of computational pragmatics
• Work on computational pragmatics often takes place within research on dialogue systems, that is, systems that are able to interact with human users in natural language.
• Helps us make decisions on how to deal computationally with all phenomena related to language use.

What is a dialogue system?
An artificial agent, such as a robot or a computer system, that can interact with human beings.
• Helps us understand the nature of dialogue and test theories
• Helps us understand the collaborative nature of interaction
• Helps us access information and services more efficiently

Uses of dialogue systems
• Phone-based applications: timetable information or flight booking
• Personal assistant: understand user needs and tasks
• Intelligent tutoring: student engagement
• Embodied conversational agents: engagement via realistic and affective physical and facial gestures
• Intelligent environments (home or car): understanding user situation and activity

Architecture (Mamidi and Khan, 2005)

Available intelligent dialogue systems
• Interactive Voice Systems [e.g. railway enquiry system asking the user to press certain numbers for accomplishing the task]
• Question-Answering Systems [e.g. START, a web-based QA system answering user's questions on movies, places, people etc.]
• Natural Language Interfaces [e.g.
tutoring systems, trip planning systems]
• Task-oriented
• Restricted domain
• Easier to implement
Ultimate goal: to have dialogue systems that talk like human beings and display intelligence in understanding the complex cognitive structure of language.

Dialogue domains
• Travel information (SUNDIAL, ATIS)
• Transport (TRAINS)
• Business appointments (VERBMOBIL)
• Car navigation
• Access to on-line information (SUN Speech Acts)

Dialogue models
Dialogues can be:
• Non-machine-mediated: ordinary everyday human dialogue analysed by computational means (dialogue data is recorded, transcribed and analysed to build automated systems).
• Machine-mediated: the computer offers assistance to the participants (VERBMOBIL).
• Simulated: both participants are human, but one pretends to be a computer system.
• Non-simulated: genuine interaction between human and computer (fully-fledged dialogue systems).

Types of dialogue systems
• Single-initiative system: guides the user through a series of scripted prompts. E.g. telephone enquiry
• Mixed-initiative system: gives cooperative and flexible responses to user queries. E.g. restaurant database

Sample dialogue: CSLI Dialogue System for in-car devices
U: What music do you have?
S: I have rock, soul … // aggregation
U: What rock songs?
S: Just a second: you have to turn left at the next intersection // interruption
U: OK.
S: I have songs by U2, Cake, Green Day, Jet, … // return to previous thread
U: Play something by them. // pronominal reference
S: OK. Playing xxx. < plays random song by Jet >
U: Play American Idiot instead.
S: Do you mean the song or the album? // disambiguation
U: The album please. // fragment
U: I need somewhere for lunch.
S: There is a highly rated vegetarian restaurant near here. // knows preferences
U: What other cuisines?
S: There are Italian, Japanese, Malaysian, Nepali, and Thai restaurants close by.

Sample…
A: And what time would you like to leave Pittsburgh?
(relevant next contribution)
C: Uh hmm, I don’t think there’s many options for non-stop
A: Right. There’s three non-stops today.

Comparing human-human vs computer-human dialogues
Human-human dialogues
• Hums, grunts, pauses, false starts, hesitations
• Barge-in conversations
• Elliptical constructions
• Context is important
Computer-human dialogues
• Rigid; turn-by-turn pattern
• User speaks a word or two

Intelligent agent components
• perception - the agent must be able to perceive the world around it
• beliefs - the agent must have a representation of the present state of the world
• desire/wants - the agent should have positive or negative responses to various states of the world, creating a way to compare the desirability of states
• planning/reasoning - the agent must be able to reason about ways to attain other states
• commitment - the agent must be able to decide to act to get to a different state
• intentions - the agent must be able to maintain the course of action decided on
• acting - the agent must be able to act and thus change its state
(Allen, 1995)

Illocutionary speech acts
Searle (1975):
• Assertives
• Directives
• Commissives
• Expressives
• Declarations

Challenges
• Speech recognition errors
• Parsing language in practical dialogue
• Need to capture what was said
• Spoken language is not sentence based
• A single utterance realises a sequence of speech acts
• Intention recognition
• Mixed initiative
• Integrate dialogue and task performance
• Context-dependent interpretation
• Dialogue strategies (turn-taking mechanisms)

If computers were to speak like us…
Recognise intention of speaker
A1: Lend me your umbrella. It is cloudy. [Request]
A2: Don't water the plants now. It is cloudy. [Warning]
A3: It will rain today. It is cloudy. [Assertion]
A4: I hope the pictures will come out well. It is cloudy. [Doubt]
Make proper inference
B1: Did you look at the sentence I sent you to translate?
C1: Yeah. It was such an easy sentence!
B2: Was it easy?
C2: No, I meant it was tough.
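
The A1 to A4 examples above show one sentence, "It is cloudy", carrying four different intentions depending on what precedes it. A minimal sketch of that idea follows. It assumes a hypothetical rule table keyed on surface patterns of the preceding utterance; the patterns, the function name `interpret_followup`, and the fallback label "Unknown" are illustrative assumptions and do not come from any system described in these slides.

```python
import re

# Hypothetical surface patterns on the PRECEDING utterance, paired with the
# label the slide gives to each two-sentence turn.
# Order matters: the first pattern that matches wins.
CONTEXT_RULES = [
    (re.compile(r"^don['’]?t\b", re.IGNORECASE), "Warning"),
    (re.compile(r"^i hope\b", re.IGNORECASE), "Doubt"),
    (re.compile(r"^(lend|give|pass)\b", re.IGNORECASE), "Request"),
    (re.compile(r"\bwill\b", re.IGNORECASE), "Assertion"),
]


def interpret_followup(previous: str, followup: str) -> str:
    """Label the intention behind `followup` from the previous utterance alone.

    `followup` is deliberately not inspected by the rules: the point is that
    identical words receive different labels in different contexts.
    """
    cleaned = previous.strip()
    for pattern, label in CONTEXT_RULES:
        if pattern.search(cleaned):
            return label
    return "Unknown"


if __name__ == "__main__":
    contexts = [
        "Lend me your umbrella.",
        "Don't water the plants now.",
        "It will rain today.",
        "I hope the pictures will come out well.",
    ]
    for previous in contexts:
        print(previous, "->", interpret_followup(previous, "It is cloudy."))
```

A rule table like this only covers the four slide examples. It is meant to make the context dependence concrete, not to suggest that intention recognition reduces to pattern matching.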
And…
• Ellipsis: retaining the logical form of the previous sentence and reconstructing the full content
• Turn management: determining when the turn is over and who talks next
• Grounding: acknowledgement, repetition
• Clarification: question to resolve some lack of understanding
• Anaphora resolution

Speech act interpretation
• BDI model
• Cue-based model

Belief Desire Intention (BDI) model
Bunt and Black (2000) define this line of inquiry as follows: to apply the principles of rational agenthood to the modeling of a (computer-based) dialogue participant, where a rational communicative agent is endowed not only with certain private knowledge and the logic of belief, but is considered to also assume a great deal of common knowledge/beliefs with an interlocutor, and to be able to update beliefs about the interlocutor’s intentions and beliefs as a dialogue progresses.

Belief Desire Intention algorithm
• Extremely powerful approach to dialogue act comprehension/speech act interpretation.
• Uses rich knowledge structures and powerful planning techniques.
• Addresses even subtle indirect uses of dialogue acts.
• Incorporates knowledge about speaker and hearer intentions, actions, knowledge, and belief that is essential for any complete model of dialogue.

Drawback
• It requires that each utterance have a single literal meaning, which is operated on by plan inference rules to produce a final non-literal interpretation.
• Much recent work has argued against this literal-first, non-literal-second model of interpretation.

Alternative: Cue model
The listener uses different cues in the input to help decide how to build an interpretation.
The surface input to the interpretive algorithm provides clues to structure-building, rather than providing a literal meaning which must be modified by purely inferential processes.
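
The cue-based idea described above can be sketched as a small weighted vote over surface cues. This is a toy illustration under stated assumptions: the cue inventory, the weights, the labels "Question" and "Answer", and the function name `detect_speech_act` are all invented for this example, and prosodic cues are left out because the input here is plain text.

```python
from collections import defaultdict

# Each cue is (cue_type, test, speech_act, weight).
# `test` receives the normalized utterance and the previous speech act.
# The cues and weights are hypothetical, chosen only to illustrate the idea.
CUES = [
    ("lexical",
     lambda text, prev: "please" in text,
     "Directive", 1.0),
    ("collocational",
     lambda text, prev: text.startswith(("why don't you", "can you", "could you")),
     "Directive", 2.0),
    ("syntactic",
     lambda text, prev: text.endswith("?"),
     "Question", 1.0),
    ("conversational-structure",
     lambda text, prev: prev == "Question",
     "Answer", 1.5),
]


def detect_speech_act(utterance, prev_act=None):
    """Return (best_label, scores) by summing the weights of all firing cues."""
    # Normalize case, whitespace, and curly apostrophes before testing cues.
    text = utterance.strip().lower().replace("’", "'")
    scores = defaultdict(float)
    for _cue_type, test, act, weight in CUES:
        if test(text, prev_act):
            scores[act] += weight
    if not scores:
        # Assumed fallback when no cue fires.
        return "Assertive", {}
    best = max(scores, key=scores.get)
    return best, dict(scores)


if __name__ == "__main__":
    print(detect_speech_act("Why don’t you call Mary?"))
    print(detect_speech_act("Coffee?"))
    print(detect_speech_act("It will keep me awake.", prev_act="Question"))
```

Note that "Why don’t you call Mary?" fires both the syntactic question cue and the collocational cue. The heavier collocational weight lets the indirect reading win without first computing a literal meaning, which is the contrast with the literal-first approach.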
What characterizes a cue-based model is the use of different sources of knowledge (cues) for detecting a speech act, such as lexical, collocational, syntactic, prosodic, or conversational-structure cues.

Conclusion
• Pragmatics is the base of Computational Pragmatics.
• Dialogue allows us to explore novel challenges in language technologies.
• Understanding human-human dialogue helps in building human-system dialogue.
• The goal is to build robust dialogue systems for mixed-initiative, multi-domain, and multiparty interactions.

References
Allen, James. 1995. Natural Language Understanding. Menlo Park, CA: Benjamin Cummings.
Allen, James, Donna Byron, Myroslava Dzikovska, George Ferguson, Lucian Galescu, and Amanda Stent. 2001. Towards Conversational Human-Computer Interaction. AI Magazine.
Allen, James, Donna Byron, Myroslava Dzikovska, George Ferguson, Lucian Galescu, and Amanda Stent. 1998. Natural Language Engineering. Cambridge University Press.
Bunt, Harry and William Black (eds.). 2000. Abduction, Belief and Context in Dialogue. Amsterdam: John Benjamins.
Greene, Judith. 1986. Language Understanding: A Cognitive Approach. Open University Press.
Jurafsky, Daniel, and James H. Martin. 2000. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice Hall.
Leech, Geoffrey N. 1983. Principles of Pragmatics. London: Longman.
Levinson, Stephen C. 1983. Pragmatics. Cambridge University Press.
Mamidi, Radhika and Monis Raja Khan. 2005. Linguistic issues in building Dialog Systems. Presented at The Linguistic Society of India Platinum Jubilee Conference, University of Hyderabad, India. 6-8 December, 2005.
Mitkov, Ruslan (ed.). 2003. The Oxford Handbook of Computational Linguistics. Oxford University Press.
Sacks, H., E. A. Schegloff, and G. Jefferson. 1974. A simplest systematics for the organization of turn-taking for conversation. Language, 50, 696-735.
Searle, John. 1975. Indirect speech acts.
In Syntax and Semantics, 3: Speech Acts, ed. P. Cole & J. L. Morgan, pp. 59–82. New York: Academic Press.
Sperber, D. and D. Wilson. 1995. Relevance: Communication and Cognition, 2nd ed. Oxford: Blackwell.
Yule, George. 1996. Pragmatics (Oxford Introductions to Language Study). Oxford University Press.