Speech Acts as a Basis for Understanding Dialogue Coherence

by

C. Raymond Perrault and James F. Allen
Dept. of Computer Science
University of Toronto
Toronto, Canada

and

Philip R. Cohen
Bolt Beranek and Newman
Cambridge, Mass.

1. Introduction

Webster's dictionary defines "coherence" as "the quality of being logically integrated, consistent, and intelligible". If one were asked whether a sequence of physical acts being performed by an agent was coherent, a crucial factor in the decision would be whether the acts were perceived as contributing to the achievement of an overall goal. In that case they can frequently be described briefly, by naming the goal or the procedure executed to achieve it. Once the intended goal has been conjectured, the sequence can be described as a more or less correct, more or less optimal attempt at the achievement of the goal.

One of the mainstreams of AI research has been the study of problem solving behaviour in humans and its simulation by machines. This can be considered as the task of transforming an initial state of the world into a goal state by finding an appropriate sequence of applications of operators from a given set. Each operator has two modes of execution: in the first it changes the "real world", and in the second it changes a model of the real world. Sequences of these operators we call plans. They can be constructed, simulated, executed, optimized and debugged. Operators are usually thought of as achieving certain effects and of being applicable only when certain preconditions hold.

The effects of one agent executing his plans may be observable by other agents, who, assuming that these plans were produced by the first agent's plan construction algorithms, may try to infer the plan being executed from the observed changes to the world. The fact that this inferencing may be intended by the first agent underlies human communication.

Each agent maintains a model of the world, including a model of the models of other agents. Linguistic utterances are the result of the execution of operators whose effects are mainly on the models that the speaker and hearer maintain of each other. These effects are intended by the speaker to be produced partly by the hearer's recognition of the speaker's plan.

This view of the communication process is very close in spirit to the Austin-Grice-Strawson-Searle approach to illocutionary acts, and indeed was strongly influenced by it. We are working on a theory of speech acts based on the notions of plans, world models, plan construction and plan recognition. It is intended that this theory should answer questions such as:

(1) Under what circumstances can an observer believe that a speaker has sincerely and non-defectively performed a particular illocutionary act in producing an utterance for a hearer? The observer could also be the hearer or the speaker.

(2) What changes does the successful execution of a speech act make to the speaker's model of the hearer, and to the hearer's model of the speaker?

(3) How is the meaning (sense/reference) of an utterance x related to the acts that can be performed in uttering x?
A theory of speech acts based on plans must specify at least the following:

(1) A Planning System: a language for describing states of the world, a language for describing operators, and algorithms for plan construction and plan inference. Semantics for the languages should also be given.

(2) Definitions of speech acts as operators in the planning system. What are their effects? When are they applicable? How can they be realized in words?

To make possible a first attempt at such a theory we have imposed several restrictions on the system to be modelled.

(1) Any agent A1's model of another agent A2 is defined in terms of "facts" that A1 believes A2 believes, and goals that A1 believes A2 is attempting to achieve. We are not attempting to model obligations, feelings, etc.

(2) The only speech acts we try to model are some that appear to be definable in terms of beliefs and goals, namely REQUEST and INFORM. We have been taking these to be prototypical members of Searle's "directive" and "representative" classes (Searle (1976)). We represent questions as REQUESTs to INFORM. These acts are interesting for they have a wide range of syntactic realizations, and account for a large proportion of everyday utterances.

(3) We have limited ourselves so far to the study of so-called task-oriented dialogues, which we interpret to be conversations between two agents cooperating in the achievement of a single high-level goal. These dialogues do not allow changes in the topic of discourse but still display a wide range of linguistic behaviour.

A critical part of communication is the process by which a speaker can construct descriptions of objects involved in his plans such that the hearer can identify the intended referent. Why can someone asking "Where's the screwdriver?" be answered with "In the drawer with the hammer" if it is assumed he knows where the hammer is, but maybe by "In the third drawer from the left" if he doesn't? How accurate must descriptive phrases be?

Section 2 of this paper outlines some requirements on the models which the various agents must have of each other. Section 3 describes the planning operators for REQUEST and INFORM, and how they can be used to generate plans which include assertions, imperatives, and several types of questions. Section 4 discusses the relation between the operators of section 3 and the linguistic sentences which can realize them. We concentrate on the problem of identifying illocutionary force, in particular on indirect speech acts. A useful consequence of the illocutionary force identification process is that it provides a natural way to understand some elliptical utterances, and utterances whose purpose is to acknowledge, correct or clarify interpretations of previous utterances. Section 5 examines how the speaker's and hearer's models of each other influence their references. Finally, section 6 contains some ideas on future research.

* This research was supported in part by the National Research Council of Canada.
Much of our work so far has dealt with the problem of generating plans containing REQUEST and INFORM, as well as non-linguistic operators. Suppose that an agent is attempting to achieve some task, with incomplete knowledge of that task and of the methods to complete it, but with some knowledge of the abilities of another agent. How can the first agent make use of the abilities of the second? Under what circumstances can the first usefully produce utterances to transmit or acquire facts and goals? How can he initiate action on the part of the second?

Most examples in the paper are drawn from a situation in which one participant is an information clerk at a train station, whose objective is to assist passengers in boarding and meeting trains. The domain is obviously limited, but still provides a natural setting for a wide range of utterances, both in form and in intention.

We view the plan-related aspects of language generation and recognition as indissociable, and strongly related to the process by which agents cooperate in the achievement of goals. For example, for agent2 to reply "It's closed" to agent1's query "Where's the nearest service station?" seems to require him to infer that agent1 wants to make use of the service station, which he could not do if it were closed. The reply "Two blocks east" would be seen as misleading if given alone, and unnecessary if given along with "It's closed". Thus part of cooperative behaviour is the detection by one agent of obstacles in the plans he believes the other agent holds, possibly followed by an attempt to overcome them. We claim that speakers expect (and intend) hearers to operate this way, and therefore that any hearer can assume that inferences he can draw based on knowledge that is shared with the speaker are in fact intended by the speaker. These processes underlie our analysis of indirect speech acts (such as "Can you pass the salt?"), utterances which appear to result from one illocutionary act but can be used to perform another.

2. On Models of Others

In this section we present criteria that one agent's model of another ought to satisfy. For convenience we dub the agents SELF and OTHER. Our research has concentrated on modelling beliefs and goals. We claim that a theory of language need not be concerned with what is actually true in the real world: it should describe language processing in terms of a person's beliefs about the world. Accordingly, SELF's model of OTHER should be based on "believe" as described, for example, in Hintikka (1962), and not on "know" in its sense of "true belief". Henceforth, all uses of the words "know" and "knowledge" are to be treated as synonyms for "believe" and "beliefs". We have neglected other aspects of a model of another, such as focus of attention (but see Grosz (1977)).

Want

Any representation of OTHER's goals (wants) must distinguish such information from: OTHER's beliefs, SELF's beliefs and goals, and (recursively) from the other's model of someone else's beliefs and goals. The representation for WANT must also allow for different scopes of quantifiers. For example, it should distinguish between the readings of "John wants to take a train" as "There is a specific train which John wants to take" and as "John wants to take any train". Finally, it should allow arbitrary embeddings with BELIEVE.
Wants of beliefs (as in "SELF wants OTHER to believe P") become the reasons for telling P to OTHER, while beliefs of wants (e.g., SELF BELIEVE SELF WANT P) will be the way to represent SELF's goals P.

Belief

Clearly, SELF ought to be able to distinguish his beliefs about the world from what he believes OTHER believes. SELF ought to have the possibility of believing a proposition P, of believing not-P, or of being ignorant of P. Whatever his stand on P, he should also be able to believe that OTHER can hold any of these positions on P. Notice that such disagreements cannot be represented if the representation is based on "know" as in Moore (1977).

SELF's belief representation ought to allow him to represent the fact that OTHER knows whether some proposition P is true, without SELF having to know which of P or ~P OTHER does believe. Such information can be represented as a disjunction of beliefs (e.g., OR(OTHER BELIEVE P, OTHER BELIEVE ~P)). Such disjunctions are essential to the planning of yes/no questions.

Levels of Embedding

A natural question to ask is how many levels of belief embedding are needed by an agent capable of participating in a dialogue. Obviously, to be able to deal with a disagreement, SELF needs two levels (SELF BELIEVE and SELF BELIEVE OTHER BELIEVE). If SELF were to lie to OTHER, he would have to be able to believe some proposition P (i.e., SELF BELIEVE (P)), while OTHER believes that SELF believes not-P (i.e., SELF BELIEVE OTHER BELIEVE SELF BELIEVE (~P)), and hence he would need at least three levels. We show in Cohen (1978) how one can represent, in a finite fashion, the unbounded number of beliefs created by any communication act or by face-to-face situations. The finite representation, which employs a circular data structure, formalizes the concept of mutual belief (cf. Schiffer (1972)). Typically, all these levels of belief embedding can be represented in three levels, but theoretically, any finite number are possible.

Finally, a belief representation must distinguish between situations like the following:

1. OTHER believes that the train leaves from gate 8.
2. OTHER believes that the train has a departure gate.
3. OTHER knows what the departure gate for the train is.

Case 1 can be represented by a proposition that contains no variables. Case 2 can be represented by a belief of a quantified proposition, i.e.,

   OTHER BELIEVE ∃x (the y : GATE(TRAIN,y) = x)

However, case 3 is represented by a quantified belief, namely

   ∃x OTHER BELIEVE (the y : GATE(TRAIN,y) = x)

The formal semantics of such beliefs have been problematic for philosophers (cf. Quine (1956) and Hintikka (1962)). Our approach to them is discussed in Cohen (1978).
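To make these representational requirements concrete, the following is a minimal sketch in Python of nested BELIEVE and WANT terms. It is our own illustration, not the authors' representation, and all class and predicate names (Atom, Believe, Want, Exists, GATE, ...) are assumptions made for exposition. It shows how the three gate cases differ only in where the existential quantifier sits relative to the belief operator, how a lie requires three levels of embedding, and how the two readings of "John wants to take a train" differ in quantifier scope.

```python
# Illustrative sketch only: a toy term representation for nested beliefs,
# wants, and quantified beliefs, in the spirit of Section 2.
from dataclasses import dataclass
from typing import Union

Term = Union["Believe", "Want", "Exists", "Atom"]

@dataclass
class Atom:            # an unanalyzed proposition, e.g. GATE(TRAIN) = G8
    text: str

@dataclass
class Believe:         # agent BELIEVE proposition
    agent: str
    prop: Term

@dataclass
class Want:            # agent WANT proposition/action
    agent: str
    prop: Term

@dataclass
class Exists:          # EXISTS var . body -- its scope relative to Believe matters
    var: str
    body: Term

# Case 1: OTHER believes the train leaves from gate 8 (no variables).
case1 = Believe("OTHER", Atom("GATE(TRAIN) = G8"))

# Case 2: OTHER believes the train has a departure gate
#         (the quantifier lies inside OTHER's belief).
case2 = Believe("OTHER", Exists("x", Atom("the y : GATE(TRAIN, y) = x")))

# Case 3: OTHER knows what the departure gate is
#         (the quantifier takes scope over OTHER's belief).
case3 = Exists("x", Believe("OTHER", Atom("the y : GATE(TRAIN, y) = x")))

# The three levels needed to represent a lie:
# SELF believes P, while believing that OTHER believes SELF believes not-P.
lie = [Believe("SELF", Atom("P")),
       Believe("SELF", Believe("OTHER", Believe("SELF", Atom("not P"))))]

# Quantifier scope matters for WANT too: "John wants to take a train".
want_specific = Exists("t", Want("JOHN", Atom("TRAIN(t) & TAKE(JOHN, t)")))  # a specific train
want_any      = Want("JOHN", Exists("t", Atom("TRAIN(t) & TAKE(JOHN, t)")))  # any train
```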
In Section 3, we discuss how quantified beliefs are used during planning, and how they can be acquired during conversation.

3. Using a Model of the Other to Decide What to Say

As an aid in evaluating speech act definitions, we have constructed a computer program, OSCAR, that plans a range of speech acts. The goal of the program is to characterize a speaker's capacity to issue speech acts by predicting, for specified situations, all and only those speech acts that would be appropriately issued by a person under the circumstances. In this section, we will make reference to prototypical speakers by way of the OSCAR program, and to hearers by way of the program's user. Specifically, the program is able to:

- Plan REQUEST speech acts, for instance a speech act that could be realized by "Please open the door", when its goal is to get the user to want to perform some action.

- Plan INFORM speech acts, such as one that could be realized by "The door is locked", when its goal is to get the user to believe some proposition.

- Combine the above to produce multiple speech acts in one plan, where one speech act may establish beliefs of the user that can then be employed in the planning of another speech act.

- Plan questions as requests that the user inform, when its goal is to believe something and when it believes that the user knows the answer.

- Plan speech acts incorporating third parties, as in "Ask Tom to tell you where the key is and then tell me."

Suppose, for example, that OSCAR is outside a room whose door is closed and that it believes that the user is inside. When planning to move itself into the room, it might REQUEST that the user open the door. However, it would only plan this speech act if it believed that the user did not already want to open the door and if it believed (and believed the user believed) that the preconditions to opening the door held. If that were not so, OSCAR could plan additional INFORM or REQUEST speech acts. For example, assume that to open a door one needs to have the key and OSCAR believes the user doesn't know where it is. Then OSCAR could plan "Please open the door. The key is in the closet". OSCAR thus employs its user model in telling him what it believes he needs to know.

Mediating Acts and Perlocutionary Effects

The effects of INFORM (and REQUEST) are modelled so that the hearer's believing P (or wanting to do ACT) is not essential to the successful completion of the speech act. Speakers, we claim, cannot influence their hearers' beliefs and goals directly. Thus, the perlocutionary effects of a speech act are not part of that act's definition. We propose, then, as a principle of communication, that a speaker's purpose in sincere communication is to produce in the hearer an accurate model of his mental state.

To illustrate the planning of speech acts, consider first the following simplified definitions of REQUEST and INFORM as STRIPS-like operators (cf. Fikes and Nilsson (1971)). Let SP denote the speaker, H the hearer, ACT some action, and PROP some proposition. Due to space limitations, the intuitive English meanings of the formal terms appearing in these definitions will have to suffice as explanation.
REQUEST(SP,H,ACT)
   preconditions: SP BELIEVE H CANDO ACT
                  SP BELIEVE H BELIEVE H CANDO ACT
                  SP BELIEVE SP WANT TO REQUEST
   effects:       H BELIEVE SP BELIEVE SP WANT H TO ACT

INFORM(SP,H,PROP)
   preconditions: SP BELIEVE PROP
                  SP BELIEVE SP WANT TO INFORM
   effects:       H BELIEVE SP BELIEVE PROP

To bridge the gap between the speech acts and their intended perlocutionary effects, we posit mediating acts, named CONVINCE and DECIDE, which model what it takes to get someone to believe something or to want to do something. Our current analysis of these mediating acts trivializes the processes that they are intended to model, by proposing that to convince someone of something, for example, one need only get that person to know that one believes it.

The program uses a simplistic backward-chaining algorithm that plans actions when their effects are wanted as subgoals that are not believed to be true. It is the testing of the preconditions of the newly planned action, before creating new subgoals, that exercises the program's model of its user. We shall briefly sketch how to plan a REQUEST.

Every action has "want preconditions", which specify that before an agent does that action, he must want to do it. OSCAR plans REQUEST speech acts to achieve precisely this precondition of actions that it wants the user to perform. Similarly, the goal of the user's believing some proposition PROP becomes OSCAR's reason for planning to INFORM him of PROP.
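The following small Python sketch, ours rather than the OSCAR program itself, shows one way to encode the REQUEST and INFORM definitions above as STRIPS-like operators over belief propositions, together with a naive backward-chaining planner that plans an act when one of its effects is an unsatisfied goal. All identifiers are illustrative assumptions, and the CONVINCE and DECIDE mediating acts are omitted, so the plan stops at the illocutionary effect.

```python
# Toy sketch (not the OSCAR implementation): REQUEST and INFORM as STRIPS-like
# operators, plus a naive backward-chaining planner.
from dataclasses import dataclass

@dataclass
class Operator:
    name: str
    preconditions: list   # propositions (strings) that must already hold
    effects: list         # propositions made true by performing the act

def request(sp, h, act):
    return Operator(
        name=f"REQUEST({sp},{h},{act})",
        preconditions=[f"{sp} BELIEVE {h} CANDO {act}",
                       f"{sp} BELIEVE {h} BELIEVE {h} CANDO {act}",
                       f"{sp} BELIEVE {sp} WANT REQUEST({act})"],
        effects=[f"{h} BELIEVE {sp} BELIEVE {sp} WANT {act}"])

def inform(sp, h, prop):
    return Operator(
        name=f"INFORM({sp},{h},{prop})",
        preconditions=[f"{sp} BELIEVE {prop}",
                       f"{sp} BELIEVE {sp} WANT INFORM({prop})"],
        effects=[f"{h} BELIEVE {sp} BELIEVE {prop}"])

def plan(goal, beliefs, operators, steps=None):
    """Backward-chain: if the goal is not already believed, find an operator
    achieving it and recursively plan for its unsatisfied preconditions."""
    steps = [] if steps is None else steps
    if goal in beliefs:
        return steps
    for op in operators:
        if goal in op.effects:
            for pre in op.preconditions:
                if pre not in beliefs:
                    # In OSCAR this is where the model of the user is consulted;
                    # here we simply keep chaining on the missing precondition.
                    plan(pre, beliefs, operators, steps)
            steps.append(op.name)
            beliefs.extend(op.effects)
            return steps
    steps.append(f"<cannot achieve: {goal}>")   # no operator achieves the goal
    return steps

# Example: OSCAR wants the USER to (come to want to) open the door.
ops = [request("OSCAR", "USER", "OPEN(DOOR)"),
       inform("OSCAR", "USER", "LOC(KEY, CLOSET)")]
beliefs = ["OSCAR BELIEVE USER CANDO OPEN(DOOR)",
           "OSCAR BELIEVE USER BELIEVE USER CANDO OPEN(DOOR)",
           "OSCAR BELIEVE OSCAR WANT REQUEST(OPEN(DOOR))"]
print(plan("USER BELIEVE OSCAR BELIEVE OSCAR WANT OPEN(DOOR)", beliefs, ops))
```

Run on the door example, the sketch returns the single step REQUEST(OSCAR,USER,OPEN(DOOR)); in the same way, a goal of the user's believing some proposition would be matched against the effect of an INFORM.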
Planning Questions Using Quantified Beliefs

Notice that the precondition to OSCAR's getting the key -- knowing where it is -- is of the form:

   ∃x OSCAR BELIEVE (the y : LOC(KEY,y) = x)

When such a quantified belief is a goal, it leads OSCAR to plan the question "Where is the key?" (i.e., REQUEST(OSCAR, USER, INFORM(USER, OSCAR, the y : LOC(KEY,y)))). In creating this question, OSCAR first plans a CONVINCE and then plans the user's INFORM speech act, which it then tries to get him to perform by way of requesting.

The above definition of INFORM is inadequate for dealing with the quantified beliefs that arise in modelling someone else. This INFORM should be viewed as that version of the speech act that the planning agent (e.g., OSCAR) plans for itself to perform. A different view of INFORM, say INFORM-BY-OTHER, is necessary to represent acts of informing by agents other than the speaker. The difference between the two INFORMs is that for the first, the planner knows what he wants to say, but he obviously does not have such knowledge of the content of the second act. The precondition for this new act is a quantified speaker-belief:

   ∃x USER BELIEVE (the y : LOC(KEY,y) = x)

where the user is to be the speaker. For the system to plan an INFORM-BY-OTHER act for the user, it must believe that the user knows where the key is, but it does not have to know that location! Similarly, the effect of the INFORM-BY-OTHER act is also a quantified belief, as in

   ∃x OSCAR BELIEVE USER BELIEVE (the y : LOC(KEY,y) = x)

Thus, OSCAR plans this INFORM-BY-OTHER act of the key's location in order to know where the user thinks the key is. Such information has been lacking from all other formulations of ASK (or INFORM) that we have seen in the literature (e.g., Schank (1975), Mann et al. (1976), Searle (1969)). Cohen (1978) presents one approach to defining this new view of INFORM, and its associated mediating act CONVINCE.
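As a concrete illustration of the shape of such a plan, the following Python fragment, our own sketch with invented names rather than OSCAR's output, spells out the quantified goal, the INFORM-BY-OTHER act with its quantified precondition and effect, and the chain of acts that is realizable as "Where is the key?".

```python
# Illustrative sketch only: the shape of the plan built when a quantified
# belief is the goal, i.e. when OSCAR wants to know where the key is without
# knowing the location itself. All names and strings are our own.

def plan_wh_question(asker, other, description):
    """Return the chain of acts, deepest goal first, that realizes a
    wh-question such as "Where is the key?"."""
    goal = f"EXISTS x . {asker} BELIEVE ({description} = x)"
    inform_by_other = {
        # The content of the other agent's INFORM cannot be spelled out by the
        # planner, so both its precondition and its effect are quantified.
        "act": f"INFORM-BY-OTHER({other}, {asker}, {description})",
        "precondition": f"EXISTS x . {other} BELIEVE ({description} = x)",
        "effect": f"EXISTS x . {asker} BELIEVE {other} BELIEVE ({description} = x)",
    }
    return [
        ("goal", goal),
        ("convince", f"CONVINCE({other}, {asker}, {description})"),
        ("inform", inform_by_other),
        ("request", f"REQUEST({asker}, {other}, {inform_by_other['act']})"),
    ]

for label, step in plan_wh_question("OSCAR", "USER", "the y : LOC(KEY, y)"):
    print(label, ":", step)
```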
4. Recognizing Speech Acts

In the previous section we discussed the structure of plans that include instances of the operators REQUEST and INFORM, without explaining the relation between these speech acts and the sentences used to perform them. This section sketches our first steps in exploring this relation. We have been particularly concerned with the problem of recognizing the illocutionary force and propositional content of the utterances of a speaker. Detailed algorithms which handle the examples given in this section have been designed by J. Allen and are being implemented by him. Further details can be found in Allen and Perrault (1978) and in Allen's forthcoming Ph.D. dissertation.

Certain syntactic clues in an utterance, such as its mood and the use of explicit performatives, indicate what act the speaker intends to perform, but, as is well known, utterances which taken literally would indicate one illocutionary force can be used to indicate another. Thus "Can you close the door?" can be a request as well as a question. These so-called indirect speech acts are the acid test of a theory of speech acts. We claim that a plan-based theory gives some insight into this phenomenon.

Searle (1975) correctly suggests that "In cases where these sentences <indirect forms of requests> are uttered as requests, they still have their literal meaning and are uttered with and as having that literal meaning". How then can they also have their indirect meaning?

Our answer relies in part on the fact that an agent participating in a cooperative dialogue must have processes to:

(1) Achieve goals based on what he believes.
(2) Adopt goals of other agents as his own.
(3) Infer goals of other agents.
(4) Predict future behaviour of other agents.

These processes would be necessary even if all speech acts were literal, to account for exchanges where the response indicates a knowledge of the speaker's plan. For example:

Passenger: "When does the next train to Montreal leave?"
Clerk: "At 6:15 at Gate 7" or
Clerk: "There won't be one until tomorrow."

Speakers expect hearers to be executing these processes and they expect hearers to know this. Inferences that a hearer can draw by executing these processes, based on information he thinks the speaker believes, can be taken by the hearer to be intended by the speaker. This accounts for many of the standard examples of indirect speech acts such as "Can you close the door?" and "It's cold here". For instance, even if "It's cold here" is intended literally and is recognized as such, the helpful hearer may still close the window. When the sentence is uttered as a request, the speaker intends the hearer to recognize the speaker's intention that the hearer should perform the helpful behaviour.

If indirect speech acts are to be explained in terms of inferences speakers can expect of hearers, then a theory of speech acts must concern itself with how such inferences are controlled. Some heuristics are particularly helpful.

If a chain of inference by the hearer has the speaker planning an action whose effects are true before the action is executed, then the chain is likely to be wrong, or else must be continued further. This accounts for "Can you pass the salt?" as a request for the salt, not a question about salt-passing prowess. As Searle (1975) points out, a crucial part of understanding indirect speech acts is being able to recognize that they are not to be interpreted literally.

A second heuristic is that a chain of inference that leads to an action whose preconditions are known to be not easily achievable is likely to be wrong.

Inferencing can also be controlled through the use of expectations about the speaker's goals. Priority can be given to inferences which relate an observed speech act to an expected goal. Expectations enable inferencing to work top-down as well as bottom-up.

The use of expected goals to guide the inferencing has another advantage: it allows for the recognition of illocutionary force in elliptical utterances such as "The 3:15 train to Windsor?", without requiring that the syntactic and semantic analysis "reconstitute" a complete semantic representation such as "Where does the 3:15 train to Windsor leave?". For example, let the clerk assume that passengers want either to meet incoming trains or to board departing ones. Then the utterance "The 3:15 train to Windsor?" is first interpreted as a REQUEST about a train to Windsor with 3:15 as either arrival or departure time. Only departing trains have destinations different from Toronto, and this leads to believing that the passenger wants to board a 3:15 train to Windsor. Attempting to identify obstacles in the passenger's plan leads to finding that the passenger knows the time but probably not the place of departure. Finally, overcoming the obstacle leads to an INFORM like "Gate 8".

Our analysis of elliptical utterances raises two questions. First, what information does the illocutionary force recognition module expect from the syntax and semantics? Our approach here has been to require from the syntax and semantics a hypothesis about the literal illocutionary force and a predicate calculus-like representation of the propositional content, but one in which undetermined predicates and objects can be replaced by patterns on which certain restrictions can be imposed. As part of the plan inferencing process these patterns become further specified.

The second question is: what should the hearer do if more than one path between the observed utterance and the expectations is possible? He may suspend plan deduction and start planning to achieve a goal which would allow deduction to continue. Consider the following example.

Passenger: When is the Windsor train?
Clerk: The train to Windsor?
Passenger: Yes.
Clerk: 3:15.

After the first sentence the clerk cannot distinguish between the expectations "Passenger travels by train to Windsor" and "Passenger meets train from Windsor", so he sets up the goal: (clerk believes passenger wants to travel) or (clerk believes passenger wants to meet the train). The planning for this goal produces a plan that involves asking the passenger if he wants one of the alternatives, and receiving back the answer. The execution of this plan produces the clerk's response "The train to Windsor?" and recognizes the response "Yes". Once the passenger's goal is known, the clerk can continue the original deduction process with the "travel to Windsor" alternative favoured. This plan is accepted and the clerk produces the response "3:15" to overcome the obstacle "passenger knows departure time".
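A minimal Python sketch of the two heuristics and of the top-down use of expected goals follows. It is our own formulation for illustration, not Allen's implemented algorithms, and all the proposition strings are invented.

```python
# Toy sketch of the inference-control heuristics of this section.

def chain_is_suspect(actions, shared_beliefs, hard_to_achieve):
    """Heuristic 1: a hypothesized action whose effects already hold before it
    is executed makes the chain suspect (e.g. a literal reading of "Can you
    pass the salt?" when the answer is already mutually known).
    Heuristic 2: an action with a precondition known to be not easily
    achievable also makes the chain suspect."""
    for act in actions:
        if act["effects"] and all(e in shared_beliefs for e in act["effects"]):
            return True
        if any(p in hard_to_achieve for p in act["preconditions"]):
            return True
    return False

def rank_by_expectation(chains, expected_goals):
    """Expectations let inference run top-down: chains ending in an already
    expected goal (board a departing train, meet an incoming one) come first."""
    return sorted(chains, key=lambda c: c["goal"] not in expected_goals)

shared = ["SPEAKER KNOW WHETHER HEARER CANDO PASS(SALT)"]
chains = [
    {"goal": "SPEAKER KNOW WHETHER HEARER CANDO PASS(SALT)",     # literal reading
     "actions": [{"effects": ["SPEAKER KNOW WHETHER HEARER CANDO PASS(SALT)"],
                  "preconditions": []}]},
    {"goal": "SPEAKER HAVE SALT",                                # indirect reading
     "actions": [{"effects": ["SPEAKER HAVE SALT"],
                  "preconditions": ["HEARER WANT PASS(SALT)"]}]},
]
print([chain_is_suspect(c["actions"], shared, []) for c in chains])   # [True, False]
print(rank_by_expectation(chains, ["SPEAKER HAVE SALT"])[0]["goal"])  # indirect reading wins
```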
5. Reference and the Model of the Other

We have shown that quantified beliefs are needed in deciding to ask someone a question. They are also involved, we claim, in the representation of singular definite noun phrases, and hence any natural language system will need them.

According to our analysis, a hearer should represent the referring phrase in a speaker's statement "The pilot of TWA 510 is drunk" by:

   ∃x SPEAKER BELIEVE (the y : PILOT(y,TWA510) = x & DRUNK(x))

This is the reading whereby the speaker is believed to "know who the pilot of TWA 510 is" (at least partially accounting for Donnellan's (1966) referential reading). This is to be contrasted with the reading "whoever is piloting that plane is drunk" (Donnellan's attributive noun phrases). In this latter case, the existential quantifier would be inside the scope of the belief.
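The scope contrast between the two readings can be displayed as follows; the rendering and the iota notation for "the" are our own, not the paper's.

```latex
% Referential reading: the speaker is believed to know who the pilot is.
\[ \exists x \;\; \mathrm{SPEAKER\ BELIEVE}\,
   \bigl(\iota y .\, \mathrm{PILOT}(y,\mathrm{TWA510}) = x \;\wedge\; \mathrm{DRUNK}(x)\bigr) \]

% Attributive reading: "whoever is piloting that plane is drunk"; the
% existential quantifier stays inside the scope of the belief.
\[ \mathrm{SPEAKER\ BELIEVE}\; \exists x
   \bigl(\iota y .\, \mathrm{PILOT}(y,\mathrm{TWA510}) = x \;\wedge\; \mathrm{DRUNK}(x)\bigr) \]
```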
These existential presuppositions of definite referential noun phrases give one important way for hearers to acquire quantified speaker-beliefs. Such beliefs, we have seen, can be used as the basis for planning further clarification questions.

We agree with Strawson (1950) (and many others) that hearers understand referring phrases based on what they believe speakers intend to refer to. Undoubtedly, a hearer will understand a speaker's (reference) intentions by using a model of that speaker's beliefs. Speakers, of course, know of these interpretation strategies and thus plan their referring phrases to take the appropriate referent within the hearer's model of them. A speaker cannot use private descriptions, nor descriptions that he thinks the hearer thinks are private, for communication.

For instance, consider the following variant of an example of Donnellan's (1966). At a party, a woman is holding a martini glass which Jones believes contains water, but of which he is certain everyone else believes (and believes he believes) contains a martini. Jones would understand that Smith, via question (1), but not via question (2), is referring to this woman:

(1) Who is the woman holding the martini?
(2) Who is the woman holding the water?

since Jones does not believe Smith knows about the water in her glass. Conversely, if Jones wanted to refer to the woman in an utterance intended for Smith, he could do so using (1) but not (2), since in the latter case he would not think the hearer could pick out his intended referent.

Thus it appears that for a speaker to plan a successful singular definite referential expression requires that the speaker believe the expression he finally chooses has the right referent in the hearer's model of the speaker. Our concept of mutual belief can be used (as in Cohen (1978)) to ensure that the expression denotes appropriately in all further embedded belief models.

This example is problematic for any approach to reference where a communicating party assumes that its reality is the only reality. Speakers and hearers can be "wrong" or "ignorant" and yet communication can still be meaningful and successful.

6. Further Research

We believe that speech acts provide an excellent way of explaining the relations between utterances in a dialogue, as well as of relating linguistic to non-linguistic activity. Until we better understand the mechanisms by which conversants change the topic and goals of the conversation, it will be difficult to extend this analysis beyond exchanges of a few utterances, in particular to non-task-oriented dialogues.

Fuller justification of our approach also requires its application to a much broader range of speech acts. Here the problem is mainly representational: how can we handle promises without first dealing with obligations, or warnings without the notions of danger and undesirability? We are currently considering an extension of the approach to understanding stories which report simple dialogue.

Much remains to be done on the representation of the abilities of another agent. A simple setting suggests a number of problems. Let one agent H be seated in a room in front of a table with a collection of blocks. Let another agent S be outside the room but communicating by telephone. If S believes that there is a green block on the table and wants it cleared, but knows nothing about any other blocks except that H can see them, then how can S ask H to clear the green block? The blocks S wants removed are those which are in fact there, perhaps those which he could perceive to be there if he were in the room. The goal seems to be of the form

   S BELIEVE ∀x (x on the green block => S WANT (x removed from the green block))

but our planning machinery and definition of REQUEST are inadequate for generating "I request you to clear the green block".

We have not yet spent much time investigating the process of giving answers to How and Why questions, or to WH questions requiring an event description as an answer. We conjecture that, because of the speech act approach, answers to "What did he say?" should be found in much the same way as answers to "What did he do?", and that this parallelism should extend to other question types. The natural extension of our analysis would suggest representing "How did AGT achieve goal G?" as a REQUEST by the speaker that the hearer inform him of a plan by which AGT achieved G. We have not yet investigated the repercussions of this extension on the representation language.

Finally, consider the following dialogue. Assume that S is a shady businessman and A his secretary.

A: IRS is on the phone.
S: I'm not here.

How is A to understand S's utterance? Although its propositional content is literally false, maybe even nonsensical, the utterance's intention is unmistakable.
How tolerant does the understanding system have to be to infer its way to a correct interpretation? Must "I'm not here" be treated idiomatically?

Bibliography

Allen, J.F. and Perrault, C.R., "Participating in Dialogue: Understanding via Plan Deduction", 2nd National Conference of the Canadian Society for Computational Studies of Intelligence, Toronto, July 1978.

Cohen, P.R., "On Knowing What to Say: Planning Speech Acts", TR-118, Dept. of Computer Science, University of Toronto, 1978.

Donnellan, K., "Reference and Definite Descriptions", The Philosophical Review, vol. 75, 1966, pp. 280-304. Reprinted in Semantics, Steinberg and Jakobovits, eds., Cambridge University Press, 1970.

Fikes, R.E. and Nilsson, N.J., "STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving", Artificial Intelligence 2, 1971.

Grosz, B.J., "The Representation and Use of Focus in Natural Language Dialogues", 5IJCAI, 1977.

Hintikka, K.J., Knowledge and Belief, Cornell University Press, 1962.

Mann, W.C., Moore, J.A. and Levin, J.A., "A Comprehension Model for Human Dialogue", 5IJCAI, 1977.

Moore, R.C., "Reasoning about Knowledge and Action", 5IJCAI, 1977.

Quine, W.V., "Quantifiers and Propositional Attitudes", The Journal of Philosophy 53, 1956, pp. 177-187.

Schank, R. and Abelson, R., "Scripts, Plans and Knowledge", 4IJCAI, 1975.

Schiffer, S., Meaning, Oxford University Press, 1972.

Searle, J.R., Speech Acts, Cambridge University Press, 1969.

Searle, J.R., "Indirect Speech Acts", in Syntax and Semantics, Vol. 3: Speech Acts, Cole and Morgan (eds.), Academic Press, 1975.

Searle, J.R., "A Taxonomy of Illocutionary Acts", in Language, Mind and Knowledge, K. Gunderson (ed.), University of Minnesota Press, 1976.

Strawson, P.F., "On Referring", Mind, 1950.