From: AAAI Technical Report SS-95-01. Compilation copyright © 1995, AAAI (www.aaai.org). All rights reserved. Generativity, Type Coercion and Verb Semantic Classes Patrick SAINT-D~ IR1T-CNRS 118, route de Narbonne 31062 TOULOUSE FRANCE stdizier@irit.fr 1. Aims In this document, we present aspects related to generativity and to polysemyfirom a long-term ongoing research whoseaims are (I) to defmethe conumtsof electronic dictionary that represents and organizes the syntaxand the semanticsof predicative formsso that they can directly be used for NLPapplications and (2) specify and to evaluate mcthods for creating such a dictionmyfromcorpora of various forms, domainspecific thesaurus andpaper dictionaries. Weare developing methodsand tools in information re~eval and automatic construction of abstracts of textswhichrequire the extraction of predicate-argumentstructures (Pugeault et al. 94). A detailed description of syntactically and semanticallyorganised classes of predicates is therefore particularly useful and crucial to us. This type of information is moreover useful in a numberof other applications. Webasically take into account two major sources of knowledge to create lexical entries: dictionary definitions and corpora. Wealso consider thesaurus associated with specific domainswhich makemoreexplicit, even in an incomplete way, certain formsofinformation. Besides creating a dictionary for NIP, our goal is also to show, within the scope of our study, howthe different sources of information interfere, and howmuchof the workcan be automated,at what cost, and what should be left to a humanexpert. The importanceof: these criteria clearly dependson the granularity of the lexicon one wants to ~tttJtln. In this document,we first introduce the waywe have organized and reformulated for Frenchthe verb semantic class systemdefined by B. Levin(Levin 93) for English; then, we showthat verb semantic classes form a very good frameworkfor structuring the verb selectional restriction system and the Qualia structure system. Finally, weshowthat verb semanticclasses is a powerful meansfor organizing and for restricting the different forms of type coercion. In this paper only verbs will be considered. A similar workis under wayfor predicative nouns. In (Grimshaw90) relations betweenthe argument structure ofverbs andtheargument structure oftheir con’- 152 -cspondingnounsare studied and provide us with precise hints on the type of result we should obtain. The preposition system has been studied separately. The results presentedhere are essentially basedon the French verb class system; for the stoke of readability, wehave translated into English the namesof these classes and somerelated dst~ 2. The verb semantic class system This workis primarily based on B. Lcvin’s work (Levin 93) for English, where she shows that the syntactic behavior of verbs is essentially predictable from some aspects of their semantics. By syntactic behavior, she meansthe wayarguments are distributed in a syntactic form with respectto the predicate, howthey can move (e.g. to defineergative or passiveforms)andwhenthey canbe deleted. Thesemovements anddeletionsare called alternations. Although B. Levin’s work corresponds to a certain level of granularity in the linguistic description which has a certain degree of stability, we do not think that it really introduces a newlevel with a theoretical status in syntax or in semantics. Rather, we consider this workas a very useful, practical and relatively comprehensive one, which can form a goodperspective and a goodpractical basis for the extractionand for the organizationof lexical d~m 2.1 Alternations revisited: verb class system a more declarative Wehave substantially reformulated B. Levin system in order to avoid a numberof difficulties inherent to her approach, and to makethe system more declarative and more ~ble in NLPsystems. Wehave also added several types of semantic information at both the class and the verb level, namelythematic grids and fragments of the Lexical Conceptual Structures (LCS) (Jackendoff described in (Dau~ze, Saint-Dizier, Marrafa 95). Our research is applied so far to French, but the techniques can be used for other languagesin a similar manner. Instead of consideringthe alternation systemof B. Levin, we havereformulatedthis notion into a moredeclarative one: we consider possible syntactic contexts for verbs. Then, we can assign to each verb-sense the set of syntactic contexts it myhave. By syntactic context, we meana subcalegorization frame where the categoryand someadditional syntactic andsemanticinformationare usedto describethe natureandthe formof the possible complementsand the subject of a verb (see examples below). Verb classes are then formed out from verbs having similar sets of contexts. Comparedto B. Levin system, ore" approachavoids having to define a basic form from which alternations areproduced;moreoverit avoids us to have to account for the changesin meaningpmvoqued by alternations (e.g. by the deletion of an argumentor by the adjanction of a preposition). It is also easier to specify very pre¢i~Jythe form and the contents of every element in a context. Contents should however remain quite general in order to avoid an explosion in the numberof contexts. Inparticular, selectional restrictions andthe specification ofprelx~itions mustremain general. More refined contexts canbespecified at theverblevel, whenever necessary, provided thattheyremain coherent withthecontexts specified attheclass leveL Thelevel of generalityconsiders!at the classlevel is defineda priori from data commonlyagreed upon m be of general purpose (e.g. feature values suchashuman,animate and object). Similarly, very limited exceptions can be accepted in order to preserve the homogeneityand the ’completeness’of the class(i.e. verbs with very similar meanings should a priori belong to the sameclass). As shall be seen below, the context system that we have and implementedprovides us with a very powerful tool for specifyingthe syntaxand the semanticsof verbs. Hereis anexample in Pmlog of a verbclass: class( [202,914,51,33,1212,1112], %lists of contexts(see below) [abandonner, accorder,acheter,adjuger,allouer, assigner,attribuer,ceder, choisir,conceder, confier, decerner.distribuer, donner,octroyer, offrir, preter,procurer,prodiguer,promettre, racheter, remettre,transmettre], [[ae,th,tib],[ae,th,tiv]]). %list of thematicgrids % exceptions: procurer +32, abandonner -914 Thematicroles are those definedin (Pugeaultet al. 94), i.e.: ae (effective agen0, th (theme), tib (beneficiary incrementaltheme), tiv (victim incrementaltheme). Contexts:. (for the sake of readability, only positive feanm~are mentioned) 202: <np>, <Verb>, <np>, <pp(Ixep=a)> 914: <np(+agent)>, <Verb(+refl)>,<np> 153 51: 33: 1212: I112: <rip>, <Verb>,<pp(prep=a)>,<np> <rip>, <Vexb>,<np> (pp not specified) <np>, <Verb(+pa.~ve)>, < np>,~)> <np(+theme)>,<Verb(+refl)>,<np>. Fromthe point of view of polysemy, we feel that a verbclass permits usto det’mesomeof the syntactic and semanticboundariesof a verb-sense. A verb in a class corresponds to a unique sense, with a set of syntactic realizations. This appr,~___h providesverb-senses with a certain stability, evenif the different contexts they may have could alter to someextent their meaning. These possible alterations define a family of ’variations’ around the central sense of the verb. Different senses having a priori different syntacticrealizations, they will be in differentclasses. So far, we have defined semantic classes for about 2000 verbs for French (Dau~ 94). These classes have been defined by a combination of a manual and an automatic analysis of texts, technical corpnsesand dictionary defmitlons. Theseclasses havebeendefined on the basis of 48 different contexts. 3. Type selection, Qualia verb semantic classes structures and The well-known,basic idea of the GenerativeLexiconis to avoid long enumerations of possible arguments for predicates and to developa general type shifting system, basedon the notion of Qualia structure (Pnstejovsky 93). These enumerations indeed hide the fundamental distinction betweenbasic and derived meanings. In the Generative lexicon, the different aspects (or facets) of the meaningof a word (andof its asso~tted type(s)) are summarized in a structure called the Qualia structure, composed of fourmainroles(whichcan possibly be further specialized):the formal, constitutive, telic and agentiveroles. To eachverb (andmore generally, to eachpotential governor) are associated one or more subcategorization frameswhichindicate the type of the argumentsthe verb selects. In this document,we have directly adoptedthe Qnalia systemdef’med by James Pustejovsky.However,other, moreor less derived, forms ofQualia sm~:mm; could alsobeconsidered. Type coercion can be viewedas a relation betweena predicate and one of its arguments,where the’argument does not fit exactly the subcategorizadonexpectationsof the verb (neither directly nor by type subsumption). that case, the type of the argumentmust be coerced to another type, coherent with the subcategorization expectations of the verb, for the sentence to be well- formed. Thisnewtypeassociated withtheargument is derived from oneofthepredicates pre,sent ina Qualia role oftheargument. Determining sucha newtypeiscalled qualia: agentive: Pmd(Y,X): ’future having’. wherefuture havingverbs include verbs such as: type otmp~s. This operation hasreceived ano0eratioual acmrder,garantir, assigner, adjuger, attribuer, ceder, semantics inlogic programming in(Saint-Dizi~ 95). conj~r,d,~raer, e~. (~a:,guarantee, assign,cede, etc.). Themf~ge: in a construction suchas: 3.1 Factoring out selectional restrictions acceotathe/oan. it will possible, via type coercionto derive an appmprLste Verbsemantic classes canbeusedtofactor outthetype type ’future having event’ from ~e agentive role of/oan selected byeveryverbintheclassforeachof their and to paraphrasethe aboveVPfor exampleas follows: arguments. Forexample allthe verbsof thempecma/ acceptsWgranOguarantee/as~gn/.., theloan. class select foranobject oftypeevent, possibly further subdivided intoprocess ortransition, depending onthe syntactic formof theverb(Raising or Control, see 4. Type coercion and verb semantic (Pnstejovsky andBouillon 94)). Similarly, all theverbs classes oflheamuse class select fora subject oftype event. 4.1 Introduction Verbsemantic classesallowus to go beyondthe Mpoctual typesas exemplified aboveandanytypeof selectional restriction canbefactored outattheclass level. Themlectiomdrestrictions associated with eachof theargumems ofthe~’an~er ~passes.~n verbclassare forexample: human fortheexternal argument, object for thedirect object andhuman forthethird argument. 3.2 Specifying Qualia roles semantic classes by means of verb Semanticverb classes canalso be usedto specify the contentsof QunlLsroles. Indeed,in mostcases,a whole familyof verbs, semanticallyrelated, canappearin a role instead of just oneelementof the cla=~: For example,in the case of book, instead of just having the predicate wr/te(6:rke)in theagendve roleof theQuaILS, orinstead of having the wholelist of possible verbs, it is more convenient,and morelinguistically appropriate, to refer mverb cLssse&wheneverpossible. As a consequence,for the agentive Qualia role of book we have the class of Communicationby message verbs (including verbs such as: mb~er, ~oire, r~diger,corriger, annoter,t~l~graphier, ma~’,lranser/re, etc.). This can be noted in the QuaiLs strucun~as follows: book(X). qualla: agentive: Pmd(Y,X) : ’communication message’ where Pred is a variable typed ’communication by message’. All the predicatesof this classbearthe same type (e.g. event, or communication event); this makes type shifting simpler andmoregeneral since the type of the cla~¢ is consideredinstead of the type of eachverb in the fla~q= Similarly, the QuaiLsstructure of the agentiveQualia role of the noun/oanis of the form: loan(X) 154 Dependingon the semanticclass the predicate belongsto, and for each argument, only someapplications of type toe.ion ate linguistically appropri~3_~This is what we call in (Saint-Dizler 95) the appropriateness function whichrestricts thescope of type shifting. For example, in a sentence like begin a novel only the telic and the agentive roles of novel ate relevant Begin is indeed an aspectual verb and it basically selects an object of type event. This consideration can be dealt with a certain degree of generality usingthe verb semantic classes. The exampleof begin can indeed be generalized to all verbs of thesameclass (such as cease, commence, end, finish, repem, thebegin verb class of B. Levin). Letus nowcharacmtize in a moregeneral mannerthe relations betweenverb classes and type coercion. Wemodelthe appropriateness function by meansof qualia role selectional constraints. Let us integrate appropriatenessin a wayquite similar to appropriateness in type feature systems (TFS).LetAppbe a partial function mapping thesetofshifting operations Y-l(ti) defined fromtypeti of theargument A to themost general set of types A maybe coerced to with respect to the semanticctAtq s of the predicate P. Let Qbe the set of all QuaiLsstructures, then if S is the set of all semantic classess,thenthe appropriateness function is an application from T x Q x S into T. Let us note the appropriatene~function as follows: r’l(s,ti) = App0iI(ti), s) ~ ~:l(ti) Appselects aliases in Y.l(ti) for which coercion appropriate. The appropriateness function can be applied recursively whenseveral type shifting operations are _ne¢__~. It is then recursively defined from the appropriateness set of the preceding level and the semantic type si of the word being considered at this r2(ai:i) = r2(si, rkspt’i)) and.more generally:. r,%.~)-r,,(~, r"-k.., r~(s~,t’0)). The ~sms~ionof I "~ is monotoneincreasing. The set of all _l~t_~’ble ~ types is noted r(s,q). This defines the denotational semanticsof type shifting in a mere rmuic~ manner since we have: r(s,~) ~ ~(q). l~jpo co.ion is d~u~! o. this ~a. 4.2 Restrictions on type coercion application and verb semantic classes Afirst result of our investigationsis that all the verbs of a given class are subject to the same set of type c~m~ions.By ’type of coercion’ we meancoercion on the agenfive, relic, formal or constitutive roles of the argument There arevery fewexceptions in French which areexpressions whichareeithertotally or partly inconect. Letusconsider thede/ay verbclass, composed oftheverbs: ajowner, remettre, repousaer andretmlr. All these verbs admit thefollowing typecoercions: - coe~ionon the relic or agentive roles of the subject argument" La nei&e retarde le train -> la tombe~ede la nei&e retarde le train (the (falling of the) snowdelays the train). Similarly, we can say: Laneige ajourne/repousae/remet lardunion. However,the use of remeareis slightly less usual than the three other verbs in this context. - cocoonon the constitutive or formal roles of the subject argumentLa banquerepousse/remetlajourne/retarde la re!union-> ie directeur de la banquerepousae/remet/ajourne/retarde la rcfunion this array shows, for each verb class, the type of coercions which are appropriate. For example, for the f’wst class (the grant verb class), the subject (S) and second object (02) arguments can be subject to type coercionon either their constitutive or formalroles. The relic andtheagentive roles are relevantfor the objectl. In this array, we can fbst notice that almost any class of verb accepts type coercion on the constitutive or formal role ofthe subject argumentandonone ofthe object arguments((31 or O2). This is not very surprising since it is very frequent to use a part of an object or a moregeneral term to refer to that object. Forexamplean institution is used for its members,its staff or its manager(typical cases are e.g. the following classes: grant, read, talk, report, ask anddebate). However, this type of coercion is naturally not possible whenthe semantics of the predicate requires the argumentto precisely indicate whatisat stake. This is the case for verbs of the followingclasses: forbid, save, differentiate andrefund (for theobject(s) argument(s)). Next,wecanalso noticethat verbshavinga certain degree of asl~nmllty can have a coercion on the tefic or agentiveroles of their object argument.Severalclasses of verbs are concerned.This situation is due to the fact that a numberof verbs select an object NPwhichis either an object or an event or state. This kind of ambiguityallows simple type coercion from the object type to the event type. This is the case for exampleof the followingverb classes: report, ask, debate.forbidandorganize. Thesame situation occurs for subject argumentsof a numberof other verbs such as: hasten, delay, benefit, save, change andimorove. Finally, and moregenerally, we can notice that there are basically two families of type coercion: coe~ious involving agentive or relic roles and coercion involving formal or consfitudve roles of the verb’s arguments. Thesetwofamilies correspondto substantially different forms of metonymies. (the (director of the) bank postpones/delays.., meeting). - coercion on the constitutive or formal roles of the object: Jeanreoousae/remet/ajourne/retarde far,union-> Jeanrepoasselremetlajoarne/retarde la datede la rdunion (John postpones the(date ofthe)meeting). - coercionon the telic or agentiveroles of the object: Jeanreoousselremet/ajoarne/retarde lad~cision-> Jean repouaselremetlajourne/retarde la prise de la d~c/.ffon. (Johnpostponesthe (taking of the) decision). Let us nowconsider the array given in the annex. This array thus def’mes in extension the appropriateness function introduced above. Ona sample of verb clas~, 155 To concludethis section let us considera fewexamples that will illustrate the array givenin the annex.Verbsof the amuseclass accept type coercion on the agentive or the relic roles of the subject argument as in: Le clownamuseles enfants whichshould be read as: Leaacres da clownamusentlesenfants (The clown amusesthe children -> The actions of the clownamusethe children). Put verbs accept type coercion on the constitutive or formal role of their object argument (or containercontaineeif there is sucha role): for example: mettre l’eau sur la table -> mettre lacarafed" eausur la table (Put the water on the table -> put the jug of water on the table). Wehave the same phenomenonfor e.g. rermTving and p~h-p~! verbs. tocontext variations, consider theabove example: Le clownanu~lesenfants-> L~: enfants sontam~s par/edown(passive alteration). (the clownamusesthe children) This 1~,~ sentencestandingfor:. 4.3 Type coercion and context variation Lesenfan:s sontamus4s par/e.s ac~ns duc/own. (Children areamused bytheactions oftheclown). Another important result is thatalternations donota Fromduit point of view, we mayconsider that this lnn~ priori preserve thepossibility ofusing typecoercion on type of coercion is moresuperficial (in tamsof theagentive or telicrolesof movedargument(s). betweenthe original and shifted types) than the coercion Reformulated inourapproach basedoncontexts instead basedon telic and agentiveroles. Besidesalternations, it ofaltm’nafions, thismeans thattypecoercion doesnot shonldbe noticed that there are other symacticeffects that applytmiformly to anycontext of a givenverb.The mayalso limit the application of type coercion. results given intheannex arebased ontheformwhich is commonly admitted tobe the"basic’ form,orthemost usual formofthepredica~ 5. Conclusion Therefore, sentences which are well-formed because typo coercioncan be applied on their "msic’ form, mayno longer he ~__~-ptablein other contexts. This is the case, for example,in: Jeanconunencesot livre. (John begins a book), standing for:. Jeanconmtencel’~crimred’un liw’e, but * Un livre commence is illformed, whereas: L’~cri~re d’~ ~ commence is c~Tecc The aboveexampleshowsthat type coercion on either the telic or the agentiveroles of the object argumentof the verb maybe no longer possible whenthat object is moved in subject position (See (Pnstejovsky and Bouillon 94) for a study of the semantics of Begin). We cannot howeverdraw any general conclusion from the aboveexample.. The relations betweencontexts and the possibility of applying type coercion seemto be very complex and related to ’deep’ semantic issues of predicates and to the relation they have with their arguments.It is, for example,perfectly correct to say: Le /ilm cow~nce, wherea~. * Le livrecommence. This difference maybe due to the fact that the projection of a film is a well-defined event, with a precise beginning, thnfion and end, whereasfor a bookwe don’t knowexactly what kind of event it is related to and what its exact duration is. According to (Pustejovsky and Bouillon 94) the rejection of a form like Le livre commence is due to the telicity of the event taken by the complement(book). This hypothesis sounds reasonable, but still remainsto be characterizedin moredepth. 33"pc coercion on the formal or constitutive roles seems to be less sensitive to context variations. In particular, type coercionon subjects are not sensitive to context variations. Similarly, type coercion on agentive or relic roles of the subject does not seemto he sensitive 156 In this document,we have proposed a more declarative reformulation of verb semantic classes as defined by B. Levin, based on the notion of syntactic contexLWehave applied these ideas to the construction of a lexical knowledgebase of about 2000verbs in French. Next. we have shownthat the notion of semanticclass is a very strong criteria for organizingthe semanticsof verbs (and of predicates, moregenerally). Verbsen-,antlc classes canalsobeusedto specify the contents of Qualia roles in the Generative Lexicon and to restrict the application of coacionto appmpriatccases, linguistically motic~ted. This paper contributes to the specification of what a word-seuseis and howits boundariescan be practically defined. Thisspecificationallowsfor a certain flexibility in the definitionof a ’sense’; it also specifiesits diffm~,nt syntactic uses. Finally, this workis a contributionto the definition of Qualiastructures andto the practical use of type shifting on a large scale. Defining precise methodsfor writing Qnalia structures on a large scale and for controlling the power of type shifting remain indeed largely open problems.Theydeserve a lot of attention since they are probably two of the main comerstones of the Generative Lexiconprinciples. Their study is a necessarycondition to the practical use of the GenerativeLexiconin ’real" applications. Acknowledgements I amvery grateful mJamesPustejovsky, Federica Busa, and PalmiraMarrafafor discussions on previous versions of this work, which helped improvingthe ideas and the results presented here. Bibliography Pugeault, E. Suint-Dizier, P., Monteil, M.G., KnowledgeExtraction from Texts: a method for extractingpredicate-argument structuresin texts, in prec. Coling94. Kyoto.1994. Pustejovsky.J.. The GenerativeLexicon, journal of Computational Linguistics, 1993. Pustejov~cy.J., Bouillon.P., Onthe properrole of coercion in semantic typing, inpro¢. Coling94, Kyoto,1994. Saint-Dizier, P.. A DenotationalSemanticsfor Type Coercionin the Generative lw.xicon,undersubmission, 1995. Dau~ze, S.,Uneclassification s~nantiquo desverbes dufrancuis bas~ surlaclassification deB.Levin, research ret~xt, IRIT, 1994. Daub~ze, S,,Saint-Dizier, P.,Marrafa, P., Constructing a knowledge baseforrepresentin 8 the &enera/semant/cJ of verbs, undersubmission, 1995. Jeckendoff,R., Seman~c Structures,MITPress, 1990. Grimshaw, J., Argument Structure, M1TPress, 1990. Levin, B.. English Verb Classes andAlternations, the University ofChicago Press. 1993. Sampleverbs of the class considered (,gram) accorder, acheter r ass/xner r a//ouer r aar/buer (Fovide) fournir,livrer, prac~er,servir ( ~ ) accept, acque~, recevoir,aaendre, ob~nir (withdraw) enlewr r extraire,~terr placer r prele~r {read)life, d~clamer, dicter,narrer,pr~cher, r~citer {talk) bamrder~r.ow/r~c~’, papoter, ar~rumenter (refxxt) annoncer, conmumiquer r rapporter,reveler {ask)demander, qmbnander, implorer,soUiciter ( debe~ ) dt~tre, disc~r,marchander, n~~ocier (orsa~.e)or/~aniser,renouveler r s~ucturer (amuse) amuser, charmer, absourdir, captiver (please) pre, pre,a,er, Telic role Ol Agentive role Ol Ol Ol Ol Ol 01, O1 Ol Ol Ol Ol Ol Ol Ol O1 S S Ol S, Ol S~Ol S~ Ol S S S S S (admire) admirer adorer, aimer envier, hal~, ~n/.rer Ol r r ST O1 (lmtm)accdlerer r ac~,er,hater,presser (_d~)aiourne SI Ol Grer~ure,repousser r retarder q~nefit) blur, profiter ST O1 S {conln’bute) compter, contr/buer,influer, d~pendre (save)Ipargner,rentab//iserr ~conom/ser. amort/r (change) c~ger,transformer~ m~tamo~hoser S [unpmve) aO’ermir arMliorer, redresser S , amoindrir, (cftfferen~,..) differencier, discriminer, distin~uer (refund) ~dommager, indemniser,rembourser, payer Constitutire role ST 02 ST 02 ST 02 S, O1~02 S, Ol S, Ol S, 02 ST 02 S, 02 S S T Ol S, Ol S, O1 S T O1 S, Ol S, O1 S T Ol S, O1 Formal role S, 02 S~ 02 $02 S~ O1~02 S, Ol St Ol St 02 ST 02 S, 02 S S~ Ol S~ Ol St O1 ST O1 S~ O1 S, O1 S~ O1 ST O1 Ol Ol S S Ol Ol S S Annex: Sampleof verb classesand appropriateness of type coercion w.r.t. Qualia roles of the arguments 1st column:between backersis an Englishtranslationof the mostrepresentativeverbof the class 2-Sthcolumns:S = subject,O1= objectl, 02 = object2of ’basic’ forms. 157