From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. Robust Natural Language 1from Large-Scale Generation Knowledge Charles B. Callaway Bases James C. Lester (lester~adm.csc.ncsu.edu) (theori st@cs. utexas, edu Department of ComputerSciences Department of Computer Science of Texasat Austin The University North Carolina State University Austin,TX 78712-1188 Raleigh, NC 27695-8206 Abstract Wehavebegun toseetheemergence oflarge-scale knowledge bases thathouse tensof thousands offacts encoded inexpressive representational languages. Therichness ofthese representations offer thepromise ofsignificantly improving thequality ofnatural language generation, buttheir representational complexity, scale, andtask-independence posegreat challenges togenerators. Wehave designed, implemented,and empirically evaluated FARE, a functional realization systemthat exploits messagespecifications drawnfrom large-scale knowledgebases to create functional descriptions, which are expressions that encodeboth functional information(case assignment)and structural information (phrasal constituent embeddings).Given a messagespecification, FAREexploits lexical and grammatical annotations on knowledgebase objects to construct functional descriptions, whichare then converted to text by a surface generator. Twoempirical studies--one with an explanation generator and one with a qualitative modelbuilder--suggest that FARE is robust, efficient, expressive, and appropriate for a broad range of applications. 1 Introduction In recent years, the field of natural language generation has begunto mature very rapidly. In addition to encouraging results in the form of specific theories and mechanismsthat address particular generation phenomena,the field has witnessed the appearance of sophisticated, generic surface realization systems [12, 2]. A parallel of large-scale development in the knowledge representation community has been the emergence knowledge bases (LSKBs)that house tens of thousands of facts encoded in expressive representational languages [14, 8]. Because of the richness of their representations and the sheer volume of their formally encoded knowledge, LSKBsoffer the promise of significantly improving the quality of XSupport for this research is providedby a grant fromthe NationalScienceFoundation(IRI-9120310), a contract from the Air ForceOffice of Scientific Research{F4962’0-93-1-0239), anddonationsfromthe Digital Equipment Corporation. This research wasconductedat the Universityof Texasat Austin. 24 BISFAI-95 From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. natural language generation. However, the representational complexity, scale, and task-independence of LSKBspose great challenges to natural language generators. The objective of our research is to develop techniques for expressive, robust natural language generation that can take advantage of these developments in surface realization and large-scale knowledge 2base construction. To this end, we have designed, implemented, and empirically evaluated FARE, a functional realization system that exploits message specifications drawn from large-scale knowledge bases to create functional descriptions [2, 3], whichare expressions that encode both functional information (case assignments) and structural specification, information (phrasal constituent embeddings). Given a message FAREconstructs functional descriptions, which are then converted to text by the FuF surface generator [2, 3]. Wehave conducted these investigations in the "laboratory" provided by the Biology KnowledgeBase Project. The result of a seven year effort, the Biology KnowledgeBase [14] is an immense, task-independent representation of more than 180,000 facts about botanical anatomy and physiology. To study FARE’srobustness and range of applicability, it was evaluated with two very dif- ferent text planners: an explanation generator, KNIGHT [10, 11, 9], and a qualitative model constructor TRIPEL[15], which was augmented to produce text plans. 2 Functional Realization Classically, natural language generation hasbeendecomposed intotwosubtasks: planning, determining thecontent andorganization of a text,andrealization, translating thecontent to natural language. Realization canbe decomposed intotwosubtasks: functional realization, constructing functional descriptionsfrommessage specifications supplied by a planner; andsurface generation, translating functional descriptions to text.Ourworkfocuses on thedesign andimplementation of functional realizers, which translate message specifications intofunctional descriptions thatencode caseassignments andphrasal constituent embeddings. Figure 1 depicts a sample functional description (FD).Thefirstline,(catclause), indicates whatfollowswillbe sometypeof verbalphrase. Thesecondlinecontains thekeyword proc,which denotes thateverything in itsscopewilldescribe thestructure of theverbal phrase. Thenextstructure comesundertheheading partic; thisis wherethethematic rolesof theclause arespecified. In this instance, onethematic roleexists in themainsentence, theagent,whichis further defined by its lexical entry and a modifying prepositional phrase indicated by the keyword qua3.£f£er. The structure beginning with circum describes a subordinate infinitival A functional realizer purpose clause. must create a mapping between the content units in a message specification and the constituents and semantic roles in functional descriptions. To properly realize the frames and 2Functional AssignerofRoleEmbeddings. Callaway 2S From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. ((cat clause) (proc ((type (partic material) ((agent (lex ’’reproduce"))) ((cat common) (lex ’’spore’’) (qualifier ((cat pp) (prep === ’’from’’) (np ((cat (lex common) ’’cell’’) (classifier ((cat noun-compound) (classifier === ’’msgaspor@’’) (head wl= ’’mother,~))) (qualifier ((cat pp) (prep=== ’’in’’) (np ((cat common) (lex ’’eporaneium’’))))))))))))) (circum ((purpose ((cat clause) (position end) (keep-for (proc ((type material) (lex (partic ((agent ((semantics (affected ((cat (time-relater =ffi= ’’during ’’form’’))) {partic agent semantics}))) common) (lex (classifier no) (keep-in-order --- ’’gamete") ’’plant’’) (cardinal ((value 4) (digit male gamstophyte generation’’)) (definite (describer no) -m== "haploid’’) no))))))))))) Figure i: A Functional Description relations ina message specification, a functional realizer mustprovide five central functionalities: 1. Assign case roles to content units: To achieve semantic equivalence, a functional realizer must ensure that relations in the messagespecification are precisely mapped to case roles in an FD, e.g., ancestor-cell in Figure 2 should be mappedto agent. 2. Organize content units into embeddedphrase structures: A functional realizer must guarantee that complex content units are are properly packaged in FDs. For example, (Megaspore-MoCher-Cell concained-in Sporangium)in Figure 2 should becomean indivisible syntactic unit. 3. Provide local semantic information: A functional realizer is responsible for detecting andresolving semantic conflicts betweenlexical information and local semantic information on a messagespecification, e.g., whenconstructing an FDfor a creative sentence, such as Figure 2 calls for, it must knowthat infinitival objects should be indefinite rather than the default value of definite. 4. Control the inclusion of selected features: Messagespecifications frequently contain meta-level information, e.g., Include-During-Clause¢. specifies that a time-relater should be included if there are no lexical redundanciesbetween it and the head verb. 5. Abort generation of sentences with defective messagespecifications: A functional realizer is responsible for detecting missing case roles requiredfor properrealization. 26 BISFAI--95 From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. Specification--493 specification-type: Black-Box-Process-Description viewpoint-of: Male-Gametophyw-Generation reference-concept:Reproduction include-during-clause?:True descendant-cells: Plant-.-Gamcte cell-type: Haploid number-of-units:4 ancestor-cell: Spore sowrc¢: Megaspom-Mother-Cell contained-in: $porangium Figure 2: A SampleMessage Specification A Robust Functional Realization System 3 We havedesigned andimplemented a functional realization system, FARE3 (Figure 3),whichprovides thefivekeyfunctionalities discussed above. Givena message specification produced by a textplanner, FAREusesitsknowledge of casemappings, syntax, andlexical information to construct an FD,which it passes to FUF,a unification-based surface generator [2,3].4 FAREwasdeveloped withtextplanners thatemploy theBiology Knowledge Base[14],whichcontains morethan180,000 factsaboutbotanical anatomyand physiology. We have employed two textplanners: an explanation generator, KNIGHT [10,11,9],anda qualitative modelconstructor TRIPEL [15]thathasbeen"linguistically augmented." 3.1 Knowledge Sources FARE employsthree principle knowledgesources: a lexicon, a library of FD-Skeletons, and a library of semantic transformations. Each concept in the lexicon mayinclude several lexical features, and all lexical entries include the lezical type of the concept, whichindicates either a specific grammatical constituent, e.g., NP, or a subclass of constituents, such as relative clauses. In addition to lexemes and type information, the lexicon provides features such as mass-or-count?to store countability information that overrides default values. To create exceptions in the lexical ontology, the lexical access methods exploit inheritance mechanisms to find the most specific verb available, e.g, "elongate"instead of ~grow." 3FARE’s implementation consists of approximately 5,000linesof LucidCommonLisp. 4FI:Fis accompanied by an extensive, portable Englishgrammar, SURGE(thelargest"generation" grammarin existence), described by Elhadad as "theresultof fiveyearsof intensive experimentation in grammar writing." SURGEborrows the notionsof featurestructures and unification fromfunctional unification grammars [7] and systemsfromsgstemic grammars [5]. CaHaway 27 From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. FormativeRealizer (FARE) Message Specification ~ (KNIGHT, TRIPEL) Skeleton un Phrase Generator.[ Text l t Surface .Generator .[ Descriptions (Fu~ Figure 3: An Architecture forFunctional Realization Thesecond knowledge sourceusedby thefunctional realizer is a library of FD-Skeletons. An FDSkeleton consists of (1)a collection of semantic testsforinclusion andmodification of nested FDs,and (2) a template thatencodesthedeepstructure represented by an FD. Amongthe moreimportant various typesof FD-Skeletons arethosefordescribing processes. Currently we havedeveloped FDSkeletons forover20 processes thatplaya central roleinbiology, e.g.,assimilation, development, and reproduction. Figure 4 showsthetemplate of theFD-Skeleton forproducing functional descriptions of s transportation processes, Following in theNLG"revision" tradition [16,17,4, 10],Thethirdknowledge source usedby the functional realizer is a library of semantic transformations. For example, whengiven the triples (Water amount 4) and (H+ amount 3), without access to semantic transformations, the system would return the FDs representing the strings "4 portions of water" and "3 portions of hydrogen ion." Semantic transformations detect and correct problems of this sort. In this example, they determine that amount behaves differently for mass and count nouns, and they appropriately return "3 hydrogen ions" in the latter case. 3.2 FD-Skeleton Retrieval and Processing TheFD-Skeleton Retriever obtains theappropriate FD-Skeleton by indexing intoitslibrary. If thetopic of thespecification is a process, theFD-Skeleton Retriever exploits thetaxonomy of theknowledge base to locate themostspecific casestructure thatcanbe used.Otherwise, theretrieval process requires SNotethatthetemplate is merelyoneof two~omponents of FD-Skeletons; Skeletons alsoinclude semantic testsfor inclusion andmodification of nestedfunctional descriptions. 28 BISFAI-95 From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. ((catclause) (pro¢ ((type<VERB-TYPE>) (verb<VERB-STRING>))) (pattie((agent<TRANSPORTER-FD>) (affected<TRAISPORTED-ENTITIES-FD>) (location ((cat list) (distinct (<SOUKCE-FD> <CONDUIT-FD> <DESTINATIOI-FD>)))))) ’’dur~g <PROCESS-LEXEME>’J)) (time-relater Figure 4: The Template of an FD-Skeleton for Transportation no inference because each specification type points to a unique FD-Skeleton. Next, the FD-Skeleton Processor determines if each of the essential slots are present; if any of these tests fail, it will note the deficiency and abort. If the message is well-formed, the FD-Skeleton Processor uses the FDSkeleton as a template for forming an FD. It instantiates which is associated with a particular attribute the variables in the FD-Skeleton, each of that appears in the message specification. variable in the FD-Skeleton, the FD-Skeleton Processor obtains the nameof an attribute similar attributes) on the message specification. For each (or a group of For example, the FD-Skeletonthat is used to realize "structural" messagespecification obtains all "part" attributes in the messagespecification, e.g., parts and composed-of. The FD-Skeleton Processor then retrieves message specification. the values that appear on the selected attributes These values are used to instantiate of the variables in the FD-Skeleton. For example, in Figure 2, the descendant-cells and ancestor-cell relations are chosen, and their values are computed by the Noun Phrase Generator and inserted into the template. Next, the Noun Phrase Generator (described below) is invoked to construct an FDrepresenting the noun phrase expressing those values. Finally, the FD-Skeleton Processor splices all of the new noun-phrase FDs into the FD-Skeleton. 3.3 Noun Phrase Generation The NounPhrase Generator (Figure 5) is given a list of concepts, which are the values of the attributes of a message specification. If the NounPhrase Generator is given more than one concept, it recursively invokes itself on each of the concepts, thereby producing a conjoined list of FDs, each of which represents a noun phrase for one of the concepts. Next, it obtains the type of FDs that comprise the group. It then constructs an "enclosing" FDbased on the type, and it embeds each of the FDs resulting from the recursive invocations in the "enclosing" FDand finally returns this entire expression. If it is invoked Callaway 29 From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. MAKz-NouN-PIIRASE (Concept-List, Confetti) if length (Concept-List) > 1 then Functional-Deacription-Liet *-- 0 for each Concept in Concept-List do N euJ- Noun- P hr ase *-- make-noun-phrase((Concept)) Functional-Description-List *- enqueue (New-Noun-Phrase, Functional-Description-List) FD-Group-Type *- get-FD-group-type (Functional-Descriptions) Re#ult-FD ,.- make-complex-noun-phrase (Functional-Description-List, F D-Grou p- T trpe ) return ( Reeult-FD) else Concept~-. first (Concept-List) Functional-Description ,--- compute-lex-lnformation (Concept, Conte3:t) Concept-Attributes Deecrlber-Attributes *-- get-concept-attributes (Concept) *-- get-describer-attributes (Concept-Attributes) Cardinal-Attribute,, ~-- b, et-cardinal-attrihutes (Concept-Attributes) ReI-Clause-Attributea *-- get-rel-clatme-attributes (Concept-Attributes) Partitive-Attributes *-- get-partitive-attributes (Concept-Attributes) Describer-F D 4- make-describer-FD (Concept, Describer-Attributes, C ontezt ) Cardinal-FD *-. make-cardinal-FD (Concept, Cardinal-Attributes, Contezt) Relative-Clauae-FD *- make-rel-clause-FD (Concept, Rel-Clause*Attributes, Partitive-FD *-- make-partitive-FD (Concept, Partitive-Attributea, Result-FD *-- merge-FDs (Functlonsl-Description, Cardinal-FD, Contezt) Contezt) Describer-FD, Relatiue-Ciause-FD, Partiti,Je-FD) return (Resuit-FD) Figure 5: The MAKE-NOUN-PHRASE Algorithm with a single concept, it first obtains the lexical informationassociated with the concept. Its next task is to augmentthe basic lexical informationwith lexical informationaboutthe concept’s attributes. To do so, it obtains four types of attributes that mayappear on the concept: describer attributes, e.g., color; cardinal attributes, e.g., number-of-units; relative clause attributes, e.g., coveredby; andpartitive attributes, e.g., subregions. For example,in Figure 2, cell-type is a describer attribute for Plant-Gametewhile amountis a cardinal attribute. For each type that the NounPhrase Generator encounters, it constructs an FD of that group. Eachof these specialized construction functions may recursively invoke the algorithm and augmentthe existing context with information about the current phrase type. For example, the NounPhrase Generator mayaugmenta recursive call with the "number" of the enclosing noun phrase. In this case, the augmentationpermits the system to propagate number information to substructures such as relative clauses, whoseverb endings are influenced by the number feature of the enclosing nounphrase. Finally, the NounPhrase Generatormergesthe resulting FDs into the original lexicai information (also an FD) andreturns this expression to the FD-SkeletonProcessor. 30 BISFAI-95 From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. The final resulting FD for the message specification passes this to FUF, which realizes depicted in Figure 2 is shownin Figure 1. FARE it as, "During male gametophyte generation, the spore from the megaspore mother cell in the sporangium reproduces to form four haploid plant gametes." 4 Evaluation Because theconclusions of empirical studies should be considerably lessequivocal thanthosederived s To thisend,we from"proof-of-concept" systems, we havetakenan empiricxl approach to evaluation, conducted a formalevaluation of FAREin conjunction withan explanation generator andaa informal evaluation in conjunction witha qualitative modelbuilder. First, we evaluated FAREwithKNICHT [10, 11,9],a robust explaaation generator thatconstructs explanations aboutscientific phenomena. Working z in conjunction, KNIOHT,FARE, and FUFhaveproduced morethana thousand different sentences, including the following: "During egg fertilization, to form a zygote."; reproduction."; an angiosperm sperm cell fertilizes a plant egg cell "Embryosac development is a step of embryo sac formation, which is a step of "Sporogenesis occurs immediately before gametophyte generation."; "The root system is part of the plant and is connected to the mainstem."; "The subregions of the root system include the meristem, which is where root system growth occurs."; and, "During sperm cell transport, 2 angiosperm sperm cells are transported from the pollen tube to the embryosac." Our formal study employedtwo panels of domainexperts. Experts on the t~rst panel served as "writers," i.e., they produced explanations in response to questions. Experts on the second panel served as "judges," i.e., they analyzed different dimensionsof explanations and assigned grades. Noneof the judges were informed about the purpose of the study, and none were aware that they were judging computergenerated ezplanations. Judges were asked to rate the explanations on several dimensions, including overall quality and writing style. Using an A-F (4-0) scale, the system scored within approximately s"half a grade" of the biologists (Table 1). A second study provides evidence that FAREhas a broad range of applicability. To investigate FARE’Sability to realize messagespecifications for a significantly different kind of task, we augmented a qualitative model constructor, TRIPEL[15], with text planning commands. FAREused message specifications produced by the augmentedqualitative model constructor to generate FDs for sentences such as, "The rate of ABAsynthesis in the plant’s mesophyll cells is influenced by one thing: it is negatively affected by the turgor pressure in the plant’s mesophyll cells." Within three weeks, we were able to extend FAREto produce completely correct FDs for this new application: the text it generated eWith three significant exceptions ([6], [1], and [13]), the field of natural language generation has not witnessed the developmentof an "empiricist evaluation ~ school. 7On average, FARErequires 1-2 seconds on a DECAlpha to produce an FD. Sin the tables, 4- denotes the standard error, i.e., the staadard deviation of the mean. Callaway 31 From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. Organization IIene ,or II ooe .ll c°n’en’ Writing Correctness SYSTEM 2.37=1=o.]32.65±0.13 2.454-0.16 2.404-0.13 3.07±0.15 Huma41 2.85+o.15 2.93:l:o.16 3.16=1:o.]5 2.95:1:o.16 3.07+o.]6 Table 1: Comprehensive Analysis was favorably evaluated by both the domainexpert and the designer of the qualitative 5 modeling system. Conclusion We have designed, implemented, and evaluated FARE, a robustfunctional realization system.By exploiting aa expressive lexicon, as wellas libraries of functional description skeletons andsemantic transformations, FAREusesmessage specifications drawnfroma large-scale knowledge baseto create functional descriptions, whichaxethenrealized in textby a sophisticated surface generator, FUF.FARE provides fivekeyfunctionalities: it assigns caserolesto content units, orgaadzes content unitsinto embedded phrasestructures, provides localsemantic information, controls theinclusion of selected features, anddetects defective message specifications. Twoempirical studies suggest thatFAREis 9 robust, efficient, capable of producing quality text,andappropriate fora broadrange of textplanners. References [1] A. Cawsey. Ezplanation and Interaction: The Computer Generation of E~lanatory Dialogues. MIT Press, 1992. [2] M. Elhadad. FUF: The universal unifier user manual version 5.0. Technical Report CUCS-038-91, Department of Computer Science, Columbia University, 1991. [3] M. Elhadad. Using Argumentation to Control Lexical Choice: A Functional Unification Implementation. PhD thesis, Columbia University, 1992. [4] R. P. Gabriel. Deliberate writing. In D. D. McDonaldand L. Bolc, editors, Natural Language Generation Systems, pages 1-46. Springer-Verlag, NewYork, 1988. 9Wewouldliketo thank:BrucePorterfor leading the Biology Knowledge Baseproject and forproviding comments on earlier draftsof thispaper;Michael Elhadad, for developing and generously assisting us withFuF;KathyMitchell, Rich Jones,andTeresaChatkoff fortheirworkon FARE;ArtSouther, ourprinciple domainexpert; ErikEllerts, forbuilding the knowledge baseediting tools; PeterClark, fora.ssi-~t&nce withtheevaluation; JeffRickel, thedesigner of TRIPEL; ,rodthe othermembersof the BiologyKnowledge BaseProject,LianeAcker,BradBlumenthal, RichMallory,and Ken Murray. 32 BISFAI-95 From: Proceedings, Fourth Bar Ilan Symposium on Foundations of Artificial Intelligence. Copyright © 1995, AAAI (www.aaai.org). All rights reserved. [5] M. Halliday. System and Function in Language. Oxford University Press, Oxford, 1976. [6] E. Hovy. Pragmaties and natural language generation. Artificial Intelligence, 43:153-197, 1990. [7] M. Kay. Functional grammar. In Proceedings of the Berkeley Linguistic Society, 1979. [8] D. Lenat and R. Guha. Building Large Knowledge Based Systems. Addison-Wesley, Reading, Massachusetts, 1990. [9] J. Lester. Generating Natural Language E~planations from Large-Scale Knowledge Bases. PhD thesis, The University of Texas at Austin, Austin, Texas, 1994. [10] J. Lester and B. Porter. A revision-based model of instructional multi-paragraph discourse pro- duction. In Proceedings of the Thirteenth Cognitive Science Society Conference, pages 796-800, 1991. [11] J. Lester and B. Porter. A student-sensitive discourse generator for intelligent tutoring systems. In Proceedings of the International Conference on the Learning Sciences, pages 298-304, August 1991. [12] W. Mann. An overview of the Penman text generation system. In Proceedings of the National Conference on Artificial [13] V. Mittal. Intelligence, pages 261-265, 1983. Generating Natural Language Descriptions with Integrated Tezt and Ezamples. PhD thesis, University of Southern California, September 1993. [14] B. Porter, J. Lester, K. Murray, K. Pittmaa, A. Souther, L. Acker, and T. Jones. AI research in the context of a multifunctional knowledge base: The botany knowledge base project. Technical Report AI Laboratory AI88-88, University of Texas at Austin, Austin, Texas, 1988. [15] J. Kickel and B. Porter. Automated modeling for answering prediction questions: Selecting the time scale and system boundary. In Proceedings of the Twelfth National Conference on Artificial Intelligence, pages 1191-1198, 1994. [16] M. M. Vaughaa and D. D. McDonald. A model of revision in natural language generation. In Proceedings of the ~4th Annual Meeting, pages 90-96, Columbia University, 1986. Association for Computational Linguistics. [17] W.-K. C. Wongand K. F. Simmons. A blackboard model of text production with revision. Proceedings of the AAA1Workshop on Tezt Planning and Realization, In St. Paul, Minnesota, August 1988. Callaway 33