From: AAAI Technical Report SS-02-06. Compilation copyright © 2002, AAAI (www.aaai.org). All rights reserved.

Mining Answers from Texts and Knowledge Bases: Our Position

Bruce Porter, Ken Barker, Paul Navratil, Dan Tecuci, James Fan, Peter Yeh
Department of Computer Sciences
University of Texas at Austin
porter@cs.utexas.edu

Peter Clark
Knowledge Systems Group
Boeing Mathematics and Computing Technology
peter.e.clark@boeing.com

Recent advances in question answering from text have shown that information retrieval, natural language processing and machine learning techniques can go a long way in retrieving answers to certain types of questions from large bodies of text. Questions requiring more reasoning and inference, or those whose answers require synthesis or explanation, are more difficult. Systems that reason over domain-specific knowledge bases are capable of more sophisticated behavior than answer retrieval systems, but are expensive in terms of their knowledge requirements. The problem of answering difficult questions from the knowledge expressed in text can be attacked from both ends: by improving answer retrieval from large corpora, and by making it possible for formal representations of knowledge contained in text to be authored more quickly and easily.

Research Interests And Experience

Our research group has interests in many aspects across the spectrum of this problem. We have experience in the knowledge representation issues involved in building large knowledge bases that capture knowledge contained in text as well as in simplifying the process of knowledge capture (Barker, Porter, and Clark 2001; Clark et al. 2001; Clark, Thompson, and Porter 2000; Clark and Porter 1997; Fan et al. 2001). We have worked in natural language generation of explanations from knowledge bases (Lester and Porter 1997) and reasoning for question answering (Rickel and Porter 1997; Clark, Thompson, and Porter 1999).
We have also investigated the relationship between text’s linguistic form and its meaning (Barker 1998; Barker and Szpakowicz 1998).

Knowledge Capture Tools For Domain Experts

We are currently doing research on a project under DARPA’s Rapid Knowledge Formation program that will allow domain experts to build knowledge bases easily and quickly. One of the ultimate goals of our research is to make it as easy for authors of text to encode formal representations of their knowledge as it is for them to build web pages of it. By putting intuitive knowledge engineering tools in the hands of the authors of text, we avoid the bottleneck of knowledge engineering having to go through knowledge engineers. And by bringing the rate of generating small knowledge bases closer to the rate of producing textual documents, we increase the amount of formally represented knowledge available for specific domains, allowing more sophisticated reasoning for question answering.

Our work on accelerating the process of knowledge engineering has three main parts: 1) building generic, reusable representations to seed the knowledge engineering task; 2) developing tools for intelligent knowledge access and integration; 3) investigating methods for revising and augmenting knowledge from information gleaned from text.

We are building a library of reusable, composable, domain-independent knowledge components (Barker, Porter, and Clark 2001). The library contains a small number of generic Entities, Events and Roles (Fan et al. 2001) and a restricted language for combining these to represent more complex knowledge. As more complex concepts are encoded, these become part of the library to be used as building blocks for more complex concepts still. We continue to expand the coverage and granularity of components to allow users to express more knowledge with less effort.

Knowledge Integration

In order to ensure that this system is intuitive to users unfamiliar with knowledge engineering, we are stressing the importance of allowing users to express themselves in familiar ways. They should not be burdened with the representational requirements of formal reasoning. To that end, we are investigating ways to bridge the gap between a user’s unrestricted vocabulary and the restricted vocabulary of our component library. In our development of the component library we have made use of dictionaries and other linguistic resources to make the library components intuitive. We have encoded links to WordNet (Miller 1990) for each component, allowing the system to guide the user to appropriate components through the WordNet hierarchy. We are also working on a system to map from a user’s casual linguistic structures to the precise structures required for reasoning in the knowledge base.

To integrate knowledge expressed by the user into a growing knowledge base, we plan to allow users to express knowledge imprecisely and to translate that imprecise expression into the exact form required in the knowledge base. For example, we expect the user to be able to refer to "airplane flaps" instead of the more precise "flap part of the wing part of an airplane"; "laser scalpel" instead of "laser fulfilling the purpose of scalpel as instrument of cutting". Our knowledge integration research will also investigate how to use prior knowledge to help in the interpretation of imprecisely expressed knowledge. Integration also requires the ability to expand or modify existing knowledge based on a user’s input.

Text, Knowledge Models And Question Answering

Another area of our interest is in the interplay between the tasks of information extraction from text, model construction, and question answering.
We are developing an architecture in which answering sophisticated questions is treated fundamentally as a task of model construction (as opposed to information retrieval), and in which these three tasks are tightly integrated (as opposed to a "waterfall" approach, in which information extraction results in a model, and then the model is subsequently used for question answering). In this architecture, question answering provides requirements for a model of the scenario of interest; background knowledge provides candidate components from which that model can be built; data suggests which of these candidate components are relevant; and the partially built model itself suggests new questions to pose to the text data. In other words, questions guide information retrieval; information retrieval suggests model components; and models suggest further questions. Through this cycle, a coherent picture of a scenario can thus be built.

Mining Answers From Texts And Knowledge Bases

The research interests of our group are a close match to the aims of the "Mining Answers from Texts and Knowledge Bases" symposium. We have experience in acquiring knowledge from text, in reasoning over knowledge bases for answer/explanation generation and in building tools to help end users (such as authors of specialized texts) build knowledge bases quickly and easily. We believe all three of these elements are essential to the development of the next generation of question answering systems.

References

Barker, K., Porter, B., and Clark, P. 2001. A Library of Generic Concepts for Composing Knowledge Bases. First International Conference on Knowledge Capture, 14-21. Victoria.

Barker, K. 1998. Semi-Automatic Recognition of Semantic Relationships in English Technical Texts. Ph.D. diss., School of Information Technology and Engineering, University of Ottawa.

Barker, K., and Szpakowicz, S. 1998. Semi-Automatic Recognition of Noun Modifier Relationships. In Proceedings of COLING-ACL ’98, 96-102. Montréal.

Clark, P., Thompson, J., Barker, K., Porter, B., Chaudhri, V., Rodriguez, A., Thoméré, J., Mishra, S., Gil, Y., Hayes, P., and Reichherzer, T. 2001. Knowledge Entry as the Graphical Assembly of Components. First International Conference on Knowledge Capture, 22-29. Victoria.

Clark, P., Thompson, J., and Porter, B. 2000. Knowledge Patterns. Knowledge Representation Conference (KR’2000).

Clark, P., Thompson, J., and Porter, B. 1999. A Knowledge-Based Approach to Question-Answering. In the AAAI’99 Fall Symposium on Question-Answering Systems, 43-51, CA: AAAI Press.

Clark, P., and Porter, B. 1997. Building Concept Representations from Reusable Components. In AAAI’97, 369-376, CA: AAAI Press.

Fan, J., Barker, K., Porter, B., and Clark, P. 2001. Representing Roles and Purpose. First International Conference on Knowledge Capture, 38-43. Victoria.

Lester, J., and Porter, B. 1997. Developing and Empirically Evaluating Robust Explanation Generators: The KNIGHT Experiments. Computational Linguistics 23(1):65-101.

Miller, G., ed. 1990. WordNet: An On-Line Lexical Database. International Journal of Lexicography 3(4).

Rickel, J., and Porter, B. 1997. Automated Modeling of Complex Systems to Answer Prediction Questions. Artificial Intelligence Journal 93(1-2):201-260.
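
The cycle described under Text, Knowledge Models And Question Answering (questions guide information retrieval, retrieval suggests model components, and the growing model suggests further questions) can be illustrated with a small sketch. This is not the architecture itself, only a toy under stated assumptions: the `Component` class, the `BACKGROUND` and `CORPUS` contents, the stopword list, and the use of plain keyword overlap in place of real information extraction are all hypothetical choices made for illustration.

```python
import re
from dataclasses import dataclass

# Assumption: a tiny stopword list so that matching uses content words only.
STOPWORDS = {"the", "a", "what", "where", "did", "do", "from", "in", "on"}


def tokenize(text):
    """Lowercase content words of a sentence or question."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


@dataclass(frozen=True)
class Component:
    """Hypothetical background-knowledge component."""
    name: str
    keywords: frozenset  # words in text that suggest the component is relevant
    follow_ups: tuple    # questions the component raises once it joins the model


# Hypothetical background knowledge and text data.
BACKGROUND = [
    Component("Flight", frozenset({"airplane", "flew"}),
              ("Where did the airplane land?",)),
    Component("Landing", frozenset({"landed", "runway"}), ()),
    Component("Surgery", frozenset({"scalpel"}), ()),
]

CORPUS = [
    "The airplane flew from Austin.",
    "The airplane landed on the runway in Seattle.",
    "The surgeon used a laser scalpel.",
]


def retrieve(question, corpus):
    """Toy retrieval: sentences sharing a content word with the question."""
    q_words = tokenize(question)
    return [s for s in corpus if q_words & tokenize(s)]


def build_model(question, corpus, background, max_rounds=5):
    """Run the question -> retrieval -> components -> new questions cycle."""
    model = []           # components selected so far, in order of selection
    agenda = [question]  # questions still to be posed to the text
    for _ in range(max_rounds):
        if not agenda:
            break
        current = agenda.pop(0)
        evidence = retrieve(current, corpus)          # questions guide retrieval
        words = set().union(*(tokenize(s) for s in evidence))
        for comp in background:                       # retrieval suggests components
            if comp not in model and comp.keywords & words:
                model.append(comp)
                agenda.extend(comp.follow_ups)        # model suggests new questions
    return [comp.name for comp in model]


if __name__ == "__main__":
    print(build_model("What flew from Austin?", CORPUS, BACKGROUND))
```

In this toy run, the first question retrieves only the sentence about Austin, which selects the hypothetical Flight component; Flight then poses a follow-up question whose retrieved text brings in Landing. A waterfall design would have stopped after the first extraction pass, which is the contrast the sketch is meant to show.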