Research in Knowledge Acquisition in EXPECT: Current Research and Proposed KA Evaluations

Yolanda Gil, Bill Swartout, Jim Blythe, Jihie Kim, Surya Ramachandran, Andre Valente, Marcelo Tallis
Expect@isi.edu
www.isi.edu/expect/expect-homepage.html

This note describes three KA experiments that we would like to propose as part of the HPKB KA Critical Component Experiment (KA CCE). We first give an overview of the KA research topics within the EXPECT project, and then propose three experiments that address three specific claims in the context of this research.

Summary of Current KA Research in EXPECT

We are working on several areas to enhance and extend EXPECT's current knowledge acquisition capabilities:

(a) A Reusable PSM for Plan Evaluation and Critiquing. We are developing a general-purpose reusable Problem-Solving Method (PSM) for plan evaluation and critiquing that includes both ontologies and problem-solving knowledge. This PSM can be seen as a middle-level theory about plan evaluation. A user can build plan evaluation and critiquing tools by adapting this PSM to new domains using EXPECT's knowledge acquisition capabilities. The methods capture knowledge about desirable constraints on the structure of plans and about how to evaluate the use of different types of resources. The methods are structured so that the user does not have to be aware of the details of how the evaluation system works. EXPECT identifies the domain-specific knowledge that is needed to create a plan evaluation system for a new domain, and guides the user to provide it.

(b) Supporting KA through an Interdependency Model. EXPECT automatically generates an Interdependency Model (IM) that captures the relationships between different pieces of knowledge in a KB [Swartout and Gil 94]. EXPECT's KA tool uses this IM to determine what KA tasks the user needs to complete. We are extending EXPECT to detect additional kinds of interdependencies and constraints for any piece of knowledge in the system. A basic interdependency model captures the interdependencies between ontologies and problem-solving knowledge, i.e., how the ontologies are used during problem solving and what problem-solving knowledge is required given the ontological knowledge about the domain. In effect, this basic IM determines which aspects of a very large background ontology are relevant to the problem at hand. For example, the class city may have attributes such as latitude, longitude, airports, seaports, mayor, subway-system, restaurant, etc. In a transportation application, when the user wants to add a city to the KB, the IM would determine that only airport and seaport information is required, and would not ask the user to specify any of the other attributes (such as the restaurants or the name of the mayor). A dynamic interdependency model captures how a problem-solving method is related to other methods during problem solving. A static interdependency model captures the dependencies of each problem-solving method on the underlying KR and method representation languages.
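To make this relevance filtering concrete, the following is a minimal sketch in Python (EXPECT itself is Lisp-based and derives the IM automatically by analyzing the problem-solving knowledge; the hand-coded table and all names below, such as CITY_ATTRIBUTES and TRANSPORT_IM, are illustrative assumptions, not EXPECT's actual interfaces). It shows how a basic IM can restrict a large ontology class to the attributes a task needs, and how the missing ones become pending KA tasks:

    # Illustrative only: in EXPECT the IM is generated automatically;
    # here it is hand-coded for the transportation example.

    # All attributes of the class "city" in a large background ontology.
    CITY_ATTRIBUTES = ["latitude", "longitude", "airports", "seaports",
                       "mayor", "subway-system", "restaurant"]

    # Basic IM for a transportation application: only the attributes
    # that the problem-solving methods actually use.
    TRANSPORT_IM = {"city": ["airports", "seaports"]}

    def relevant_attributes(im, class_name, all_attributes):
        """Keep only the attributes that problem solving depends on."""
        return [a for a in all_attributes if a in im.get(class_name, [])]

    def pending_ka_tasks(im, class_name, all_attributes, provided):
        """Each missing-but-relevant attribute becomes a pending KA task."""
        return ["specify the %s of the new %s" % (a, class_name)
                for a in relevant_attributes(im, class_name, all_attributes)
                if a not in provided]

    # Adding a city with only its airports given: the tool asks for the
    # seaports, but never for the mayor or the restaurants.
    print(pending_ka_tasks(TRANSPORT_IM, "city", CITY_ATTRIBUTES,
                           {"airports": ["LAX"]}))
    # -> ['specify the seaports of the new city']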
(c) Agenda-Based KA. The baseline EXPECT system uses an agenda mechanism to guide users during KA. The agenda shows the user what KA tasks remain to be done, based on the IM and on the different types of errors in the KB that EXPECT detects automatically, and guides the user to resolve them with the KA tools (see [Gil AAAI-94]). Current work extends this mechanism in several ways. First, the current agenda mechanism shows errors without indicating their gravity or helping the user determine which ones to work on first (e.g., grammar errors should be fixed before errors about unmatched goals). Another significant limitation concerns the errors in the agenda, which are 1) not specific enough, in that they do not provide enough information to understand what the problem is; 2) incomplete, in that not all errors in the KB are detected by EXPECT; and 3) not recovered from appropriately. This work involves extending EXPECT's underlying error detection mechanism to address these problems, and will improve all of EXPECT's individual KA tools.

(d) Script-Based KA. We are developing a new approach to KA based on the use of KA Scripts that capture typical knowledge base modification sequences. Our tool uses these KA scripts to help users make changes to a knowledge base. We have a principled set of dimensions to organize and populate our library of KA Scripts. Several evaluations have been performed with this tool (see [Gil & Tallis AAAI-97]).

(e) English-Based Method Editor. We are developing an interface that allows users to modify methods by manipulating their paraphrase in English. It allows the user to select a portion of the paraphrase that corresponds to a valid expression and to pick from a menu of suggestions for things it can be replaced with. Generating these suggestions is one of the challenging aspects of this work.

(f) Support for Creating New KBs or Significantly Extending Existing Ones. At the beginning there is not much knowledge from which to form expectations for KA, but a KA tool can create more expectations as the user enters knowledge. This tool tries to help a user create a KB without errors before the problem solver is run. The tool builds expectations based on the representation language (it includes a method editor with adaptive forms), on surface interdependencies (as opposed to the deeper interdependencies detected by the problem solver), and on a restricted language for users to specify KA constraints and tasks.

(g) KA Tools for Domain Experts. We are designing a series of realistic KA tasks in the domain of Air Campaign Planning, where we have developed an extensive KBS with EXPECT called INSPECT. INSPECT is a tool that evaluates air campaign plans for consistency, completeness, and use of resources. We are investigating what kinds of changes users would want to make to this knowledge base, and how EXPECT's KA tools can support users in making these modifications. In addition, we are collaborating with HCI researchers to analyze what kinds of user interaction and presentation of information are appropriate for end users (domain experts, i.e., users not familiar with EXPECT).

There is an ongoing effort to integrate and update the different editors and to make EXPECT more portable across platforms. In the new interface that we are developing, a Java-based client communicates with a Lisp-based server that contains all the EXPECT reasoners and KA routines. Currently some of the above editors are built in CLIM and others in Java, and in some cases they have been developed as add-on modules to the core EXPECT interface.
Instrumentation and Data Collection

EXPECT's KA tools are already instrumented to collect several kinds of information during a KA session:

- The times at which the user makes each individual modification to the KB, such as changing a problem-solving method
- A detailed description of each individual modification that the user makes to the knowledge base, such as the addition of a new substep to a problem-solving method, together with a description of that substep
- The pending KA tasks that EXPECT suggests to the user based on its analysis of the current knowledge base, such as the need to define a new problem-solving method to solve a currently unachievable subgoal (these pending KA tasks are the contents of the EXPECT agenda)
- The amount of new knowledge added, i.e., methods or concepts that did not exist in the initial knowledge base
- The number of methods and concepts that were modified, as well as the difference between the initial and final versions, described as a set of transformations to the initial version

Note that some of these measurements are related but provide complementary information. Measuring the number of pending KA tasks reflects the error rate over time, and is very useful for determining how close the user is, at each point in time, to completing the intended KB modification. The overall time to complete the change provides no such information: a user may be very close to completion and yet proceed in a way that causes unnecessary additional work. The transformations made to the existing methods tell us about the amount of new information in the end product, but not how the user got there. Recording the individual modifications captures the KA process itself, i.e., the strategy that the user followed to perform those transformations.
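As a rough sketch of how these measurements fit together, the record layout below is an illustration we introduce here (KAEvent, KASession, and the field names are assumptions, not EXPECT's actual logging format). Timestamping each modification together with the agenda size after it yields the pending-task curve that complements the overall completion time:

    from dataclasses import dataclass, field
    from typing import List, Tuple

    @dataclass
    class KAEvent:
        """One logged KA action (layout is illustrative, not EXPECT's)."""
        time: float           # seconds since the session started
        modification: str     # e.g., "added substep X to method M"
        pending_tasks: int    # size of the EXPECT agenda after the change

    @dataclass
    class KASession:
        events: List[KAEvent] = field(default_factory=list)

        def log(self, time: float, modification: str, pending: int) -> None:
            self.events.append(KAEvent(time, modification, pending))

        def pending_over_time(self) -> List[Tuple[float, int]]:
            """Error rate over time: how close the user was to completing
            the intended modification after each individual change."""
            return [(e.time, e.pending_tasks) for e in self.events]

    session = KASession()
    session.log(12.0, "added method estimate-seaport-capacity", 3)
    session.log(95.0, "defined concept seaport-capacity", 1)
    session.log(140.0, "fixed unmatched subgoal in evaluate-plan", 0)
    print(session.pending_over_time())  # [(12.0, 3), (95.0, 1), (140.0, 0)]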
Proposed KA Evaluations for EXPECT

We would like to propose several experiments to test three aspects of this research: the support provided by an interdependency model during KA, the utility of a plan evaluation and critiquing PSM (i.e., a middle-level theory about plan evaluation), and the use of natural language for acquisition tasks. Some of the other work described above is undergoing evaluations outside HPKB, such as the Script-based KA tool and the ACP KA effort. Similar evaluations may or may not be repeated for HPKB, depending on the resources available and the interest within the HPKB program.

We use the following dimensions to describe a KA experiment [Gil 98]:

- Hypotheses: claims about the approach that need to be tested
- KA task: what the user is asked to do with the KA tool
- Subjects: the type of background assumed of the users and their degree of expertise
- KB domain: the problem domain that the knowledge base covers
- Underlying KR: the knowledge structures, languages, and systems used to represent the KB
- Experimental setup: the procedure followed to prepare and execute the experiment
- Data collected: any data or information recorded during the experiment because it is expected to be useful in proving the claims
- Results reported: only a subset of the information recorded is typically found to be useful or interesting and worth reporting to the community at large
- Conclusions: a generalization of the results reported; some will be evidence for the original claims, and others will be surprising findings uncovered during the experiment or through analysis of the data collected

The last two items are typically determined after the experiment is done, but they are worth keeping in mind during its design. We first discuss the hypotheses and claims that we would like to make, and then describe our proposed experiments.

Hypotheses and Claims

Our experiments are designed to address the following general hypotheses and claims:

1) The use of an explicit model of knowledge interdependencies in a KBS (as described above) decreases the time to adapt an ontology and set of problem-solving methods to a new task or related domain (over that required using the same tools without the interdependency model), regardless of the level of experience of the user, because the interdependency model allows the KA tool to point out the remaining KA tasks to the user. More detailed models allow KA tools to provide better support.

2) Extending a prototype critiquer with new critiques will be faster and less error-prone if a general-purpose problem-solving method for plan evaluation is used to develop the critiquer, because the new critiques can be organized within the method ontology and because generic problem-solving knowledge can be reused.

3) An editor that manipulates an English paraphrase of problem-solving knowledge of the kind that EXPECT generates (task-based and decomposition-oriented), and that lets the user navigate related concepts and relations, allows less experienced users to make small modifications to the COA analyzer, because English is easier for experts to understand than a computer language.

With the description of each experiment we also note how we might quantify the benefit of using a large knowledge base; this is important for the programmatic claims. We would like to perform a pre-test with Jim Donlon in order to debug and validate the experimental setup. The experiments proposed here can be run using the EXPECT framework, and we would also like to design additional experiments in collaboration with other groups. For example, we would like to show that the Interdependency Models generated by EXPECT can be used by ontology editors to focus the acquisition of new domain knowledge for specific tasks.
(1) Extending the Breadth of a KB by Understanding Interdependencies

- Hypothesis: The use of an explicit model of knowledge interdependencies in a KBS (as described above) decreases the time to adapt an ontology and set of problem-solving methods to a new task or related domain (over that required using the same tools without the interdependency model), regardless of the level of experience of the user, because the interdependency model allows the KA tool to point out the remaining KA tasks to the user. More detailed models allow KA tools to provide better support.
- KA task: Add a new critique to the existing critiquer; for example, add a strength/weakness critique such as surprise. This typically involves adding new concepts and new problem-solving knowledge.
- Subjects: Users of different kinds (TBD) who understand the terms in the COA ontologies and know how to use EXPECT.
- KB domain: COA critiquing.
- Representation of knowledge acquired: problem-solving knowledge represented in EXPECT's language; concepts and facts represented in LOOM.
- Experimental setup: We will use different degrees of completeness of the interdependency model: no model; basic + dynamic IM only; basic + dynamic + static IM.
- Data to be collected: The changes made, the time taken, and the remaining KA tasks shown and not shown to the user.

NOTE: This experiment can be done with other systems, not just with EXPECT. As mentioned above, an Interdependency Model automatically generated by EXPECT could be used by other systems (such as ontology editors or Protégé) to complement and enhance their KA capabilities. This is possible because both systems are assumed to share an ontology; the IM could be communicated to the other system as an overlay on the shared ontology.
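As a rough illustration of what such an overlay might look like, here is a minimal sketch, continuing the earlier Python example (the triple format and all names are our assumptions, not an agreed interchange format): the IM is flattened into annotations that reference only terms of the shared ontology, so an ontology editor could read them without knowing anything about EXPECT's internals.

    # Hypothetical flattening of an EXPECT-generated IM into overlay
    # annotations on a shared ontology; the format is an assumption.
    TRANSPORT_IM = {"city": ["airports", "seaports"]}

    def im_as_overlay(im, task):
        """Emit (class, slot, annotation) triples over shared-ontology terms."""
        return [(cls, slot, "required-for-" + task)
                for cls, slots in im.items()
                for slot in slots]

    for triple in im_as_overlay(TRANSPORT_IM, "transportation-planning"):
        print(triple)
    # ('city', 'airports', 'required-for-transportation-planning')
    # ('city', 'seaports', 'required-for-transportation-planning')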
(2) Extending the Breadth of a KB through a Reusable PSM

In this experiment, subjects would extend a prototype critiquer with new critiques.

- Hypothesis: Extending a prototype critiquer with new critiques will be faster and less error-prone if a general-purpose problem-solving method for plan evaluation is used to develop the critiquer, because the new critiques can be organized within the method ontology and because generic problem-solving knowledge can be reused.
- KA task: Given a prototype critiquer that has some critiques implemented, add the methods and concepts required to implement two or three more critiques. This usually involves adding new concepts and new problem-solving knowledge.
- Subjects: Users familiar with AI who have some knowledge of KB design and of the domain.
- KB domain: COA analysis.
- Representation of knowledge acquired: problem-solving knowledge represented in EXPECT's language; concepts and facts represented in LOOM.
- Experimental setup: Provide the initial prototype critiquer and describe the new critiques to add. Provide one COA that has mistakes under the new critiques and one that does not, so the user can test the new knowledge. Varied across experiments: in some cases the prototype critiquer will be designed using the plan evaluation PSM, in others not. The tool to be used is EXPECT with a complete interdependency model. We may develop a questionnaire-style tool that helps less experienced users integrate a new critique in the right place in the ontology.
- Data to be collected: The time to add the new critiques and verify that they are correct. The number of new methods and concepts added and the modifications made to existing ones.

NOTE (opportunities to show benefit from a large KB): We can record which new concepts are added as the new critiques are added, and which of these would already be available in the larger KB.

NOTE: We may also test the possibility that the plan evaluation PSM library will make it possible for end users to add new critiques in certain circumstances, but this potential does not have the same status as the claims above.

(3) Modifying Existing Knowledge with an English-Based Method Editor

In this experiment, subjects would make modifications to the implementation of existing critiques.

- Hypothesis: An editor that manipulates an English paraphrase of problem-solving knowledge of the kind that EXPECT generates (task-based and decomposition-oriented), and that lets the user navigate related concepts and relations, allows less experienced users to make small modifications to the COA analyzer, because English is easier for experts to understand than a computer language.
- KA task: Modify existing problem-solving knowledge that implements a critique, e.g., extend the evaluation of some feature such as combat power.
- Subjects: Domain experts who are comfortable with computers and have been given a short familiarization session with EXPECT and the COA critiquer.
- KB domain: COA analysis.
- Representation of knowledge acquired: problem-solving knowledge represented in EXPECT's language; concepts and facts represented in LOOM.
- Experimental setup: Demonstrate the current critiquer. Show a faulty critique on a COA and explain how it should be changed. Varied across experiments: whether the English-based method editor is available. EXPECT will be used, and the critiquer will be based on the plan evaluation PSM.
- Data to be collected: The time to make the change, the correctness of the change, and the user's belief that the critiquer has been changed appropriately.

NOTE: Given time, an additional part of this experiment would ask users to add simple new critiques and would vary whether the critiquer used the PSM.