From: AAAI Technical Report SS-95-03. Compilation copyright © 1995, AAAI (www.aaai.org). All rights reserved.

A METHODOLOGY FOR MODELING SCIENTIFIC DISCOVERY

Sakir Kocabas*
uckoca@tritu.bitnet
Department of Artificial Intelligence
Marmara Research Center, PK21, Gebze, Turkey

Abstract: Computational modeling of scientific discovery has been emerging as an important research field in artificial intelligence. Building theoretical models of scientific development was until recently the exclusive domain of philosophers of science. With the advances in artificial intelligence, and especially in machine learning, opportunities have arisen for researchers in this field to test their learning methods in modeling scientific discovery. In the last fifteen years, a number of systems have been developed that model various discoveries ranging from 17th to 20th century physics and chemistry. However, a methodology for building and evaluating such models has still not been developed. This paper focuses on the elements of historical discovery models, and the methods for their systematic construction and evaluation.

1. Introduction

Recent research in the computational study of science has revealed a number of important aspects of science that were overlooked by the conventional study of science. Shrager and Langley (1990) describe the basic differences between the computational and the conventional philosophical approaches as follows: the conventional philosophical tradition focuses on the structure of scientific knowledge and emphasizes the evaluation of laws and theories, while the computational approach focuses on the processes of scientific discovery, including the activities of experimentation, data evaluation, and theory formation.
The distinction can be extended even further. The computational study of science is concerned not only with hypothesis formation, testing and verification, but also with a series of other issues in scientific research, ranging from formulating and selecting research goals, defining the research framework, and gathering and organizing related knowledge, through selecting research strategies, methods, tools and techniques, to designing experiments, data collection, hypothesis and theory formation, theory revision, and producing scientific explanations. Any of these research tasks may involve a variety of planning, classification and evaluation problems.

The computational study of science is more concerned with the methodological issues in science than with the logico-philosophical issues which are the main concern of conventional studies. Its main purpose is to investigate the processes that lead to discovery in science, and eventually to build models of scientific research which could be used as artificial research assistants.

Another discipline, the social study of science, deals with the social dimension of science, e.g., with how scientific communities form and interact, how research projects are developed into research programmes, how these evolve and terminate, and how research traditions develop in human societies. History of science, on the other hand, investigates scientific developments through the historical records, and provides a historical perspective on science.

The computational study of science draws ideas, perspectives, methods and data from conventional philosophical, social and historical studies, but it differs from these disciplines in some essential ways: i) it has a medium, a computational model, for the reconstruction and analysis of historical discoveries, ii) using such models, it can investigate the possible alternative routes to the discovery, and iii) it aims at assembling heuristics for developing models for assisting research in currently active research projects in science.

* Also affiliated with the Department of Space Sciences and Technology, ITU, Maslak, Istanbul, Turkey.

2. Types of Discovery

A methodology for the systematic evaluation of discovery models should first of all be capable of distinguishing between different types of discovery. In other words, it should provide a classification of discovery, so that one can identify a certain type in the history of science in relation to other discoveries. Kocabas (1991c) introduces an implicit classification, which can be reformulated as follows: 1) Logico-Mathematical/Formal Discovery, 2) Theoretical Discovery, and 3) Empirical Discovery. This classification is somewhat parallel to the categorization of knowledge by Kocabas (1992a), and reflects an order of diminishing degree of abstraction.

Logico-Mathematical/Formal Discovery: This type of discovery takes place, as the name suggests, in the abstract domain of logic and mathematics. Formal discovery takes place in a formal domain which involves abstract entities, their classes and properties. It requires logico-mathematical knowledge as background knowledge for inductive and/or deductive inference on domain knowledge. Examples of this type of discovery are the mathematical techniques and formal theories, from the invention of the decimal system and algebra to modern mathematics, and various axiom systems.

Theoretical Discovery: This type of discovery requires logico-mathematical, formal and theoretical knowledge, and in general results from theoretical analysis and synthesis. Some examples of theoretical discovery from the history of science are: a) the introduction of the special theory of relativity based on the Einstein-Lorentz transformations, b) Maxwell’s theory of electromagnetism based on his equations, c) Yukawa’s theory of nuclear forces and mesons, and d) Dirac’s theory of charge symmetry and antiparticles.
Empirical Discovery: Empirical discovery relies on experimental and observational data, as well as logico-mathematical and formal knowledge. Theoretical knowledge was not a prerequisite in the early empirical discoveries in the history of science, but in modern empirical research, such as in oxide superconductivity and "cold fusion" experiments, theoretical domain knowledge is necessary. Empirical discovery can be further divided into heuristic and experimental/observational discovery.

Heuristic discoveries take place in attempts to find qualitative and/or quantitative relationships in experimental data. Some examples of such discoveries are: a) Glauber’s formulation of acid-alkali theory in 17th century chemistry, b) Stahl’s discovery of componential models of compounds in 18th century chemistry, c) quantitative discoveries of simple physical laws in classical physics (e.g. Kepler’s laws, Boyle’s law, Ohm’s law), and d) the discovery of new quantum properties and their value distribution to elementary particles in particle physics.

Experimental/observational discoveries are usually initiated by technological inventions or innovations. Two examples are the discovery of superconductivity by Onnes following his invention of a method for liquefying helium, and the discovery of new particle interactions after the invention of the cloud chamber.

A number of computational systems have been developed in the last 15 years for modeling these different types of discoveries. Some of the earliest AI systems, such as Logic Theorist, were designed to prove theorems in logic. Among the more recent systems, AM (Lenat, 1979) stands out as a good example of modeling mathematical discovery. Lenat’s (1983) EURISKO, in its applications to naval fleet design, evolution, and three-dimensional circuit design, is a good example of a formal discovery system.

Examples of theoretical discovery models are PI (Thagard & Holyoak, 1985), ECHO (Thagard & Nowak, 1990), and GALILEO (Zytkow, 1990).
The first two systems can better be characterized as conceptual discovery systems and, as such, are closer to formal discovery systems. GALILEO, on the other hand, is an interesting example of discovery by theoretical analysis. The scarcity of research on modeling theoretical discovery in AI remains striking.

Empirical discovery is an extensively studied area in AI, and a number of computational models have been designed to investigate its various aspects. Empirical discovery systems can be divided into two main classes, qualitative and quantitative systems, although this distinction is sometimes irrelevant. Among the qualitative discovery systems, GLAUBER (Langley, et al., 1987), STAHL (Zytkow & Simon, 1986), STAHLp (Rose & Langley, 1986), BR-3 (Kocabas, 1991a), KEKADA (Kulkarni & Simon, 1988), AbE (O’Rorke, Morris & Schulenburg, 1990), COAST (Rajamoney, 1990), MECHEM (Valdes-Perez, 1992), and PAULI (Valdes-Perez, 1994) can be cited.

Among the quantitative discovery systems, BACON (Langley, et al., 1987), FAHRENHEIT (Zytkow, 1987) and IDS (Nordhausen & Langley, 1987) can be cited as prominent examples. BACON was the first successful example of quantitative discovery, and has also attracted the interest of philosophers of science. The IDS system, on the other hand, integrates quantitative and qualitative methods.

3. Methodology of Building Discovery Models

It should be stated at this stage that no discovery model can reflect every detail of a discovery process, except perhaps when the model itself is used in a real-life discovery. From this perspective, historical discovery models can at best be rational reconstructions of the discovery process. In building such models, it is essential to find out and assemble the knowledge that played a significant role in the discovery.

3.1 Collecting Historical Records

Collecting information about historical discoveries is not an easy task.
One can identify three main sources of historical record for scientific discovery: history of science books, scientific research reports, and the log books used by scientists during the experiments leading to the discovery. Most of the current discovery systems rely on publications on the history of physics and chemistry dedicated to a certain period. Scientific research papers and reports can be used for reconstructing more recent discoveries. Kocabas (1992a) uses such research reports and articles in science journals for reconstructing the discoveries in oxide superconductivity. Log books are not easy to obtain, as they remain personal property until they are published. It is no surprise that, among the discovery models, only Kulkarni & Simon’s (1988) KEKADA is based on a scientist’s log book.

3.2 Assembling the Historical Records in Standard Formats

Building a complex discovery model may require a good deal of time and effort. The main problem in this task is to assemble the necessary knowledge which may have been used in the discovery. It seems best to develop a standard format to assemble this knowledge in a structured way. This may include the following slots: Discovery (name, date and responsible scientist(s)), Historical Background, Available Technology, Empirical Knowledge, Theoretical Knowledge, Inputs, Algorithms, Heuristics, Results, Possible Alternative Results, and Effects of the Discovery. Figure 1 illustrates an example of this structured representation.

This format provides a knowledge level view of the discovery, and allows the construction of the model in a systematic way. It also helps to analyze and revise the model as necessary. Additionally, it shows the degree of detail at which the model can be built for the reconstruction of the discovery.

4. The Discovery Model

Computational models of discovery need to be evaluated in accordance with their type, i.e. as formal, theoretical or empirical models. However, there are some common points of evaluation.
These can be listed as follows: research goals; methods of knowledge representation; the size, order and role of initial knowledge; theory revision and search methods; methods of learning and discovery; generality of the system’s methods; and the system’s predictive abilities. We can now look at these in turn.

4.1. Research Goals

The research goals of a discovery system vary with its domain of interest and the methods that it employs. Some systems, such as AM (Lenat, 1979), EURISKO (Lenat, 1983) and GLAUBER (Langley, et al., 1987), aim at discovering new concepts, relationships, heuristics or general hypotheses. Other systems, such as BR-3 (Kocabas, 1991a) and AbE (O’Rorke, Morris & Schulenburg, 1990), start with an impasse and aim at consistency and/or completeness as their main goal, while discovery is a by-product of their activities. Yet others, such as COAST (Rajamoney, 1990) and GENSIM/HYPGENE (Karp, 1990), search for consistent explanations, and GALILEO (Zytkow, 1990) for more expressive laws.

The research methods of a system must be adequate for its research goals. For example, a consistency-oriented system must inevitably have theory revision capabilities, and a completeness-oriented system must have the ability to generate and test new concepts and hypotheses. A few systems, such as KEKADA (Kulkarni & Simon, 1988) and CER (Kocabas, 1989; 1992b), are capable of generating their own research goals by detecting problem states (inconsistencies, incompletenesses and anomalies) in their knowledge base. The system description of a computational model must clearly state its goals, or how they are generated.

Figure 1. Example of formatted data for the discovery of the Y-Ba-Cu-O superconductor.

Discovery Event
Discovery: Y-Ba-Cu-O oxide superconductor
Date of Discovery: 16th February, 1987. Paul Chu et al.
Source: Physics Today

Background
Historical Background/Problems: < to be completed >
Theoretical Background: Several theories of superconductivity had been developed. One of these was the BCS theory, which explains the phenomenon in terms of the conservation of angular and translational momentum. Current theoretical knowledge implied the impossibility of oxide superconductors with higher Tcs than metal or alloy superconductors. (The theories were based on the accumulated experimental knowledge.) Knowledge about the relationships between heat conductivity and electrical conductivity.
Types of Empirical Knowledge and Technology: Oxide superconductors had been known since 1973, when D. Johnston discovered superconductivity in LiTi2O4 at temperatures up to 13.7 K. In 1975, A. Sleight discovered superconductivity in BaPb(1-x)Bi(x)O3 with a Tc up to 13 K. In 1986, the La-Ba-Cu-O superconductor with a Tc around 35 K was discovered by Bednorz and Mueller. Knowledge about elements in the Periodic Table. Processes for the synthesis of double and triple oxide compounds. Element substitutions in such compounds.

Discovery Process
Discovery Goals: Search for oxides with higher Tcs than the La-Ba-Cu-O compound.
Inputs: La-Ba-Cu-O superconducting compound, chemical elements, knowledge about the synthesis of double and triple oxides.
Algorithms: Element substitutions in the La-Ba-Cu-O compound. Select an element from the Periodic Table with electronic properties similar to La, and substitute this element for La in La-Ba-Cu-O under the relevant experimental conditions.
Outputs: Substitution of Y for La in La-Ba-Cu-O, and the discovery of the Y-Ba-Cu-O superconductor.
Secondary Results: Other substitutions may be possible to yield better oxide superconductors.
Alternative Outputs: < to be completed >
Theory Development: The hypothesis that the substances with the highest Tcs are metal alloys was falsified once more.

Explanations
Types of Explanations: The role of crystal structure in oxide superconductivity was discussed.
Explanations were based on electron-phonon interactions.
New Research Problems: Could there be other oxide compounds with higher Tcs?
New Research Directions: Search for oxides with higher Tcs. Explain oxide superconductivity.

4.2. Knowledge Representation Methods

Knowledge representation remains an important issue in computational models, as it affects the efficiency of a system’s methods of search, learning and discovery. Early models (e.g. GLAUBER, STAHL and BR-3) employ relatively simple representation methods such as list structures and predicate expressions. Recent discovery systems (e.g. AbE, COAST, IDS) employ more structured knowledge representation schemes such as frames and qualitative process schemas, often in combination with predicate logic representation. Each representation scheme has its own advantages and disadvantages in terms of the implementation (see, e.g., Kocabas, 1991b for details). Therefore, the choice of knowledge representation schemes, or their integration, is an important issue in the design of computational models. Consequently, the system description of a model must explicitly state the knowledge representation methods that it employs, and how they are integrated.

4.3. The Order, Size, and the Role of Initial Knowledge

Initially, discovery systems were divided into two broad groups, data-driven and theory-driven systems. Later on, the distinction began to appear superficial, for some systems (e.g. STAHLp and BR-3) start as data-driven models and acquire theory-driven system characteristics during their operations. The size of the initial knowledge, and how much of it is utilized by a discovery system, is an important feature in the correct evaluation of that system. Some systems process data incrementally (e.g., STAHLp, BR-3, AbE, COAST), and the order of data given to the system affects its behavior (see, e.g., Kocabas, 1991a). If the discovery model is an incremental system, its description must evaluate the effects of data order. Data size is important for the evaluation of any discovery model, to test the effectiveness of its search methods.

4.4. Theory Revision and Search Methods

One of the prominent problems that haunt discovery systems with large search spaces is the control of search. Whatever search methods are used, the size of the effectively used initial knowledge base is a significant indicator of the system’s dimensions. Models with large search spaces utilize a number of search control methods. These can be as widely varied as logical constraints (as in STAHLp), algebraic constraints (as in BR-3 and GALILEO), general rules (as in EURISKO, BACON, KEKADA and IDS), and domain constraints (as in BR-3, KEKADA, AbE, COAST and GENSIM/HYPGENE). The description of a computational model must include its search methods, and explain why those particular methods are used rather than others.

Theory revision is becoming an indispensable feature of discovery models. This is in line with the understanding that most scientific discoveries are the results of generating and testing hypotheses. If the discovery system has theory revision capabilities, these must first be described in detail in general terms, and then explained with a particular example. Where, how and why the system’s search and theory revision methods fail also needs to be explained. Artificial data can be used in testing the effectiveness of a system’s theory revision and search methods.

4.5. Learning and Discovery Methods

Discovery systems utilize deductive and inductive methods, but until now there has been no discovery model that uses analogical reasoning in a non-trivial sense. Logico-mathematical, formal and theoretical discovery systems such as AM, EURISKO, PI, GALILEO and ECHO rely extensively on deductive methods, while BACON employs inductive methods. Systems like STAHL, STAHLp, BR-3, KEKADA and IDS employ both inductive and deductive methods. A system’s discovery methods cannot be separated from its search and theory revision methods.

4.6. Generality of Methods

Another important metric in the evaluation of a discovery model has been the generality of its discovery and search methods. Some discovery models, such as EURISKO and BACON, rely on rather general heuristics for their discoveries. Similarly, BR-3 employs algebraic rules to reduce its search space. However, there seems to be a limit to the use of such general heuristics, as systems with more, and more structured, domain knowledge must inevitably use domain heuristics for constraining search. Therefore, the size and the type of the discovery model must be considered in evaluating the generality of a system’s methods.

4.7. Predictive Abilities

Predictive ability can be defined as a system’s ability to generate a set of propositions which were undecidable prior to the discovery but are decidable afterwards. Predictive ability is an important feature of theoretical and empirical systems. Does the system’s predictive ability improve as it discovers new concepts, hypotheses or relationships? The answer to this question is also an indication of how effectively the system integrates and uses the knowledge that it has discovered. Discoveries of systems like BACON and GALILEO are validly applicable to an indefinite number of physical states. However, by themselves, these systems do not apply their knowledge to physical states. IDS, FAHRENHEIT and BR-3, on the other hand, effectively utilize the knowledge they have discovered in new problem states.

5. Conclusion

Computational modeling of scientific discovery has been emerging as an important research area in artificial intelligence, and the number of computational models is steadily increasing. A methodology for the systematic evaluation of these systems is necessary, not only for researchers in this field, but also for interested philosophers and historians of science. First of all, a methodology needs to be developed for building historical discovery models. Secondly, a method of classification for such models is needed for their systematic evaluation. Then a set of evaluation criteria needs to be identified, which can include the research goals, knowledge representation methods, the role of initial knowledge, theory revision and search methods, learning and discovery methods, generality, and the system’s predictive abilities. In this paper we have discussed these issues and provided examples for the methods to be used.

References

Darden, L. (1987). Viewing the history of science as compiled hindsight. The AI Magazine 8, No. 2, (33-42).

Karp, P.D. (1990). Hypothesis formation as design. In: J. Shrager and P. Langley (Eds.) Computational Models of Scientific Discovery and Theory Formation. Morgan Kaufmann, San Mateo, CA.

Kocabas, S. (1989). Functional categorization of knowledge: Applications in modeling scientific discovery. PhD Thesis, Department of Electronic and Electrical Engineering, King’s College London, University of London.

Kocabas, S. (1991a). Conflict resolution as discovery in particle physics. Machine Learning, 6, 277-309.

Kocabas, S. (1991b). A review of learning. The Knowledge Engineering Review, 6, 3.

Kocabas, S. (1991c). Computational models of scientific discovery. The Knowledge Engineering Review, 6, 259-305.

Kocabas, S. (1992a). Functional categorization of knowledge. AAAI Spring Symposium Series, 25-27 March 1992, Stanford, CA.

Kocabas, S. (1992b). Four levels of learning and representation in modeling scientific discovery. First Turkish Symposium on AI and Neural Networks, 25-26 June, Bilkent, Ankara.

Kulkarni, D. & Simon, H.A. (1988). The processes of scientific discovery. Cognitive Science, 12, 277-309.

Langley, P., Simon, H.A., Bradshaw, G.L., Zytkow, J.M. (1987). Scientific discovery: Computational explorations of the creative processes. Cambridge, MA: The MIT Press.

Lenat, D.B. (1979). On automated scientific theory formation: A case study using the AM program. In J. Hayes, D. Michie and L.I. Mikulich (Eds.) Machine Intelligence 9, (251-283). New York: Halstead.

Lenat, D.B. (1983). EURISKO: A program that learns new heuristics and domain concepts. Artificial Intelligence 21, Nos. 1-2, (61-98).

Nordhausen, B. & Langley, P. (1987). Towards an integrated discovery system. Proceedings of the Tenth International Joint Conference on Artificial Intelligence, 198-200.

O’Rorke, P., Morris, S. & Schulenburg, D. (1990). Theory formation by abstraction. In: J. Shrager and P. Langley (Eds.) Computational Models of Scientific Discovery and Theory Formation. Morgan Kaufmann, San Mateo, CA.

Rajamoney, S.A. (1990). A computational approach to theory revision. In: J. Shrager and P. Langley (Eds.) Computational Models of Scientific Discovery and Theory Formation. Morgan Kaufmann, San Mateo, CA.

Thagard, P. & Holyoak, K. (1985). Discovering the wave theory of sound: Inductive inference in the context of problem solving. Proceedings of the Ninth International Joint Conference on Artificial Intelligence, 610-612.

Thagard, P. & Nowak, G. (1990). The conceptual structure of the geological revolution. In: J. Shrager and P. Langley (Eds.) Computational Models of Scientific Discovery and Theory Formation. Morgan Kaufmann, San Mateo, CA.

Valdes-Perez, R. (1992). Theory-driven discovery of reaction pathways in the MECHEM system. Proc. of the Tenth National Conference on Artificial Intelligence (pp. 63-69). San Jose, CA: AAAI Press.

Valdes-Perez, R. (in press). Discovery of conserved properties in particle physics: A comparison of two models. Machine Learning.

Zytkow, J. (1987). Combining many searches in the FAHRENHEIT discovery system. Proceedings of the Fourth International Workshop on Machine Learning, Los Altos, CA: Morgan Kaufmann, 281-287.

Zytkow, J. (1990). Deriving laws through analysis of process and equations. In: J. Shrager and P. Langley (Eds.) Computational Models of Scientific Discovery and Theory Formation. Morgan Kaufmann, San Mateo, CA.

Zytkow, J. & Simon, H.A. (1986). A theory of historical discovery: The construction of componential models. Machine Learning, 1, 107-137.

References on Superconductivity

Khurana, A. (1987a). Search and discovery: Superconductivity seen above the boiling point of nitrogen. Physics Today, April, 1987, 17-23.

Khurana, A. (1987b). Search and discovery: Bednorz and Mueller win Nobel Prize for new superconducting materials. Physics Today, December, 1987, 17-19.
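
The slot-based format of Section 3.2, and its instance in Figure 1, can be illustrated with a short sketch. The Python fragment below is only one possible encoding and is not part of any system discussed in this paper: the class name DiscoveryRecord, the field names and the missing_slots helper are hypothetical choices made here for illustration, and the slot contents are abbreviated from Figure 1. Slots that Figure 1 leaves "to be completed" (and the Heuristics slot, which Figure 1 does not fill) are represented as empty lists, so that the record itself can report where the reconstruction is still incomplete.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class DiscoveryRecord:
    """One historical discovery, using the slots listed in Section 3.2.

    Field names are illustrative assumptions, not a published schema.
    """
    name: str
    date: str
    scientists: List[str]
    historical_background: List[str] = field(default_factory=list)
    available_technology: List[str] = field(default_factory=list)
    empirical_knowledge: List[str] = field(default_factory=list)
    theoretical_knowledge: List[str] = field(default_factory=list)
    inputs: List[str] = field(default_factory=list)
    algorithms: List[str] = field(default_factory=list)
    heuristics: List[str] = field(default_factory=list)
    results: List[str] = field(default_factory=list)
    alternative_results: List[str] = field(default_factory=list)
    effects: List[str] = field(default_factory=list)

    def missing_slots(self) -> List[str]:
        """Return the names of slots that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Slot contents abbreviated from Figure 1.
ybco = DiscoveryRecord(
    name="Y-Ba-Cu-O oxide superconductor",
    date="16 February 1987",
    scientists=["Paul Chu et al."],
    available_technology=["Synthesis of double and triple oxide compounds"],
    empirical_knowledge=["La-Ba-Cu-O superconductor, Tc around 35 K (1986)"],
    theoretical_knowledge=["BCS theory"],
    inputs=["La-Ba-Cu-O compound", "chemical elements"],
    algorithms=["Substitute an element similar to La in La-Ba-Cu-O"],
    results=["Substitution of Y for La yields Y-Ba-Cu-O"],
    effects=["Hypothesis that highest-Tc substances are metal alloys falsified"],
)

# Slots that still need historical research before a model can be built.
print(ybco.missing_slots())
```

Keeping unfilled slots explicit in this way makes the degree of detail available for a reconstruction visible before any model is built.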