
From: AAAI Technical Report SS-95-03. Compilation copyright © 1995, AAAI (www.aaai.org). All rights reserved.
A METHODOLOGY FOR MODELING SCIENTIFIC DISCOVERY
Sakir Kocabas*
uckoca@tritu.bituet
Department of Artificial Intelligence
Marmara Research Center, PK21, Gebze, Turkey
Abstract: Computational modeling of scientific discovery has been emerging as an important research field in artificial intelligence. Building theoretical models for scientific development has until recently been the exclusive domain of philosophers of science. With the advances in artificial intelligence, and especially in machine learning, opportunities have arisen for researchers in this field to test the learning methods developed in modeling scientific discovery. In the last fifteen years, a number of systems have been developed modeling various discoveries ranging from 17th to 20th century physics and chemistry. However, a methodology for building and evaluating such models has still not been developed. This paper focuses on the elements of historical discovery models, and the methods for their systematic construction and evaluation.
1. Introduction
Recent research in the computational study of science has revealed a number of important aspects of science that were overlooked by the conventional study of science. Shrager and Langley (1990) describe the basic differences between the computational and the conventional philosophical approaches as follows: the conventional philosophical tradition focuses on the structure of scientific knowledge and emphasizes the evaluation of laws and theories, while the computational approach focuses on the processes of scientific discovery, including the activities of experimentation, data evaluation, and theory formation.
The distinction can be extended even further: the computational study of science is concerned not only with the issues of hypothesis formation, testing and verification, but also with a series of other issues in scientific research, ranging from formulating and selecting research goals, defining the research framework, and gathering and organizing related knowledge, through selecting research strategies, methods, tools and techniques, to designing experiments, data collection, hypothesis and theory formation, theory revision and producing scientific explanations. Any of these research tasks may involve a variety of planning, classification and evaluation problems.
The computational study of science is more concerned with the methodological issues in science than with the logico-philosophical issues which are the main concern of conventional studies. Its main purpose is to investigate the processes that lead to discovery in science, and eventually to build models of scientific research which would be used as artificial research assistants.
Another discipline, the social study of science, deals with the social dimension of science, e.g., with how scientific communities form and interact, how research projects are developed into research programmes, how these evolve and terminate, and how research traditions develop in human societies. History of science, on the other hand, investigates scientific developments through the historical records, and provides a historical perspective on science.
The computational study of science draws ideas, perspectives, methods and data from conventional philosophical, social and historical studies, but it differs from these disciplines in some essential ways: i) it has a medium, a computational model, for the reconstruction and analysis of historical discoveries, ii) using such models, it can investigate the possible alternative routes to the discovery, and iii) it aims at assembling heuristics for developing models for assisting research in currently active research projects in science.
* Also affiliated with the Department of Space Sciences and Technology, ITU, Maslak, Istanbul, Turkey.
2. Types of Discovery
A methodology for the systematic evaluation of discovery models should first of all be capable of distinguishing between different types of discovery. In other words, it should provide a classification of discovery, so that one can identify a certain type in the history of science in relation to other discoveries. Kocabas (1991c) introduces an implicit classification, which can be reformulated as follows: 1) Logico-Mathematical/Formal Discovery, 2) Theoretical Discovery, and 3) Empirical Discovery. This classification is somewhat in parallel with the categorization of knowledge by Kocabas (1992a), and reflects an order of diminishing degree of abstraction.
Logico-Mathematical/Formal Discovery: This type of discovery takes place, as the name suggests, in the abstract domain of logic and mathematics. Formal discovery takes place in a formal domain which involves abstract entities, their classes and properties. It requires logico-mathematical knowledge as background knowledge for inductive and/or deductive inference on domain knowledge. Examples of this type of discovery are the mathematical techniques and formal theories, starting from the invention of the decimal system and algebra to modern mathematics, and various axiom systems.
Theoretical Discovery: This type of discovery requires logico-mathematical, formal and theoretical knowledge, and in general results from theoretical analysis and synthesis. Some examples of theoretical discovery from the history of science are: a) the introduction of the special theory of relativity based on the Einstein-Lorentz transformations, b) Maxwell's theory of electromagnetism based on his equations, c) Yukawa's theory of nuclear forces and mesons, and d) Dirac's theory of charge symmetry and antiparticles.
Empirical Discovery: Empirical discovery relies on experimental and observational data, as well as logico-mathematical and formal knowledge. Theoretical knowledge was not a prerequisite in the early empirical discoveries in the history of science, but in modern empirical research, such as in oxide superconductivity and "cold fusion" experiments, theoretical domain knowledge is necessary. Empirical discovery can be further divided into heuristic and experimental/observational discovery.
Heuristic discoveries take place in attempts to find qualitative and/or quantitative relationships in experimental data. Some examples of such discoveries are: a) Glauber's formulation of acid-alkali theory in 17th century chemistry, b) Stahl's discovery of componential models of compounds in 18th century chemistry, c) quantitative discoveries of simple physical laws in classical physics (e.g. Kepler's laws, Boyle's law, Ohm's law), and d) the discovery of new quantum properties and their value distribution to elementary particles in particle physics.
Experimental/observational discoveries are usually initiated by technological inventions or innovations. Two examples are the discovery of superconductivity by Onnes following his invention of a method for liquefying helium, and the discovery of new particle interactions after the invention of the cloud chamber.
A number of computational systems have been developed in the last 15 years for modeling these different types of discoveries. Some of the earliest AI systems, such as Logic Theorist, were designed to prove theorems in logic. Among the more recent systems, AM (Lenat, 1979) stands out as a good example of modeling mathematical discovery. Lenat's (1983) EURISKO, in its applications to Naval Fleet Design, Evolution, and three dimensional circuit design, is a good example of formal discovery systems.
Examples of theoretical discovery models are PI (Thagard & Holyoak, 1985), ECHO (Thagard & Nowak, 1990), and GALILEO (Zytkow, 1990). The first two systems can better be characterized as conceptual discovery systems, and as such, are closer to formal discovery systems. GALILEO, on the other hand, is an interesting example of discovery by theoretical analysis. The scarcity of research on modeling theoretical discovery in AI remains striking.
Empirical discovery is an extensively studied area in AI, and a number of computational models have been designed to investigate its various aspects. Empirical discovery systems can be divided into two main classes as qualitative and quantitative systems, although this distinction is sometimes irrelevant. Among the qualitative discovery systems, GLAUBER (Langley, et al., 1987), STAHL (Zytkow & Simon, 1986), STAHLp (Rose & Langley, 1986), BR-3 (Kocabas, 1991a), KEKADA (Kulkarni & Simon, 1988), AbE (O'Rorke, Morris & Schulenburg, 1990), COAST (Rajamoney, 1990), MECHEM (Valdes-Perez, 1992), and PAULI (Valdes-Perez, 1994) can be cited.
Among the quantitative discovery systems, BACON (Langley, et al., 1987), FAHRENHEIT (Zytkow, 1987) and IDS (Nordhausen & Langley, 1987) can be cited as prominent examples. BACON was the first successful example of quantitative discovery, and it has also attracted the interest of philosophers of science. The IDS system, on the other hand, integrates quantitative and qualitative methods.
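The classification above can be written down as a small lookup table, which is one way to keep the assignment of systems to discovery types explicit when comparing models. The following Python sketch is only illustrative: the names `DiscoveryType`, `SYSTEM_TYPES` and `systems_of` are assumptions introduced here, and the table records only the assignments stated in this section.

```python
from enum import Enum


class DiscoveryType(Enum):
    """The three discovery types, listed in order of diminishing abstraction."""
    FORMAL = "logico-mathematical/formal"
    THEORETICAL = "theoretical"
    EMPIRICAL = "empirical"


# Assignments as given in Section 2; the table itself is a hypothetical helper.
SYSTEM_TYPES = {
    "Logic Theorist": DiscoveryType.FORMAL,
    "AM": DiscoveryType.FORMAL,
    "EURISKO": DiscoveryType.FORMAL,
    "PI": DiscoveryType.THEORETICAL,
    "ECHO": DiscoveryType.THEORETICAL,
    "GALILEO": DiscoveryType.THEORETICAL,
    "GLAUBER": DiscoveryType.EMPIRICAL,
    "STAHL": DiscoveryType.EMPIRICAL,
    "BACON": DiscoveryType.EMPIRICAL,
    "IDS": DiscoveryType.EMPIRICAL,
}


def systems_of(kind: DiscoveryType) -> list:
    """Return the names of all listed systems of the given type, sorted."""
    return sorted(name for name, t in SYSTEM_TYPES.items() if t is kind)


print(systems_of(DiscoveryType.THEORETICAL))
```

Only a subset of the empirical systems is listed to keep the sketch short; the remaining ones would be added in the same way.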
3. Methodology of Building Discovery Models
It should be stated at this stage that no discovery model can reflect every detail of a discovery process, except perhaps when the model itself is used in a real-life discovery. From this perspective, historical discovery models can at best be rational reconstructions of the discovery process. In building such models, it is essential to find out and assemble the knowledge that has played a significant role in the discovery.
3.1 Collecting Historical Records
Collecting information about historical discoveries is not an easy task. One can identify three main sources of historical record for scientific discovery: history of science books, scientific research reports, and the log books used by the scientists during their experiments leading to the discovery. Most of the current discovery systems rely on publications on the history of physics and chemistry dedicated to a certain period. Scientific research papers and reports can be used for reconstructing more recent discoveries. Kocabas (1992a) uses such research reports and articles in science journals for reconstructing the discoveries in oxide superconductivity. Log books are not easy to obtain, as they remain personal property until they are published. It is no surprise that, among the discovery models, only Kulkarni & Simon's (1988) KEKADA is based on a scientist's log book.
3.2 Assembling the Historical Records in Standard Formats
Building a complex discovery model may require a good deal of time and effort. The main problem in this task is to assemble the necessary knowledge which may have been used in the discovery. It seems best to develop a standard format to assemble this knowledge in a structured way. This may include the following slots: Discovery (name, date and responsible scientist(s)), Historical Background, Available Technology, Empirical Knowledge, Theoretical Knowledge, Inputs, Algorithms, Heuristics, Results, Possible Alternative Results, and Effects of the Discovery. Figure 1 illustrates an example of this structured representation.
This format provides a knowledge level view of the discovery, and allows the construction of the model in a systematic way. It also helps to analyze and revise the model as necessary. Additionally, it shows the degree of detail at which the model can be built for the reconstruction of the discovery.
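The slot format described above maps naturally onto a record type. The sketch below is a minimal illustration in Python, not a format prescribed by this paper: the class name `DiscoveryRecord`, the attribute names and the `missing_slots` helper are assumptions introduced here, and the example values are taken from Figure 1.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class DiscoveryRecord:
    """One historical discovery, with one attribute per slot of the format."""
    name: str
    date: str = ""
    scientists: List[str] = field(default_factory=list)
    historical_background: List[str] = field(default_factory=list)
    available_technology: List[str] = field(default_factory=list)
    empirical_knowledge: List[str] = field(default_factory=list)
    theoretical_knowledge: List[str] = field(default_factory=list)
    inputs: List[str] = field(default_factory=list)
    algorithms: List[str] = field(default_factory=list)
    heuristics: List[str] = field(default_factory=list)
    results: List[str] = field(default_factory=list)
    alternative_results: List[str] = field(default_factory=list)
    effects: List[str] = field(default_factory=list)

    def missing_slots(self) -> List[str]:
        """Names of the slots that are still empty, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# A partially filled record; the values come from Figure 1.
ybco = DiscoveryRecord(
    name="Y-Ba-Cu-O oxide superconductor",
    date="16 February 1987",
    scientists=["Paul Chu et al."],
    inputs=["La-Ba-Cu-O superconducting compound", "chemical elements"],
    algorithms=["element substitution in La-Ba-Cu-O"],
    results=["substitution of Y for La yields the Y-Ba-Cu-O superconductor"],
)
print(ybco.missing_slots())
```

Listing the empty slots gives a rough indication of how much detail is still to be collected before a model of the discovery can be built.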
4. The Discovery Model
Computational models of discovery need to be evaluated in accordance with their type, i.e. as formal, theoretical or empirical models. However, there are some common points of evaluation. These can be listed as follows: research goals; methods of knowledge representation; the size, order and role of initial knowledge; theory revision and search methods; methods of learning and discovery; generality of the system's methods; and the system's predictive abilities. We can now look at these in turn.
4.1. Research Goals
The research goals of a discovery system vary with its domain of interest and the methods that it employs. Some systems, such as AM (Lenat, 1979), EURISKO (Lenat, 1983) and GLAUBER (Langley, et al., 1987), aim at discovering new concepts, relationships, heuristics or general hypotheses. Some other systems, such as BR-3 (Kocabas, 1991a) and AbE (O'Rorke, Morris & Schulenburg, 1990), start with an impasse, and aim at consistency and/or completeness as their main goal, while discovery is a by-product of their activities. Yet others, such as COAST (Rajamoney, 1990) and GENSIM/HYPGENE (Karp, 1990), search for consistent explanations, and GALILEO (Zytkow, 1990) for more expressive laws.
The research methods of a system must be adequate for its research goals. For example, a consistency oriented system must inevitably have theory revision capabilities, and a completeness oriented system must have the ability to generate and test new concepts and hypotheses. A few systems, such as KEKADA (Kulkarni & Simon, 1988) and CER (Kocabas, 1989; 1992b), are capable of generating their own research goals by detecting problem states (inconsistencies, incompletenesses and anomalies) in their knowledge base. The system description of a computational model must clearly state its goals, or how they are generated.
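The idea of generating research goals from problem states can be illustrated with a very small sketch. The Python code below is not the mechanism of KEKADA or CER; it assumes, purely for illustration, a knowledge base given as (proposition, truth value) pairs and a list of propositions the system is expected to decide. The function name `detect_problem_states` and the goal strings are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Assertion = Tuple[str, bool]  # (proposition, asserted truth value)


def detect_problem_states(assertions: Iterable[Assertion],
                          open_questions: Iterable[str]) -> List[str]:
    """Turn inconsistencies and gaps in a knowledge base into research goals."""
    values: Dict[str, Set[bool]] = {}
    for prop, val in assertions:
        values.setdefault(prop, set()).add(val)

    goals: List[str] = []
    # Inconsistency: the same proposition is asserted both true and false.
    for prop in sorted(values):
        if len(values[prop]) == 2:
            goals.append(f"resolve inconsistency: {prop}")
    # Incompleteness: a proposition of interest has no asserted value at all.
    for prop in sorted(set(open_questions)):
        if prop not in values:
            goals.append(f"fill gap: {prop}")
    return goals


kb = [("p", True), ("r", True), ("r", False)]
print(detect_problem_states(kb, ["p", "q"]))
```

Anomalies, the third kind of problem state mentioned above, would need a model of expected observations and are left out of this sketch.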
Figure 1. Example of formatted data for the discovery of the Y-Ba-Cu-O superconductor.

Discovery Event
  Discovery: Y-Ba-Cu-O oxide superconductor
  Date of Discovery: 16th February, 1987. Paul Chu et al.
  Source: Physics Today
Background
  Historical Background/Problems: <to be completed>
  Theoretical Background: Several theories on superconductivity had been developed. One of these theories was the BCS theory, which explains the phenomenon in terms of the conservation of angular and translational momentum. Current theoretical knowledge implied the impossibility of oxide superconductors with higher Tcs than metal or alloy superconductors. (The theories were based on the accumulated experimental knowledge.) Knowledge about the relationships between heat conductivity and electrical conductivity.
  Types of Empirical Knowledge and Technology: Oxide superconductors had been known since 1973, when D. Johnston discovered superconductivity in LiTi2O4 at temperatures up to 13.7K. In 1975, A. Sleight discovered superconductivity in BaPb(1-x)Bi(x)O3 with a Tc up to 13K. In 1986, the La-Ba-Cu-O superconductor with a Tc around 35K was discovered by Bednorz and Mueller. Knowledge about elements in the Periodic Table. Processes for synthesis of double and triple oxide compounds. Element substitutions in such compounds.
Discovery Process
  Discovery Goals: Search for oxides with higher Tcs than the La-Ba-Cu-O compound.
  Inputs: La-Ba-Cu-O superconducting compound, chemical elements, knowledge about the synthesis of double and triple oxides.
  Algorithms: Element substitutions in the La-Ba-Cu-O compound. Select an element from the Periodic Table with electronic properties similar to La, and substitute this element for La in La-Ba-Cu-O under the relevant experimental conditions.
  Outputs: Substitution of Y for La in La-Ba-Cu-O, and the discovery of the Y-Ba-Cu-O superconductor.
  Secondary Results: Other substitutions may be possible to yield better oxide superconductors.
  Alternative Outputs: <to be completed>
Theory Development
  The hypothesis that substances with highest Tcs are metal alloys was falsified once more.
Explanations
  Types of Explanations: The role of crystal structure in oxide superconductivity was discussed. Explanations were based on electron-phonon interactions.
  New Research Problems: Could there be other oxide compounds with higher Tcs?
  New Research Directions: Search for oxides with higher Tcs. Explain oxide superconductivity.
4.2. Knowledge Representation Methods
Knowledge representation still remains an important issue in computational models, as it affects the efficiency of a system's methods of search, learning and discovery. Early models (e.g. GLAUBER, STAHL and BR-3) employ relatively simple representation methods such as list structures and predicate expressions. Recent discovery systems (e.g. AbE, COAST, IDS) employ more structured knowledge representation schemes such as frames and qualitative process schemas, often in combination with predicate logic representation. Each representation scheme has its own advantages and disadvantages in terms of the implementation (see, e.g., Kocabas, 1991b for details). Therefore, the choice of knowledge representation schemes or their integration is an important issue in the design of computational models. Consequently, the system description of a model must explicitly state the knowledge representation methods that it employs, and how they are integrated.
4.3. The Order, Size, and the Role of Initial Knowledge
Initially, the discovery systems were divided into two broad groups as data- and theory-driven systems. Later on, the distinction began to appear as superficial, for some systems (e.g. STAHLp and BR-3) start as data-driven models and acquire theory-driven system characteristics during their operations. The size of initial knowledge and how much of it is utilized by a discovery system is an important feature in the correct evaluation of that system. Some systems process data incrementally (e.g., STAHLp, BR-3, AbE, COAST), and the order of data given to the system affects its behavior (see, e.g., Kocabas, 1991a). If the discovery model is an incremental system, its description must evaluate the effects of data order. Data size is important for the evaluation of any discovery model to test the effectiveness of its search methods.
4.4. Theory Revision and Search Methods
One of the prominent problems that haunt discovery systems with large search spaces is the control of search. Whatever search methods are used, the size of the effectively used initial knowledge base is a significant indicator of the system's dimensions. Models with large search spaces utilize a number of search control methods. These can be as widely varied as logical constraints (as in STAHLp), algebraic constraints (as in BR-3 and GALILEO), general rules (as in EURISKO, BACON, KEKADA and IDS), and domain constraints (as in BR-3, KEKADA, AbE, COAST and GENSIM/HYPGENE). The description of a computational model must include its search methods, and explain why those particular methods are used rather than the others.
Theory revision is becoming an indispensable feature of discovery models. This is in line with the understanding that most scientific discoveries are the results of generating and testing hypotheses. If the discovery system has theory revision capabilities, first these must be described in detail in general terms, and then explained with a particular example. Also, where, how and why the system's search and theory revision methods fail needs to be explained. Artificial data can be used in testing the effectiveness of a system's theory revision and search methods.
4.5. Learning and Discovery Methods
Discovery systems utilize deductive and inductive methods, but until now, there is no discovery model that uses analogical reasoning in a non-trivial sense. Logico-mathematical, formal and theoretical discovery systems such as AM, EURISKO, PI, GALILEO and ECHO extensively rely on deductive methods, while BACON employs inductive methods. Systems like STAHL, STAHLp, BR-3, KEKADA and IDS employ both inductive and deductive methods. A system's discovery methods cannot be separated from its search and theory revision methods.
4.6. Generality of Methods
Another important metric in the evaluation of a discovery model has been the generality of its discovery and search methods. Some discovery models such as EURISKO and BACON rely on rather general heuristics for their discoveries. Similarly, BR-3 employs algebraic rules to reduce its search space. However, there seems to be a limit for the uses of such general heuristics, as systems with more and structured domain knowledge must inevitably use domain heuristics for constraining search. Therefore, the size and the type of the discovery model must be considered in evaluating the generality of a system's methods.
4.7. Predictive Abilities
Predictive ability can be defined as a system's ability to generate a set of propositions which were undecidable prior to the discovery but are decidable afterwards. Predictive ability is an important feature of theoretical and empirical systems. Does the system's predictive ability improve as it discovers new concepts, hypotheses or relationships? The answer to this question is also an indication of how effectively the system integrates and uses the knowledge that it has discovered. Discoveries of systems like BACON and GALILEO are validly applicable to an indefinite number of physical states. However, by themselves, these systems do not apply their knowledge to physical states. IDS, FAHRENHEIT and BR-3, on the other hand, effectively utilize the knowledge they discovered in new problem states.
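The definition of predictive ability in Section 4.7 can be made concrete with a small sketch. The Python code below rests on strong simplifying assumptions that are not part of this paper: knowledge is a set of atomic facts plus Horn-style rules, and a proposition counts as "decidable" when it can be derived by forward chaining (negation is ignored). The names `closure` and `newly_decidable` and the example propositions are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Rule = Tuple[FrozenSet[str], str]  # (premises, conclusion)


def closure(facts: Iterable[str], rules: Iterable[Rule]) -> Set[str]:
    """Forward-chain over the rules until no new proposition can be derived."""
    known = set(facts)
    rules = list(rules)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


def newly_decidable(facts: Iterable[str], rules_before: Iterable[Rule],
                    rules_after: Iterable[Rule],
                    queries: Iterable[str]) -> List[str]:
    """Queries that are derivable after the discovery but were not before it."""
    facts = list(facts)
    before = closure(facts, rules_before)
    after = closure(facts, rules_after)
    return [q for q in queries if q not in before and q in after]


facts = ["metal(x)"]
discovered = [(frozenset({"metal(x)"}), "conducts(x)")]
print(newly_decidable(facts, [], discovered, ["conducts(x)", "magnetic(x)"]))
```

Running the same comparison after each discovery step gives one rough way to check whether a system's predictive ability improves as it accumulates knowledge.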
5. Conclusion
Computational modeling of scientific discovery has been emerging as an important research area in artificial intelligence, and the number of computational models is steadily increasing. A methodology for the systematic evaluation of these systems is necessary, not only for researchers in this field, but also for the interested philosophers and historians of science. First of all, a methodology needs to be developed for building historical discovery models. Secondly, a method of classification for such models is needed for a systematic evaluation. Then a set of evaluation criteria needs to be identified, which can include the research goals, knowledge representation methods, the role of initial knowledge, theory revision and search methods, learning and discovery methods, generality and the system's predictive abilities. In this paper we have discussed these issues and provided examples for the methods to be used.
References
Darden, L. (1987). Viewing the history of science as compiled hindsight. The AI Magazine 8, No. 2, (33-42).
Karp, P.D. (1990). Hypothesis formation as design. In: J. Shrager and P. Langley (Eds.) Computational Models of Scientific Discovery and Theory Formation. Morgan Kaufmann, San Mateo, CA.
Kocabas, S. (1989). Functional categorization of knowledge: Applications in modeling scientific discovery. PhD Thesis, Department of Electronic and Electrical Engineering, King's College London, University of London.
Kocabas, S. (1991a). Conflict resolution as discovery in particle physics. Machine Learning, 6, 277-309.
Kocabas, S. (1991b). A review of learning. The Knowledge Engineering Review, 6, 3.
Kocabas, S. (1991c). Computational models of scientific discovery. The Knowledge Engineering Review, 6, 259-305.
Kocabas, S. (1992a). Functional categorization of knowledge. AAAI Spring Symposium Series, 25-27 March 1992, Stanford, CA.
Kocabas, S. (1992b). Four levels of learning and representation in modeling scientific discovery. First Turkish Symposium on AI and Neural Networks, 25-26 June, Bilkent, Ankara.
Kulkarni, D. & Simon, H.A. (1988). The processes of scientific discovery. Cognitive Science, 12, 277-309.
Langley, P., Simon, H.A., Bradshaw, G.L., Zytkow, J.M. (1987). Scientific discovery: Computational explorations of the creative processes. Cambridge, MA: The MIT Press.
Lenat, D.B. (1979). On automated scientific theory formation: A case study using the AM program. In J. Hayes, D. Michie and L.I. Mikulich (Eds.) Machine Intelligence 9, (251-283). New York: Halstead.
Lenat, D.B. (1983). EURISKO: A program that learns new heuristics and domain concepts. Artificial Intelligence 21, Nos. 1-2, (61-98).
Nordhausen, B. & Langley, P. (1987). Towards an integrated discovery system. Proceedings of the Tenth International Joint Conference on Artificial Intelligence, 198-200.
O'Rorke, P., Morris, S. & Schulenburg, D. (1990). Theory formation by abstraction. In: J. Shrager and P. Langley (Eds.) Computational Models of Scientific Discovery and Theory Formation. Morgan Kaufmann, San Mateo, CA.
Rajamoney, S.A. (1990). A computational approach to theory revision. In: J. Shrager and P. Langley (Eds.) Computational Models of Scientific Discovery and Theory Formation. Morgan Kaufmann, San Mateo, CA.
Thagard, P. & Holyoak, K. (1985). Discovering the wave theory of sound: Inductive inference in the context of problem solving. Proceedings of the Ninth International Joint Conference on Artificial Intelligence, 610-612.
Thagard, P. & Nowak, G. (1990). The conceptual structure of the geological revolution. In: J. Shrager and P. Langley (Eds.) Computational Models of Scientific Discovery and Theory Formation. Morgan Kaufmann, San Mateo, CA.
Valdes-Perez, R. (1992). Theory-driven discovery of reaction pathways in the MECHEM system. Proc. of the Tenth National Conference on Artificial Intelligence (pp. 63-69). San Jose, CA: AAAI Press.
Valdes-Perez, R. (in press). Discovery of conserved properties in particle physics: A comparison of two models. Machine Learning.
Zytkow, J. (1987). Combining many searches in the FAHRENHEIT discovery system. Proceedings of the Fourth International Workshop on Machine Learning, Los Altos, CA: Morgan Kaufmann, 281-287.
Zytkow, J. (1990). Deriving laws through analysis of process and equations. In: J. Shrager and P. Langley (Eds.) Computational Models of Scientific Discovery and Theory Formation. Morgan Kaufmann, San Mateo, CA.
Zytkow, J. & Simon, H.A. (1986). A theory of historical discovery: The construction of componential models. Machine Learning, 1, 107-137.
References on Superconductivity
Khurana, A. (1987a). Search and discovery: Superconductivity seen above the boiling point of nitrogen. Physics Today, April, 1987, 17-23.
Khurana, A. (1987b). Search and discovery: Bednorz and Mueller win Nobel Prize for new superconducting materials. Physics Today, December, 1987, 17-19.