The Ontology of Paleobiology

Mathias Brochhausen
Institute of Formal Ontology and Medical Information Science
Saarland University, Saarbruecken, Germany

Paleobiology ontology “tour guide”
• What is going on in paleobiology?
• What is (an) ontology?
• What is going on in biomedical ontologies?
• Let's get started.

What is paleobiology?
• Paleobiology (sometimes spelled palaeobiology) is a growing and comparatively new discipline which combines the methods and findings of the natural science biology with the methods and findings of the earth science paleontology.
• Wikipedia, 09 July 2009

What are the subdisciplines?
• Paleobotany
• Paleozoology
• Paleoanthropology
• Paleoecology
• Taphonomy
• Evolutionary developmental paleobiology

Why do we need ontologies in paleobiology?
• To make comparative studies possible across time (e.g. in paleoecology), across space (e.g. in evolutionary developmental paleobiology), and especially between paleobiological and recent data.

Data in paleobiology are extremely sparse.
• Note that this is not a number for paleobiological specimens, but for prehistorical ones. We expect the number for paleobiology to be even smaller.

Time in paleobiology
• 3 500 000 000 B.P.: Oldest stromatolite fossils
• 7 000 000 B.P.: Oldest possible hominine fossil
• 160 000 B.P.: Oldest Homo sapiens idaltu

State of the art:
• What is going on with respect to data collections for paleobiology?
• What is going on with respect to biological ontologies?

http://www.ucmp.berkeley.edu/pdn/pdnhomelinks.htm
http://paleodb.org/cgi-bin/bridge.pl

Databases from Delson et al., 1
• Primate Morphology Online, PRIMO
• Human Origins Database, HUD
• Smithsonian Paleoanthropology Database
• Revealing Human Origins Initiative, RHOI
• Neanderthal Studies Professional Online System, NESPOS
• Ancient Human Occupation of Britain, AHOB
• digital@rchive for Fossil Hominoids

Databases from Delson et al., 2
• European Virtual Anthropology Network, EVAN
• Siwalik Database Project
• Neogene Old World Mammals, NOW
• Knowledge-based Archaeological Data Integration System, KADIS
• Transvaal Museum Database
• National Museum of Kenya Database, NMK

Databases from Delson et al., 3
• Institute of Vertebrate Paleontology and Paleoanthropology Site Database, IVPP
• AMNH Vertebrate Zoology Catalogue
• Paleoportal

The situation regarding paleobiology relevant databases:
• There already exists a huge amount of distributed data.
• Some of the databases are extremely restricted in coverage, e.g. HUD.
• Others are restricted regarding their domain.
• This will cause problems with respect to crossdisciplinary studies.

What is an ontology?
• Ontology is concerned with categorizing the elements of reality.
• Ontology as a branch of philosophy is the science of what is, of the kinds and structures of the objects, properties and relations in every area of reality. In simple terms it seeks the classification of entities (B. Smith).
• An ontology is a formal explicit specification of a shared conceptualization (R. Studer et al.).
• An ontology is a formal explicit specification of universals in reality and the relations existing between these universals.

The entities can be viewed from different perspectives. Ontologies provide reference for multiple sources of data. The aim is to foster semantic integration of data stored in separate sources.

http://www.obofoundry.org/

What is the OBO Foundry?
• The OBO Foundry is a collaborative experiment involving developers of science-based ontologies who are establishing a set of principles for ontology development with the goal of creating a suite of orthogonal interoperable reference ontologies in the biomedical domain. The groups developing ontologies who have expressed an interest in this goal are listed below, followed by other relevant efforts in this domain.

The OBO Foundry and ontology evaluation
• The OBO Foundry provides one means to ensure high quality in ontology development.
• The principles of the OBO Foundry foster distributed development according to best practice.

OBO Foundry ontologies of interest to paleobiology:
• Environment Ontology
• Common Anatomy Reference Ontology
• Mammalian Phenotype Ontology
• Phenotypic Quality Ontology
• Gene Ontology

The situation regarding biological ontologies:
• The number of ontologies for the biological and biomedical arena is growing daily.
• Ontologies specifically addressing paleobiological issues are lacking in the OBO Foundry.

Case Study Paleoanthropology
• An important ontological resource with respect to paleoanthropology is the Foundational Model of Anatomy (http://sig.biostr.washington.edu/projects/fm), which is a member of the OBO Foundry.
• Cranial measurement points that are commonly used in Physical Anthropology are already in the FMA.

The CIDOC-CRM
ISO 21127:2006
http://cidoc.ics.forth.gr/index.html

CIDOC Conceptual Reference Model
• ...provides definitions and a formal structure for describing the implicit and explicit concepts and relationships used in cultural heritage documentation.
• ...provides a semantic framework for sharing information on cultural heritage.

We need the means to compare paleobiological data with recent biological evidence.

Case Study Paleoanthropology:
• Physical anthropology is the science of human variability in space and time.

Let's get started.

A paleobiology ontology toolkit:
• Decide on an ontology format and editor.
• Decide on an Upper Ontology.
• Survey the domain.
• Identify the tough ontological questions.

Ontology languages
• Web Ontology Language, OWL
• Open Biological Ontologies, OBO
For details on other languages see Gómez-Pérez et al. (2004) Ontological Engineering, Springer, London, Berlin, Heidelberg.

OWL sublanguages
• OWL Lite
• OWL DL
• OWL Full
http://www.w3.org/TR/owl-features/

Ontology editors
• Protégé (http://protege.stanford.edu)
• OBO-Edit (http://oboedit.org)
• many more, both commercial and open source

What is an Upper Ontology?
• An upper ontology is limited to concepts that are meta, generic, abstract and philosophical, and therefore are general enough to address (...) a broad range of domain areas. Concepts specific to given domains will not be included; however, this standard will provide a structure and a set of general concepts upon which domain ontologies (...) could be constructed (http://suo.iee.org).

Examples for Upper Ontologies
• Suggested Upper Merged Ontology, SUMO (http://suo.ieee.org/SUO/SUMO/index.html)
• Basic Formal Ontology, BFO (http://www.ifomis.org/bfo)
• Descriptive Ontology for Linguistic and Cognitive Engineering, DOLCE (http://www.loacnr.it/DOLCE.html)

Why should we use an Upper Ontology?
• Using an Upper Ontology fosters subsequent harmonisation with other pre-existing ontologies, for instance in the OBO Foundry.
• The existence of an Upper Level supports ontology evaluation.
• But most of all: Starting from an Upper Level helps to steer clear of epistemological considerations. It provides the right, ontological frame of mind.

Basic Formal Ontology
• philosophically sound Upper Ontology
• tested for biomedical and topographical ontology development
• developed by P. Grenon and B. Smith
• OWL implementation by H. Stenzhorn

BFO: The basic divide
bfo:Entity
  snap:Continuant
  span:Occurrent

http://www.ifomis.org/bfo
Continuant
  Independent Continuant
    Material Object
      Object
      Fiat Object Part
      Object Aggregate
    Object Boundary
    Site
  Dependent Continuant
    Quality
    Realizable Entity
      Disposition
      Function
      Role
    Information Object

http://www.ifomis.org/bfo
Occurrent
  Processual Entity
    Process
    Fiat Process Part
    Process Aggregate
    Process Boundary
    Processual Context
  Spatiotemporal Region
  Temporal Region

A central problem:
bfo:Entity
  snap:Continuant
  span:Occurrent
Top down
Bottom up

Strategy
• Start the ontology development process by building a sound hierarchy.
• Make sure to use only the formal is_a relation in the hierarchy.
• Steer clear of multiple inheritance.

Formal is_a
• Given classes/types/universals A and B,
• A is a proper subclass/subtype/subuniversal of B
• if and only if all members of A are members of B and A is not equal to B.

Case Study paleoanthropology
Instances of material objects:

...and paleobiology?

Getting some terms straight:
• fossil - “something obtained by digging up”. Used for both fossilised material and nonfossilised material.
• to fossilise - to turn into stone; biomaterial is replaced with mineral substances that preserve the form.

Searching for paleobiological evidence we find:
• biological substrate
• mineralised morphologies
• trace fossils

• It is important that these different types of specimens are kept separate in the ontology,
• especially since the differences lead to differences in the kind of biological information we may derive from them.

Bones potentially give us full biological information, including histology and genetics. Stones conserve some biological features, especially the morphology.
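
The formal is_a definition and the advice against multiple inheritance given on the strategy slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: universals are approximated by finite sets of instance identifiers, the specimen names are hypothetical, and the helper names `is_a` and `classes_with_multiple_parents` are not part of BFO or any OBO tool.

```python
from typing import Dict, FrozenSet, List


def is_a(a: FrozenSet[str], b: FrozenSet[str]) -> bool:
    """Formal is_a on finite extensions: all members of A are members of B, and A != B."""
    return a.issubset(b) and a != b


def classes_with_multiple_parents(parents: Dict[str, List[str]]) -> List[str]:
    """Return the classes that have more than one asserted is_a parent."""
    return sorted(name for name, ps in parents.items() if len(ps) > 1)


# Hypothetical specimen identifiers standing in for instances (assumption).
material_entity = frozenset({"skull_1", "femur_2", "heap_of_shells_3"})
obj = frozenset({"skull_1", "femur_2"})

# Asserted hierarchy as a child -> parents map; class names follow the slides.
hierarchy = {
    "Object": ["MaterialEntity"],
    "ObjectAggregate": ["MaterialEntity"],
    "MaterialEntity": ["IndependentContinuant"],
}

if __name__ == "__main__":
    print(is_a(obj, material_entity))                # proper subclass
    print(is_a(obj, obj))                            # equal classes are not is_a
    print(classes_with_multiple_parents(hierarchy))  # classes violating single inheritance
```

A real ontology quantifies over all instances of a universal, so a finite set is only a stand-in; in practice an OWL reasoner would check subsumption.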

Given the growing importance of molecular methods in paleobiology, the distinction between bones and stones becomes more and more important.

Starting with a Middle Ontology for Paleobiology
snap:Object
  OrganicObject
  AnorganicObject

The organic-anorganic distinction in paleobiology ontology
• Organic Objects are results of biological processes.
• Anorganic Objects are not results of biological processes.
Note: The organic-anorganic distinction in paleobiology ontology differs considerably from the same distinction in chemistry.

The artefact problem

Introducing “Taphonomy”
• The term stems from the Greek word for “burial”.
• It refers to the scientific study of the decay and fossilisation of (former) organisms.
Reference: Shipman P (1981) Life History of a Fossil. An Introduction to Taphonomy and Paleoecology, Cambridge, Mass.

Influences creating artificial results in paleobiology
• Artefact
• Geofact
• Biofact

• Artefact: An object that has been changed by human influence (intentionally).
• Biofact: An object that has been changed by non-human, biological influence.
• Geofact: An object that has been changed by geological influence.

Diagram (CIDOC CRM): Physical Man-Made Object is a Physical Thing and is a Man-Made Thing.

Examples for man-made thing:
• Beethoven’s 5th Symphony
• Michelangelo’s David
• Einstein’s Theory of General Relativity
• The taxon Fringilla coelebs Linnaeus, 1758

Starting with a Middle Ontology for Paleobiology
snap:Object
  OrganicObject
  AnorganicObject

What about an artefact consisting of:
• a human skull
• clay
• human hair
• some shells?

Is it an object aggregate?
MaterialEntity
  Object
  ObjectAggregate

Is it an object aggregate?
Definition: A material entity [snap:MaterialEntity] that is a mereological sum of separate object [snap:Object] entities and possesses non-connected boundaries.
Examples: a heap of stones, a group of commuters on the subway, a collection of random bacteria, a flock of geese, the patients in a hospital.

MaterialEntity
  Object
    OrganicObject
    CombinedObject
    AnorganicObject
  ObjectAggregate

Combined Object
• Combined objects are by definition composed of proper parts, some of which are organic objects and some of which are anorganic objects.
• It follows that we need a property (relation) in our ontology linking proper parts to the objects they are proper parts of.
• For now, we do not need to address the question of whether combined objects that are not artefacts exist.

Properties/Relations for the paleobiology ontology
• Representing relations beyond the is_a relation is one of the chief advantages of ontologies over taxonomies.
• The paleobiology ontology ought to be oriented towards biological evidence, since comparative studies with recent biology constitute one of the main motives.

Introducing: Relation Ontology (RO)
• RO contains core relations used in the OBO Foundry ontologies.
• Formal definitions for the relations are given.
• RO can be imported into any OWL ontology.

Mineralised morphologies
• Fossilised specimens contain information about morphologies of past organisms.
• Morphology is the form of something.
• Information on forms can be given in 3D models, both virtual and real, based on either making a cast or taking exact measurements.

GenericallyDependentContinuant
  InformationObject
    Shape

Searching for paleobiological evidence we find (ordered by amount of biological information):
• biological substrate
• mineralised morphologies
• trace fossils
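
The Combined Object definition and the proper-part relation it calls for can be sketched as follows. This is a minimal illustration, not the ontology itself: the class `PaleoObject`, the flag `biological_origin`, and the function names are hypothetical, and treating an object whose parts are all organic as an OrganicObject is an assumption that the slides do not state. The example artefact follows the skull, clay, hair and shells case from the slides; which parts count as organic is likewise an assumption based on "results of biological processes".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PaleoObject:
    """A material object with an optional list of proper parts (hypothetical model)."""
    name: str
    biological_origin: bool = False  # True if the object results from a biological process
    proper_parts: List["PaleoObject"] = field(default_factory=list)


def leaf_parts(obj: PaleoObject) -> List[PaleoObject]:
    """Collect the parts that have no further proper parts (the object itself if it has none)."""
    if not obj.proper_parts:
        return [obj]
    leaves: List[PaleoObject] = []
    for part in obj.proper_parts:
        leaves.extend(leaf_parts(part))
    return leaves


def classify(obj: PaleoObject) -> str:
    """Classify as OrganicObject, AnorganicObject or CombinedObject from the leaf parts."""
    origins = {leaf.biological_origin for leaf in leaf_parts(obj)}
    if origins == {True}:
        return "OrganicObject"      # assumption: all-organic parts give an organic object
    if origins == {False}:
        return "AnorganicObject"
    return "CombinedObject"         # some organic and some anorganic proper parts


# Example artefact from the slides; the organic/anorganic flags are assumptions.
skull = PaleoObject("human skull", biological_origin=True)
clay = PaleoObject("clay", biological_origin=False)
hair = PaleoObject("human hair", biological_origin=True)
shells = PaleoObject("some shells", biological_origin=True)
artefact = PaleoObject("modelled skull", proper_parts=[skull, clay, hair, shells])

if __name__ == "__main__":
    for item in (skull, clay, artefact):
        print(item.name, "->", classify(item))
```

In an OWL ontology the `proper_parts` list would correspond to a part relation such as the one defined in the Relation Ontology rather than to a Python attribute.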

SpecificallyDependentContinuant
  Quality
  RealizableEntity
    Disposition
    Function
    Role
      Fossil

Biological Process: propagate, breathe, die, feed, excrete
Taphonomic Process: decay, fossilize, to be scattered, to be altered by intention
Curation/Research Process: recovery, conservation, measurement, DNA extraction

Occurrent
  ProcessualEntity
    Process
      BiologicalProcess
      TaphonomicProcess
      Curation/Research

Additional discussion: What are species?
• In most paleobiological subdisciplines species play a major role.
• Ontologically we need to distinguish between the status of “species” and individual species.

MaterialEntity
  Object
  ObjectAggregate

SpecificallyDependentContinuant
  Quality
  RealizableEntity
    Disposition
    Function
    Role

Conclusions
• Building a paleobiology ontology requires keeping track of ontological issues not commonly found in other biological ontologies.
• Keeping apart the subjects of research and the research process is far more difficult than in other biological disciplines.

If you have any questions or comments, or if you want to cooperate in making the paleobiology ontology fit for the OBO Foundry, please contact me:
• mathias.brochhausen@ifomis.uni-saarland.de