Ontology: Not Just for Philosophers Anymore

Robert Arp, Ph.D.
- The Ontology Research Group (ORG) www.org.buffalo.edu
- The National Center for Biomedical Ontology (NCBO) www.bioontology.org

Special thanks to Barry Smith and Werner Ceusters for comments and material from articles, books, and presentations.

This work was funded by the National Institutes of Health through the NIH Roadmap for Biomedical Research, Grant 1 U 54 HG004028. Information on the National Centers for Biomedical Computing can be found at: http://nihroadmap.nih.gov/bioinformatics.

Three Parts To Talk:
I: Meanings of ‘Ontology’
II: Basic Formal Ontology (BFO)
III: The Vision and Mission of the Ontology Research Group (ORG)

Part I: Meanings of ‘Ontology’

(1) Philosophical Ontology
“I can fit wholesale evolution and a creating god into my ontology without contradiction.”
“Just because it has mental existence doesn’t mean it has ontological existence.”

(2) Domain Ontology
“I’m working on an ontology for annelids.”
“The Gene Ontology has data on that HOX gene.”

(3) Formal Ontology
“This upper level ontology should help organize these domains.”
“IEEE just came out with the latest version of SUO that may solve some of these problems.”

(1) Philosophical Ontology
- Ontos (being, existence) + Logos (word, account, explanation)
- The study of what is, of the kinds and structures of objects, properties, events, processes, and relations in every area of reality.
- Theoretical discipline concerned with accurately describing the taxonomy of all things that exist according to underlying entities and principles that make things:
  A) BE what they are.
  B) BE KNOWN AS what they are.
- Synonymous with classical Metaphysics.

PORPHYRIAN TREE
THING
  MATERIAL SUBSTANCE
    ANIMATE (Living) ENTITY
      LIVING ENTITY WITH SENSATION (ANIMAL)
        RATIONAL ANIMAL
          HUMAN, E.G., Plato, Aristotle, Dr. Sucheston, Dr. Arp
        NON-RATIONAL ANIMAL
      LIVING ENTITY W/OUT SENSATION (VEGETATION)
    NON-ANIMATE ENTITY
  IMMATERIAL SUBSTANCE
Cf. Linnean Taxonomy and The Periodic Table

To a certain extent, all of us are Philosophical Ontologists in that we naturally and automatically categorize any and all things in reality so as to understand, explain, control, dominate, and navigate reality.

Different Schools / Approaches to Philosophical Ontology
• We can’t know reality because we can’t get beyond our sensations, perceptions, and/or ideas of reality (Idealism)
• We can only know the theories, languages, concepts, or systems of beliefs about reality, and reality is what minds make it (Antirealism)
• We can know reality “out there” as a world beyond our minds, and reality, ultimately, is in no way (e)affected by our minds (Realism)
• Reality is one kind of thing: all mind (Mental Monism)
• Reality is one kind of thing: all matter (Material Monism)
• Reality is two kinds of things: mind and matter (Dualism)
• Only the Bible accurately depicts reality (Fundamentalism)
• Only science accurately depicts reality (Scientism)

(2) Domain Ontology
- Representation of the entities and relations existing within a particular domain of reality such as medicine, geography, ecology, or law, e.g., GO, FMA, EnvO.
- Opposed to ontology in the philosophical sense, which has all of reality as its subject matter.
- Ideally, provides a controlled, structured vocabulary to annotate data in order to make it more easily searchable by human beings and processable by computers.
- Synonymous (for some) with ‘Reference Ontology.’
- ‘Task’ or ‘Application’ Ontology: runs, uses, exploits a domain ontology.
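To make the "controlled, structured vocabulary" point concrete, here is a minimal sketch of annotating data with ontology terms and then searching through is_a links. The term names, record identifiers, and function names are all hypothetical and chosen only for illustration; real domain ontologies such as GO are far larger and use richer relations.

```python
# Hypothetical mini-vocabulary: each term points to its parent via is_a.
IS_A = {
    "earthworm": "annelid",
    "leech": "annelid",
    "annelid": "animal",
    "animal": "organism",
}

# Hypothetical data records, each annotated with one vocabulary term.
RECORDS = [
    ("sample-001", "earthworm"),
    ("sample-002", "leech"),
    ("sample-003", "animal"),
]


def ancestors(term):
    """Return the term itself followed by every term reachable via is_a."""
    chain = [term]
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain


def search(query_term):
    """Find records annotated with query_term or with any of its subtypes."""
    return [rid for rid, term in RECORDS if query_term in ancestors(term)]


print(search("annelid"))  # ['sample-001', 'sample-002']
```

Because every annotation comes from the shared vocabulary, a search for "annelid" also finds records annotated with the more specific terms, which a plain keyword match would miss.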
AN ONTOLOGY (Ontology Research Group):
“a representational artifact, comprising a taxonomy as its main part, whose representational units are intended to designate some combination of universals, defined classes, and certain relations between them.” *
E.G.,
- The Gene Ontology (GO)
- The Foundational Model of Anatomy Ontology (FMA)
- The Environment Ontology (EnvO)
* Smith et al., “Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain,” Proc KRMed 2006: http://ontology.buffalo.edu/bfo/Terminology_for_Ontologies.pdf

A REALISM-BASED ONTOLOGY:
“is built out of representational units which are intended to refer exclusively to (real) universals, and corresponds to that part of the content of a scientific theory that is captured by its constituent general terms and the interrelations between the universals denoted by these terms.” *
Contrasted with:
- Idealism-Based Ontology
- Antirealism-Based Ontology
* (again, ORG definition; Smith et al., Proc KRMed 2006, as cited above)

Informatics: The science of information collection, categorization, management, storage, processing, retrieval, and dissemination. (Arp’s rendition)

Bioinformatics:
“A discipline of quantitative analysis of information relating to biological macromolecules with the aid of computers.”
Jin Xiong, Essential Bioinformatics (Cambridge University Press, 2006), 3.
“…developed in the space occupied with mathematical and computational biology, biometry and biostatistics, computer science, cybernetics, molecular evolution, genomics and proteomics, genetics, and molecular and cell biology.”
Polanski and Kimmel, Bioinformatics (Verlag: Springer, 2007), 2-3.

Domain ontology is contrasted with:
- Database: stores data of ontology or whatever info.
- Rule-based Language (e.g., XSD): tells you how to store, control, and describe an ontology or whatever info.
- Thesaurus: taxonomy coupled with relations
- Taxonomy: terms and glosses organized into subsumed hierarchical relations
- Glossary: catalogue of glosses (translations) in a language
- Catalogue: set of terms with meanings
- Inventory: checklist of items, terms, entities
- Axiomatic Theory: formal system with clear rules and semantics
However, it is arguable that an Ontology can be characterized as a hybrid of a Taxonomy and an Axiomatic Theory.

Example Ontology: A Gene Ontology Example
[Figure BORROWED FROM: http://www.bio.davidson.edu/courses/genomics/2006/martens... 3DN A]

The Information Age: A Sea of Information
- Varying perspectives, methodologies, ideas, and… DATA
- More information than humans can handle
- Extraordinary depth, magnitude, and… CHAOS
- Plenty of human error
RESULT:
- More DOMAINS that are non-interoperable, non-communicative, isolated, insulated, encapsulated “silos” of information
- Lost at sea? In the sea?
[Figure labels: Genetics, Diseases, Ecology, Evolution, Primatology, Cardiology, Informatics]

Problems that Contribute to Being Lost at Sea:
- Dumb Beast
- Nonsense-In-Nonsense-Out
- Computer Solipsism
- Human Idiosyncrasy
- Tower of Babel
- Pressures from Insurance Companies
- Legal Pressures
- Human Error: Incorrect Thinking (IT)

IT: Simply Getting the Facts Wrong *
FROM GO, SNOMED, BRIDG, and UMLS
(1) “extracellular region is_a cellular component”
(2) “extrinsic to membrane part_of membrane”
(3) ‘derives from’ confused with ‘develops from’
(4) “both testes is_a testis”
(5) Animal =Def. “A non-person living entity…”
(6) “An ontology is the same thing as a database…”
(7) “An ontology is just a taxonomy…”
* N.B. It may be the case that the examples of IT used in this presentation have been resolved. No matter, (sadly) there are legion examples of IT to be found.
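One way to picture an ontology as a hybrid of a taxonomy and an axiomatic theory is to pair is_a links with a rule that the links must respect. The sketch below is only an illustration under stated assumptions: the category names "object" and "object aggregate" are borrowed loosely from BFO, the disjointness axiom between them is assumed for the example, and the second parent given to "both testes" is added by hand to show how a checker could expose an assertion like item (4) above. It is not how GO, SNOMED, or any real tool performs its checks.

```python
# Taxonomy part: each term lists its is_a parents (hypothetical content).
IS_A = {
    "left testis": ["testis"],
    "testis": ["object"],
    # Suspect assertion from the slide, plus an assumed aggregate parent.
    "both testes": ["testis", "object aggregate"],
}

# Axiomatic part: pairs of categories assumed to share no subtypes.
DISJOINT = [("object", "object aggregate")]


def ancestors(term):
    """Collect every term reachable from `term` by following is_a links."""
    found = set()
    stack = list(IS_A.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(IS_A.get(parent, []))
    return found


def violations():
    """List terms that fall under both members of a disjoint pair."""
    bad = []
    for term in sorted(IS_A):
        reachable = ancestors(term) | {term}
        for first, second in DISJOINT:
            if first in reachable and second in reachable:
                bad.append((term, first, second))
    return bad


print(violations())  # [('both testes', 'object', 'object aggregate')]
```

A bare taxonomy would accept "both testes is_a testis" without complaint; it is the added axiom that lets a machine notice something is wrong.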
IT: Lack of Clear and Coherent Definitions
FROM NCIT, BRIDG, and SNOMED:
(1) Try and Define: Cancer, Gene, Neuropathy, Disease, Infectious Disease, Bios Itself... admittedly difficult.
(2) Disease Progression =Def. “Cancer that continues to grow and spread,” and “Increase in size of tumor…,” and “The worsening of a disease over time”
(3) Person =Def. “Human being”
(4) “European is_a ethnic group”
(5) “Other European in New Zealand is_a ethnic group”
(6) “Mixed ethnic census group is_a ethnic group”

IT: Circular Definitions
FROM GO and BRIDG
(1) Hemolysis of red blood cells =Def. “The processes by which an organism effects hemolysis”
    Cf. Filtration of kidneys =Def. “The processes by which an organism effects filtration (of kidneys)”
(2) Ingredient =Def. “A substance that acts as an ingredient within a product. Note that ingredients may also have ingredients.”
(3) Protection from natural killer cell mediated cytolysis =Def. “The process of protecting a cell from cytolysis by natural killer cells”

IT: Examples Instead of Definitions
FROM BRIDG
(1) Adverse Event =Def.
    (a) “toxic reaction”…
    (b) “…untoward occurrence in a subject administered a pharmaceutical product…”
    (c) “An unfavorable and unintended reaction, symptom, syndrome, or disease encountered by a subject on a clinical trial…”
(2) Defeasibility =Def. “a line of communication that is terminated,” “boundaries for software”
Basic Mistakes in Definitions: 101. See Plato’s Euthyphro: “Holiness is what I’m doing in prosecuting my father…” At least one reason why we need Philosophers?

IT: Use-Mention Confusion
FROM BIRN, MeSH, NCIT, and HL7
(1) Mouse =Def. “Name for the species Mus musculus”
(2) “National Socialism is_a MeSH Descriptor”
(3) Conceptual Entities =Def. “An organizational header for concepts representing mostly abstract entities”
(4) Animal =Def. “a subtype of Living Subject representing any animal-of-interest to the Personnel Management domain”
(5) “living subject is_a code system”

IT: Conception/Perception vs. Reality Confusion
FROM NCIT and UMLS
(1) Living subject =Def. “An object representing an organism”
(2) Class performed activity =Def. “The description of applying, dispensing or giving agents or medications to subjects”
(3) Adverse Event =Def. “An observation of a change in the state of a subject that is assessed as being untoward…”
(4) Objective Result =Def. “An act of monitoring, recognizing and noting reproducible measurement…”
(5) “Individual allele is_a act of observation”
(6) “Cancer documentation is_a cancer”
(7) “Bacterium causes experimental model of disease”

Lost at Sea / Lost in the Sea
[Figure: separate domain ontologies, e.g., genetics, diseases, ecology, evolution, primatology, cardiology]

(3) Formal Ontology… Salvation
- A discipline which assists in making communication between and among domain ontologies possible by providing a common language and common formal framework for reasoning.
“The fundamental role of an ontology is to support knowledge sharing and reuse.”
J. Domingue and E. Motta, "A knowledge-based news server supporting ontology-driven story enrichment and knowledge retrieval," in Knowledge Acquisition, Modeling and Management: 11th European Workshop, EKAW 99 Proceedings, ed. D. Fensel and R. Studer (Berlin: Springer, 1999), 104.

(3) Formal Ontology… Salvation
- Concerns, at least:
  (a) adoption of a set of basic categories of objects
  (b) discerning what kinds of entities fall within each of these categories of objects
  (c) determining what relationships hold within and amongst the different categories in the domain ontology.
- Relies on philosophical ontology (thus, people like Smith, Ceusters, Goldberg, Arp and others in the Ontology Research Group doing this work).
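The three concerns above can be sketched in a few lines: (a) a small set of basic categories, (b) domain terms assigned to those categories, and (c) a query that uses the shared categories to reach across domains. Everything named here is an assumption made for illustration: the category labels, the two toy domain vocabularies, and the function names are hypothetical and are not taken from SUO, DOLCE, or BFO as published.

```python
# (a) Basic categories, with is_a links between them (None marks a root).
UPPER_IS_A = {
    "object": "continuant",
    "process": "occurrent",
    "continuant": None,
    "occurrent": None,
}

# (b) Two toy domain ontologies assign their own terms to a basic category.
DOMAINS = {
    "genetics": {"gene": "object", "transcription": "process"},
    "cardiology": {"heart": "object", "heartbeat": "process"},
}


def falls_under(category, target):
    """True if `category` equals `target` or reaches it through is_a."""
    while category is not None:
        if category == target:
            return True
        category = UPPER_IS_A.get(category)
    return False


# (c) One query over the shared categories works for every domain.
def terms_under(target):
    """Return sorted (domain, term) pairs classified under `target`."""
    return sorted(
        (domain, term)
        for domain, mapping in DOMAINS.items()
        for term, category in mapping.items()
        if falls_under(category, target)
    )


print(terms_under("occurrent"))
# [('cardiology', 'heartbeat'), ('genetics', 'transcription')]
```

Neither toy domain knows anything about the other, yet both answer the same question because they commit to the same upper-level categories.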
(3) Formal Ontology… Salvation
- Synonymous (for some) with ‘upper level,’ ‘higher-level,’ ‘top-level,’ ‘backbone,’ ‘general,’ ‘generic,’ ontology.
- Applied in bioinformatics, intelligence analysis, management science, and in other scientific and business fields, where it serves as a basis for the improvement of classification, information organization, and automatic reasoning… helping to navigate the sea of information.

(3) Formal Ontology… Salvation
EXAMPLES:
(a) SUO: Standard Upper Ontology
(b) DOLCE: Descriptive Ontology for Linguistic and Cognitive Engineering
(c) BFO: Basic Formal Ontology

Formal Ontology is like a “backbone” or “spine” making communication, interoperability, and optimal dissemination of information possible between and among domain ontologies.

From This To This
[Figure: separate domain ontologies (e.g., genetics, diseases, ecology, evolution, primatology, cardiology) shown first on their own, then connected through a Formal Ontology, e.g., BFO]

RESULT: No longer lost at sea or lost in the sea of information, biomedical or otherwise.
[Figure: Formal Ontology, e.g., BFO, linking the domain ontologies for genetics, diseases, ecology, evolution, primatology, and cardiology]

Part II: Basic Formal Ontology (BFO)

BFO: How Does It Work?
General Preliminaries
- Formal: “applicable to all domains of objects...” Barry Smith and David Woodruff Smith, The Cambridge Companion to Husserl, ed. Barry Smith and David Woodruff Smith (Cambridge: Cambridge University Press, 1995), 28.
- Relevancy
- Perspectivalism
- Granularity
- Fallibility

REALISM-BASED ONTOLOGY
(a) Universals
    (1) Real Things or Continuants: SNAP shots of reality
    (2) Real Processes or Occurrents: SPAN of time
(b) Relations (which are also universals of a different type)

SNAP
Universal: SNAP Object
Relation: is_a
Continuant
  Independent Continuant
    Object
    Object Boundary
    Object Aggregate
    Fiat Object Part
    Site
  Dependent Continuant
    Generically Dependent Continuant
    Specifically Dependent Continuant
      Quality
      Realizable Entity
        Disposition
        Function
        Role
  Spatial Region
    Zero Dimensional Region
    One Dimensional Region
    Two Dimensional Region
    Three Dimensional Region

SNAP Example:
Continuant
  Independent Continuant
    Object: Human Heart
    Object Boundary: Surface of the Heart
    Object Aggregate: All Hearts in This Room
    Fiat Object Part: A Biopsy of the Heart
    Site: Chest Cavity
  Dependent Continuant
    Generically Dependent Continuant
    Specifically Dependent Continuant
      Quality: Pink, Smooth
      Realizable Entity
        Disposition: Stops if No Circulation
        Function: Pumps Blood
        Role
  Spatial Region
    Zero Dimensional Region
    One Dimensional Region
    Two Dimensional Region
    Three Dimensional Region

SPAN
Universal: SPAN Process
Relation: is_a
Occurrent
  Processual Entity
    Process
    Process Boundary
    Process Aggregate
    Fiat Process Part
    Processual Context
  Spatiotemporal Region
    Scattered Spatiotemporal Region
    Connected Spatiotemporal Region
      Spatiotemporal Instant
      Spatiotemporal Interval
  Temporal Region
    Scattered Temporal Region
    Connected Temporal Region
      Temporal Instant
      Temporal Interval

SPAN Example:
Occurrent
  Processual Entity
    Process: ECG (EKG) Test
    Process Boundary: Start/End of ECG
    Process Aggregate: All ECGs in Clinic
    Fiat Process Part: 2nd Lead Attached
    Processual Context: Test Context
  Spatiotemporal Region
    Scattered Spatiotemporal Region
    Connected Spatiotemporal Region
      Spatiotemporal Instant: S/T ECG Began
      Spatiotemporal Interval: S/T Region of ECG
  Temporal Region
    Scattered Temporal Region
    Connected Temporal Region
      Temporal Instant: Moment ECG Began
      Temporal Interval: Time Occupied

BFO RESOURCES
IFOMIS BFO Website: http://www.ifomis.uni-saarland.de/bfo/
Barry Smith’s Website: http://ontology.buffalo.edu/smith/
Barry Smith’s Articles: e.g., http://ontology.buffalo.edu/smith/articles/SNAP_SPAN.pdf

CONCRETE STEPS:
(1) Explicitly demarcate the entities of domain ontology
(2) Determine the universals and relations in domain
(3) Concretize information in a representational artifact
(4) Regiment the information to ensure:
    a) logical, philosophical, and scientific coherence
    b) compatibility with other relevant ontologies
    c) human intelligibility
(5) Formalize in a computer tractable language
(6) Implement in some specific computing context
A LOT OF THIS ACTUALLY HAPPENS AT MEETINGS…

The Countless Cs of Computational Categorization: From Cognizance To Coordination To Comfort
- Cognizance of Informatics Problems
- Cooperation of Researchers, Doctors…
- Conferences, Colloquia, Meetings…
- Clarity of Terms and Relations
- Cogency: Counter-Example Free?
- Coherency of Domain Ontologies
- Coordination of Domain Ontologies
- Computational Tractability
- Communicability of Information
- Coding of Information Correctly
- Convenience of Accessibility to Information
- Care of Humans/Animals (First, Do No Harm)
- Comfort of Humans/Animals

COORDINATION OF DOMAIN ONTOLOGIES

Part III: The Vision and Mission of the Ontology Research Group (ORG)

THE ONTOLOGY RESEARCH GROUP (ORG): www.org.buffalo.edu
The ORG currently has three sub-units:
(1) The Ontology, Logic and Technology Unit (OLT) is engaged in foundational ontology research and content development, especially in the biomedical domain.
(2) The Referent Tracking Unit (RTU) carries out applied research and software development pertaining to electronic health records and other data resources in the biomedical domain.
(3) The Qualitative Spatiotemporal Reasoning Unit (QSR) is applying ontological techniques derived from qualitative spatiotemporal reasoning and the field of Geographic Information Systems in order to improve the representation of canonical anatomy, as well as the processing of X-ray, MRI, and other forms of image and signal data.
[Figure: ORG and its three sub-units: OLT, RTU, QSR]

The VISION of the ORG is to assist scientific researchers, especially biomedical researchers, in providing a single, cumulative, and algorithmically processable database of information in their respective scientific domains.

The MISSION of the ORG is to realize this vision by supporting researchers in the creation and application of high-quality domain ontologies that enable efficient translational research and optimal clinical care.

CONCRETELY, THIS MEANS:
- ORG researchers are playing leading roles in a number of national and international ontology research consortia, and they have organized a wide variety of ontology training and dissemination events.
- The ORG is constantly involved in organizing and participating in workshops, conferences, colloquia, and other events all around the world. See: http://org.buffalo.edu/rarp/Presentations.html
...AS WELL AS PROFFERING OF BFO

COORDINATION OF DOMAIN ONTOLOGIES

A Few ORG Collaborators:
- NCBO http://bioontology.org/
- NCOR http://ncor.us/
- ECOR http://www.ecor.uni-saarland.de/home.html
- OBO Foundry Project http://obofoundry.org/
- UB http://philosophy.buffalo.edu/contrib/graduate/areas_of_study/phd.shtml
- IFOMIS http://www.ifomis.uni-saarland.de/
- RIDE http://www.srdc.metu.edu.tr/webpage/projects/ride/
- Industrial Collaborations
    Medtuity, Inc. http://www.org.buffalo.edu/RTU/indcollabs.html
    Sigmund Software http://www.sigmundsoftware.com/

Thank You
Robert Arp, Ph.D.
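As a closing illustration of step (5) of the concrete steps in Part II, "Formalize in a computer tractable language," the SNAP continuant hierarchy and a few of the heart examples from the slides can be written down as a nested structure that a program can walk. This is only a sketch: the nested-dictionary encoding and the function names are assumptions chosen for brevity, and an actual BFO release is distributed in formats such as OWL rather than as Python data.

```python
# SNAP hierarchy from Part II; nesting stands for the is_a relation.
BFO_SNAP = {
    "Continuant": {
        "Independent Continuant": {
            "Object": {},
            "Object Boundary": {},
            "Object Aggregate": {},
            "Fiat Object Part": {},
            "Site": {},
        },
        "Dependent Continuant": {
            "Generically Dependent Continuant": {},
            "Specifically Dependent Continuant": {
                "Quality": {},
                "Realizable Entity": {
                    "Disposition": {},
                    "Function": {},
                    "Role": {},
                },
            },
        },
        "Spatial Region": {
            "Zero Dimensional Region": {},
            "One Dimensional Region": {},
            "Two Dimensional Region": {},
            "Three Dimensional Region": {},
        },
    }
}

# A few of the heart examples from the SNAP example slide.
EXAMPLES = {
    "Human Heart": "Object",
    "Pumps Blood": "Function",
    "Chest Cavity": "Site",
}


def path_to(name, tree, trail=()):
    """Return the tuple of labels from the root down to `name`, or None."""
    for label, children in tree.items():
        here = trail + (label,)
        if label == name:
            return here
        found = path_to(name, children, here)
        if found:
            return found
    return None


def classify(example):
    """Spell out the is_a chain for one of the slide examples."""
    return " is_a ".join(reversed(path_to(EXAMPLES[example], BFO_SNAP)))


print(classify("Pumps Blood"))
```

Running the last line spells out the full chain from Function up to Continuant, which is the kind of inference a shared upper-level hierarchy makes available to every domain ontology built on it.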