Introduction to Ontologies for Environmental Biology
Barry Smith
http://ontology.buffalo.edu/smith

Finnegans Web
concept, type, class, instance, model, representation, data, process, property

Disciplines here involved
• GIS
• Ecology
• Environmental biology
• Various -omics disciplines
• Bioinformatics
• Medical Informatics
• Database science
• Semantic webists
• ...

Part 1: What is an Ontology?

what cellular component?
what molecular function?
what biological process?

natural language labels designed for use in annotations, to make the data cognitively accessible to human beings and algorithmically tractable to computers

compare: legends for maps
common legends allow (cross-border) integration

ontologies are legends for data

compare: legends for diagrams
Ramirez et al., Linking of Digital Images to Phylogenetic Data Matrices Using a Morphological Ontology, Syst. Biol. 56(2):283–294, 2007

computationally tractable legends
• help integrate complex representations of reality
• help human beings find things in complex representations of reality
• help computers reason with complex representations of reality

ontologies are used to annotate data, but there are two kinds of annotations:
• names of types
• names of instances

A basic distinction: type vs. instance
• science text vs. diary
• human being vs. Michael Ashburner

Catalog vs. inventory
[Figure: parts catalog with items A, B, C (515287, 521683, 521682): DC3300 Dust Collector Fan, Gilmer Belt, Motor Drive Belt]

[Diagram: Ontology (types) above Instances]

An ontology is a collection of standardized names for types
• We learn about types in reality from looking at the results of scientific experiments captured in the form of scientific theories
• Ontologies provide the terminological scaffolding of scientific theories
• experiments relate to what is particular
• science describes what is general

[Diagram: types (thing, organism, animal, mammal, cat, siamese, frog) above their instances]

types vs. their extensions
type: {a,b,c,...}
class of instances = a collection of particulars

Extension =def The extension of a type A is the class of instances of A (the class of all entities to which the term ‘A’ applies)

types vs. classes
[Diagram labels: types; classes such as {c,d,e,...}]

types vs. classes
[Diagram labels: types; extensions; ~ defined classes]

Defined class, examples:
• member of Abba aged > 50 years
• pizza with > 4 different toppings
• red wine to serve with fish

Part 2: The OBO Foundry

what cellular component?
what molecular function?
what biological process?

The Gene Ontology
Five bangs for your GO buck
1. based in biological science
2. cross-species data comparability (human, mouse, yeast, fly ...)
3. cross-granularity data integration (molecule, cell, organ, organism)
4. cumulation of scientific knowledge in algorithmically tractable form
5. links people to software
6. part of Open Biomedical Ontologies (OBO)

Entry point for creation of web-accessible biomedical data
• GO initially low-tech to encourage users
• Simple (web-service-based) tools created to support the work of biologists in creating annotations (data entry)
• OBO to OWL DL converters now making OBO Foundry annotated data immediately accessible to Semantic Web data integration projects

The OBO Foundry
A suite of high quality interoperable reference ontologies to serve the annotation of biomedical data, providing guidelines for those who need to create new ontology resources
http://obofoundry.org

The OBO Foundry: building out from the original GO
Rows give GRANULARITY, columns give RELATION TO TIME.
Granularity | Continuant, independent | Continuant, dependent | Occurrent
Organ and organism | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO) | Biological Process (GO)
Cell and cellular component | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) |
Molecule | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)
The dependent continuant column also carries Phenotypic Quality (PaTO).

Simple guidelines
• use singular nouns
• distinguish continuants from occurrents
• distinguish things from their qualities
• distinguish types from their instances
• do not use the weasel word ‘concept’

CRITERIA (http://obofoundry.org/)
• OPENNESS: The ontology is open and available to be used by all.
• FORMAL LANGUAGE: The ontology is in, or can be instantiated in, a common formal language.
• ORTHOGONALITY: The developers of the ontology agree in advance to collaborate with developers of other OBO Foundry ontologies where domains overlap.
• CONVERGENCE: The developers agree to work towards a single ontology for each domain.
• UPDATE: The developers of each ontology commit to its maintenance in light of scientific advance, and to soliciting community feedback for its improvement.
• IDENTIFIERS: The ontology possesses a unique identifier space within OBO.
• VERSIONING: The ontology provider has procedures for identifying distinct successive versions.
• DEFINITIONS: The ontology includes textual definitions for all terms.
• CLEARLY BOUNDED: The ontology has a clearly specified and clearly delineated content.
• DOCUMENTATION: The ontology is well-documented.
• USERS: The ontology has a plurality of independent users.
• COMMON ARCHITECTURE: The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.
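
The guideline "distinguish types from their instances" and the Part 1 definition of extension can be made concrete in a few lines of Python. What follows is only a minimal sketch under stated assumptions: the class and function names are hypothetical, the instances "cat #1", "frog #1" and the two pizzas are invented for illustration, and placing frog directly under animal is an assumption (the slide shows the terms but the diagram itself is lost).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass(frozen=True)
class Type:
    """A type (what is general), linked to its parent type by is_a."""
    name: str
    parent: Optional["Type"] = None

    def is_a(self, other: "Type") -> bool:
        """True if this type is `other` or one of its descendants."""
        t: Optional[Type] = self
        while t is not None:
            if t == other:
                return True
            t = t.parent
        return False


@dataclass
class Instance:
    """A particular, asserted to be an instance of exactly one type."""
    name: str
    type: Type
    props: dict = field(default_factory=dict)


def extension(t: Type, instances: List[Instance]) -> Set[str]:
    """Names of all instances of type t (directly or via is_a)."""
    return {i.name for i in instances if i.type.is_a(t)}


def defined_class(pred: Callable[[Instance], bool],
                  instances: List[Instance]) -> Set[str]:
    """Names of all instances picked out by an arbitrary condition."""
    return {i.name for i in instances if pred(i)}


# Type hierarchy taken from the slide terms; the parent links are assumptions.
thing = Type("thing")
organism = Type("organism", thing)
animal = Type("animal", organism)
mammal = Type("mammal", animal)
cat = Type("cat", mammal)
siamese = Type("siamese", cat)
frog = Type("frog", animal)
human = Type("human being", mammal)
pizza = Type("pizza", thing)

# Hypothetical instances, apart from the one named on the slide.
instances = [
    Instance("Michael Ashburner", human),
    Instance("cat #1", siamese),
    Instance("frog #1", frog),
    Instance("pizza #1", pizza, {"toppings": 5}),
    Instance("pizza #2", pizza, {"toppings": 2}),
]

mammals = extension(mammal, instances)
many_toppings = defined_class(
    lambda i: i.type.is_a(pizza) and i.props.get("toppings", 0) > 4,
    instances,
)
```

Here `mammals` is the extension of a type, while `many_toppings` is a defined class: it is fixed by a condition ("pizza with > 4 different toppings") and not by a type in the hierarchy.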

Foundry ontologies all work in the same way
All are built to represent the types existing in a pre-existing domain and the relations between these types in a way which can support reasoning
• we have data
• we need to make this data available for semantic search and algorithmic processing
• we create a consensus-based ontology for annotating the data
• and ensure that it can interoperate with Foundry ontologies for neighboring domains

Formal-Ontological Relations
is_a, part_of, located_at, depends_on, is_boundary_of, adjacent_to

To support integration of ontologies, relational expressions such as is_a, part_of, ... should be used in the same way in all ontologies involved

To define these relations properly we need to take account of both types and instances in reality

Kinds of relations
• <instance, type>: Toronto instance_of city
• <instance, instance>: Toronto part_of Ontario
• <type, type>: waterfall part_of river

is_a
human is_a mammal: all instances of the type human are as a matter of necessity instances of the type mammal

Ontology | Scope | URL | Custodians
Cell Ontology (CL) | cell types from prokaryotes to mammals | obo.sourceforge.net/cgi-bin/detail.cgi?cell | Jonathan Bard, Michael Ashburner, Oliver Hofman
Chemical Entities of Biological Interest (ChEBI) | molecular entities | ebi.ac.uk/chebi | Paula Dematos, Rafael Alcantara
Common Anatomy Reference Ontology (CARO) | anatomical structures in human and model organisms | (under development) | Melissa Haendel, Terry Hayamizu, Cornelius Rosse, David Sutherland
Foundational Model of Anatomy (FMA) | structure of the human body | fma.biostr.washington.edu | JLV Mejino Jr., Cornelius Rosse
Functional Genomics Investigation Ontology (FuGO) | design, protocol, data instrumentation, and analysis | fugo.sf.net | FuGO Working Group
Gene Ontology (GO) | cellular components, molecular functions, biological processes | www.geneontology.org | Gene Ontology Consortium
Phenotypic Quality Ontology (PaTO) | qualities of biomedical entities | obo.sourceforge.net/cgi-bin/detail.cgi?attribute_and_value | Michael Ashburner, Suzanna Lewis, Georgios Gkoutos
Protein Ontology (PrO) | protein types and modifications | (under development) | Protein Ontology Consortium
Relation Ontology (RO) | relations | obo.sf.net/relationship | Barry Smith, Chris Mungall
RNA Ontology (RnaO) | three-dimensional RNA structures | (under development) | RNA Ontology Consortium
Sequence Ontology (SO) | properties and features of nucleic sequences | song.sf.net | Karen Eilbeck

Foundational Model of Anatomy
[Figure: fragment of the FMA hierarchy with the nodes Anatomical Structure, Anatomical Space, Organ, Organ Part, Organ Component, Organ Subdivision, Organ Cavity, Organ Cavity Subdivision, Serous Sac, Serous Sac Cavity, Serous Sac Cavity Subdivision, Pleural Sac, Pleural Cavity, Interlobar recess, Pleura (Wall of Sac), Parietal Pleura, Visceral Pleura, Mediastinal Pleura, Mesothelium of Pleura, Tissue]

Mature OBO Foundry ontologies now undergoing reform
• Cell Ontology (CL)
• Chemical Entities of Biological Interest (ChEBI)
• Foundational Model of Anatomy (FMA)
• Gene Ontology (GO)
• Phenotypic Quality Ontology (PaTO)
• Relation Ontology (RO)
• Sequence Ontology (SO)

Ontologies being built to satisfy Foundry principles ab initio
• Ontology for Clinical Investigations (OCI)
• Common Anatomy Reference Ontology (CARO)
• Ontology for Biomedical Investigations (OBI)
• Protein Ontology (PRO)
• RNA Ontology (RnaO)
• Subcellular Anatomy Ontology (SAO)

Ontologies in planning phase
• Biobank/Biorepository Ontology (BrO, part of OBI)
• Environment Ontology (EnvO)
• Immunology Ontology (ImmunO)
• Infectious Disease Ontology (IDO)
• Mouse Adult Neurogenesis Ontology (MANGO)

OBO Foundry Success Story
Model organism research seeks results valuable for the understanding of human disease. This requires the ability to make reliable cross-species comparisons, and for this anatomy is crucial. But different MOD communities have developed their anatomy ontologies in uncoordinated fashion.

Ontologies facilitate grouping of annotations
Annotations per term: brain 20, hindbrain 15, rhombomere 10
Query brain without ontology: 20
Query brain with ontology: 45

CARO: Common Anatomy Reference Ontology
• for the first time provides guidelines for model organism researchers who wish to achieve comparability of annotations
• for the first time provides guidelines for those new to ontology work
See Haendel et al., “CARO: The Common Anatomy Reference Ontology”, in: Burger (ed.), Anatomy Ontologies for Bioinformatics, Springer, in press.

CARO-conformant ontologies
already in development:
• Fish Multi-Species Anatomy Ontology (NSF funding received)
• Ixodidae and Argasidae (Tick) Anatomy Ontology
• Mosquito Anatomy Ontology (MAO)
• Spider Anatomy Ontology
• Xenopus Anatomy Ontology (XAO)
undergoing reform:
• Drosophila and Zebrafish Anatomy Ontologies

Part 3: The Hole Story
The Ontology of Environments

Initial hypothesis: Environments are holes

environment, place, site, niche, habitat, setting, hole, spatial region, interior, location

Places are holes

[Figure: the OBO Foundry matrix of granularity against relation to time, repeated]
No place for environments

A Neglected Major Category in Ontologies thus far
• Things (e.g. organisms)
• Qualities / Features
• Functions
• Processes
• Environments = that into which organisms (etc.) fit

[Figure: the same matrix, annotated ‘environments are here’]

Environments are holes in which organisms, cells, molecules ... can live

Environments are holes

Double Hole Structure of the Occupied Niche
• Retainer (a boundary of some surrounding structure)
• Medium (filling the environing hole)
• Tenant (occupying the central hole)

Tenant, medium and retainer
• the medium of the bear’s niche is a circumscribed body of air
• medium might be body of water, cytosol, nasal mucosa, epithelium, endocardium, synovial tissue ...

The Empty Niche
[Figure labels: Fiat boundary; Physical boundary]

Two Types of Boundary
• Fiat boundary
• Physical boundary

Positive and negative parts
• positive part (made of matter)
• negative part or hole (not made of matter)

Four Basic Niche Types (Niche as generalized hole)
1: a womb; an egg; a house (better: the interior thereof)
2: a snail’s shell
3: the niche of a pasturing cow
4: the niche around a circling buzzard (fiat boundary)

Types of relations for EnvO
in, on (surface of), surrounds, lives_in, attaches to, realizes, occupies (spatial region), ...
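
Whichever relations EnvO adopts, their value lies in the reasoning they support. The annotation-grouping example from Part 2 (brain 20, hindbrain 15, rhombomere 10) can be sketched as a query expanded along part_of. This is a minimal sketch, not the method of any OBO tool: the function names are hypothetical, and the two part_of links are an assumption, since the slide gives only the counts.

```python
from collections import defaultdict
from typing import Dict, Set

# Assumed part_of links (child -> parent); not stated on the slide.
PART_OF: Dict[str, str] = {"hindbrain": "brain", "rhombomere": "hindbrain"}

# Number of annotations attached directly to each term (from the slide).
ANNOTATIONS: Dict[str, int] = {"brain": 20, "hindbrain": 15, "rhombomere": 10}


def parts_of(term: str, part_of: Dict[str, str]) -> Set[str]:
    """Return `term` together with every term that is transitively part_of it."""
    children = defaultdict(list)
    for child, parent in part_of.items():
        children[parent].append(child)
    found: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(children[current])
    return found


def count_without_ontology(term: str) -> int:
    """Count only annotations made directly to the query term."""
    return ANNOTATIONS.get(term, 0)


def count_with_ontology(term: str) -> int:
    """Count annotations made to the query term or to any of its parts."""
    return sum(ANNOTATIONS.get(t, 0) for t in parts_of(term, PART_OF))


print(count_without_ontology("brain"))  # 20
print(count_with_ontology("brain"))     # 45
```

The same expansion would apply to a spatial relation such as lives_in or surrounds only if that relation is defined to be transitive, which is one reason the relations need unambiguous definitions.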

Lexical Semantics
• the fruit is in the bowl
• the bird is in the nest
• the lion is in the cage
• the pencil is in the cup
• the fish is in the river
• the river is in the valley
• the water is in the lake
• the car is in the garage
• the fetus is in the cavity in the uterine lining
• the colony of whooping crane is in its breeding grounds

Double Hole Structure
When a tenant leaves its niche, the gap left by the tenant is filled immediately by the surrounding medium

A hole in the ground
Solid physical boundaries at the floor and walls, but with a fiat lid

Part 4: Not every hole is an environment

An environment is a special kind of (generalized) hole, but what kind?

Elton: niche as role
the ‘niche’ of an animal means its place in the biotic environment, its relations to food and enemies. [...] When an ecologist says ‘there goes a badger’ he should include in his thoughts some definite idea of the animal’s place in the community to which it belongs, just as if he had said ‘there goes the vicar’ (Elton 1927, pp. 63f.)

G.E. Hutchinson: niche as volume in a functionally defined space
the niche = an n-dimensional hypervolume whose dimensions correspond to resource gradients over which species are distributed
G.E. Hutchinson (1957, 1965)

Hypervolume niche = a location in an attribute space defined by a specific constellation of environmental variables such as degree of slope, exposure to sunlight, soil fertility, foliage density, salinity ...

Niche Construction
Lewontin: niches normally arise in symbiosis with the activities of organisms or groups of organisms (“ecosystem engineering”); they are not already there, like vacant rooms in a gigantic evolutionary hotel, awaiting organisms who would evolve into them.
(The Triple Helix: Gene, Organism, and Environment)

Part Last: Bringing Together the Spatial and Functional Approaches to Environment Ontology

• The environment is not a location in an attribute space, but it must have features which have such a location
• Every environment must have some spatial location
• The functional niche presupposes the spatial-structural niche
• Ontology of environment + ontology of associated environmental features

J. J. Gibson’s Ecological Psychology
The terrestrial environment is [best] described in terms of a medium, substances, and the surfaces that separate them. (Gibson 1979, p. 16)

Gibson’s theory of surface layout
‘a sort of applied geometry that is appropriate for the study of perception and behavior’ (1979, p. 33)
ground, open environment, enclosure, detached object, attached object, hollow object, place, sheet, fissure, stick, fiber, dihedral, etc.

Gibson’s theory of surface layout as an anatomy of environments
• systems of barriers, doors, pathways to which the behavior of organisms is specifically attuned
• temperature gradients, patterns of movement of air or water molecules
• water holes, food sources (features)
• apertures (mouths, sphincters ...)

Two sets of issues
• Environments, as spatial structures, and their parts
• Environmental attributes (qualities, functions), determining multidimensional loci à la Hutchinson

Aim
To define structural properties such as: open, closed, connected, compact, spatial coincidence, integrity, aggregate, boundary
RCC (Region Connection Calculus) plus extensions

Ecological Niche Concepts
niche as particular place or subdivision of an environment that an organism or population occupies
vs.
niche as function of an organism or population within an ecological community

Next steps
Our data needs are to link niche features with geo-locations
Scale: from geographic to microbiological
From locations of organisms/samples, sources of museum artifacts ... to organism interactions, e.g. in bacterial infection: how the interior of one organism or organism part serves as environment for another organism

Hosts for bacterial infection
(interior of)
• lung
• blood (bacteremia)
• erythrocyte: plasmodium inhabits red blood cells
• hepatocyte: plasmodium infects liver cells
• macrophage
• gut and oral mucosa, nasal mucosa, vaginal mucosa
• kidney
• bladder

portion of epithelial tissue
[Figure, panel C: bacteria (arrows) adhering to and penetrating the epithelial cells (×3,000)]
[Figure, panel D: abscess (Ab) formation in subepithelial region with a colony of bacteria (arrows) and a red blood cell (RBC) in it (×2,000)]

[Figure: the OBO Foundry matrix of granularity against relation to time, repeated]
Environments, environment parts (features), environment qualities

Ontologies needed
• Environment (Taxonomy): place, habitat, city, farm, building (interior), oral cavity, uterine cavity, gut ...
• Environment part (Anatomy of environments: surface, conduit, entry ...): city wall, uterine wall, water source, ...
• Environment function: protection, supply of food, ...
• Environment quality (Phenotypes): ambient temperature, salinity, ...
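
The stated data need, linking niche features with geo-locations, together with Hutchinson's hypervolume, suggests what an annotated record might look like. The sketch below is an illustration only. All class and field names, variable names, units, coordinates and value ranges are hypothetical, and treating the hypervolume as one independent interval per variable (an axis-aligned box) is a simplifying assumption, not Hutchinson's general definition.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Hypervolume:
    """Functional niche: one closed interval per environmental variable."""
    ranges: Dict[str, Tuple[float, float]]

    def contains(self, qualities: Dict[str, float]) -> bool:
        """True if every variable is measured and lies within its interval."""
        return all(
            name in qualities and low <= qualities[name] <= high
            for name, (low, high) in self.ranges.items()
        )


@dataclass
class EnvironmentRecord:
    """Spatial niche: a located environment annotated with its qualities."""
    environment_type: str   # term from an environment taxonomy, e.g. "habitat"
    latitude: float         # geo-location of the sample or observation
    longitude: float
    qualities: Dict[str, float] = field(default_factory=dict)


# Hypothetical tolerance ranges for some species.
niche = Hypervolume({
    "ambient_temperature_c": (10.0, 25.0),
    "salinity_psu": (0.0, 5.0),
})

# Hypothetical sampled environments.
pond = EnvironmentRecord("habitat", 42.9, -78.8,
                         {"ambient_temperature_c": 18.0, "salinity_psu": 0.3})
sea = EnvironmentRecord("habitat", 41.5, -70.7,
                        {"ambient_temperature_c": 18.0, "salinity_psu": 35.0})

pond_fits = niche.contains(pond.qualities)  # True: both values are in range
sea_fits = niche.contains(sea.qualities)    # False: salinity is out of range
```

The record carries the spatial location and the environment type, while the hypervolume is evaluated only against the record's qualities. This mirrors the claim that the functional niche presupposes the spatial-structural niche.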