NichesOxfordAug07

Introduction to Ontologies
for Environmental Biology
Barry Smith
http://ontology.buffalo.edu/smith
Finnegans Web
concept
type
class
instance
model
representation
data
process
property
Disciplines here involved
GIS
Ecology
Environmental biology
Various -omics disciplines
Bioinformatics
Medical Informatics
Database science
Semantic webists
...
Part 1: What is an Ontology?
what cellular component?
what molecular function?
what biological process?
natural language labels
designed for use in annotations
to make the data cognitively
accessible to human beings
and algorithmically tractable
to computers
compare: legends for maps
common legends allow (cross-border) integration
ontologies are legends for data
compare: legends for diagrams
Ramirez et al.
Linking of Digital Images to Phylogenetic Data Matrices Using a
Morphological Ontology
Syst. Biol. 56(2):283–294, 2007
computationally tractable legends
help integrate complex representations
of reality
help human beings find things in
complex representations of reality
help computers reason with complex
representations of reality
ontologies are used to annotate data
but there are two kinds of annotations
names of types
names of instances
A basic distinction
type vs. instance
science text vs. diary
human being vs. Michael Ashburner
Catalog vs. inventory
A  515287  DC3300 Dust Collector Fan
B  521683  Gilmer Belt
C  521682  Motor Drive Belt
Ontology
types
Instances
An ontology is a collection of
standardized names for types
We learn about types in reality from looking
at the results of scientific experiments
captured in the form of scientific theories
Ontologies provide the terminological
scaffolding of scientific theories
experiments relate to what is particular
science describes what is general
thing
types
organism
animal
mammal
cat
siamese
frog
instances
types vs. their extensions
type
{a,b,c,...}
class of instances
= a collection of particulars
Extension =def
The extension of a type A is the class of
instances of A
(the class of all entities to which the term ‘A’
applies)
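The definition of extension can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the type names come from the example hierarchy shown earlier, the exact is_a links between them are assumed, and the individual names (Tibbles, Felix, Kermit) are hypothetical.

```python
# Assumed is_a links between the types named on the earlier slide
# (child type -> parent type). The exact tree shape is an assumption.
IS_A = {
    "organism": "thing",
    "animal": "organism",
    "mammal": "animal",
    "cat": "mammal",
    "siamese": "cat",
    "frog": "animal",
}

# Hypothetical instances, each mapped to the most specific type it instantiates.
INSTANCE_OF = {
    "Tibbles": "siamese",
    "Felix": "cat",
    "Kermit": "frog",
}


def ancestors(type_name):
    """Return the type itself followed by every type reachable via is_a."""
    chain = [type_name]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    return chain


def extension(type_name):
    """Return the class (a Python set) of all instances of the given type."""
    return {i for i, t in INSTANCE_OF.items() if type_name in ancestors(t)}


print(sorted(extension("cat")))
print(sorted(extension("animal")))
```

The type itself lives in the `IS_A` hierarchy; its extension is computed from whatever instances happen to be recorded, which is why the two are kept apart.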
types vs. classes
types
{c,d,e,...}
classes
types vs. classes
types
extensions
~ defined classes
Defined class =def
member of Abba aged > 50 years
pizza with > 4 different toppings
red wine to serve with fish
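One way to read the examples above is that a defined class is picked out by a membership criterion rather than by a type in reality. The sketch below illustrates this for "pizza with > 4 different toppings"; the `Pizza` record, the `defined_class` helper and the menu entries are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pizza:
    name: str
    toppings: frozenset


def has_more_than_four_toppings(pizza):
    """Membership criterion for the defined class from the slide."""
    return len(pizza.toppings) > 4


def defined_class(instances, criterion):
    """Collect exactly those instances that satisfy the criterion."""
    return {x for x in instances if criterion(x)}


# Hypothetical instances.
MENU = [
    Pizza("margherita", frozenset({"tomato", "mozzarella", "basil"})),
    Pizza("everything", frozenset(
        {"tomato", "mozzarella", "ham", "olive", "pepper", "onion"})),
]

members = defined_class(MENU, has_more_than_four_toppings)
print(sorted(p.name for p in members))
```

Changing the criterion changes the class, while the underlying type (pizza) stays the same.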
Part 2: The OBO Foundry
what cellular component?
what molecular function?
what biological process?
The Gene Ontology
Five bangs for your GO buck
1. based in biological science
2. cross-species data comparability (human,
mouse, yeast, fly ...)
3. cross-granularity data integration
(molecule, cell, organ, organism)
4. cumulation of scientific knowledge in
algorithmically tractable form
5. links people to software
6. part of Open Biomedical Ontologies (OBO)
Entry point for creation of web-accessible biomedical data
GO initially low-tech to encourage users
Simple (web-service-based) tools
created to support the work of biologists
in creating annotations (data entry)
OBO to OWL DL converters now
making OBO Foundry annotated data
immediately accessible to Semantic
Web data integration projects
The OBO Foundry
A suite of high quality interoperable
reference ontologies to serve the
annotation of biomedical data
providing guidelines for those who need to
create new ontology resources
http://obofoundry.org
Columns (RELATION TO TIME): CONTINUANT (INDEPENDENT, DEPENDENT); OCCURRENT
Rows (GRANULARITY): ORGAN AND ORGANISM; CELL AND CELLULAR COMPONENT; MOLECULE
CONTINUANT, INDEPENDENT:
  Organism (NCBI Taxonomy)
  Anatomical Entity (FMA, CARO)
  Cell (CL)
  Cellular Component (FMA, GO)
  Molecule (ChEBI, SO, RnaO, PrO)
CONTINUANT, DEPENDENT:
  Organ Function (FMP, CPRO)
  Cellular Function (GO)
  Molecular Function (GO)
  Phenotypic Quality (PaTO)
OCCURRENT:
  Biological Process (GO)
  Molecular Process (GO)
The OBO Foundry building out from the original GO
Simple guidelines
• use singular nouns
• distinguish continuants from occurrents
• distinguish things from their qualities
• distinguish types from their instances
• do not use the weasel word ‘concept’
CRITERIA

OPENNESS: The ontology is open and available to be
used by all.

FORMAL LANGUAGE: The ontology is in, or can be
instantiated in, a common formal language.

ORTHOGONALITY: The developers of the ontology
agree in advance to collaborate with developers of other
OBO Foundry ontologies where domains overlap.

CONVERGENCE: The developers agree to work
towards a single ontology for each domain.
CRITERIA

UPDATE: The developers of each ontology commit to its
maintenance in light of scientific advance, and to
soliciting community feedback for its improvement.

IDENTIFIERS: The ontology possesses a unique
identifier space within OBO.

VERSIONING: The ontology provider has procedures for
identifying distinct successive versions.

DEFINITIONS: The ontology includes textual definitions
for all terms.
CRITERIA

CLEARLY BOUNDED: The ontology has a clearly
specified and clearly delineated content.

DOCUMENTATION: The ontology is well-documented.

USERS: The ontology has a plurality of independent
users.

COMMON ARCHITECTURE: The ontology uses
relations which are unambiguously defined following the
pattern of definitions laid down in the OBO Relation
Ontology.
Foundry ontologies all work in the
same way
all are built to represent the types existing in a preexisting domain and the relations between these
types in a way which can support reasoning
– we have data
– we need to make this data available for semantic
search and algorithmic processing
– we create a consensus-based ontology for annotating
the data
– and ensure that it can interoperate with Foundry
ontologies for neighboring domains
Formal-Ontological Relations
is_a
part_of
located_at
depends_on
is_boundary_of
adjacent_to
To support integration of ontologies
relational expressions such as
is_a
part_of
...
should be used in the same way in all
ontologies involved
to define these relations properly
we need to take account of both types
and instances in reality
Kinds of relations
<instance, type>: Toronto instance_of city
<instance, instance>: Toronto part_of Ontario
<type, type>: waterfall part_of river
is_a
human is_a mammal
all instances of the type human are as a
matter of necessity instances of the type
mammal
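The difference between relations that hold between instances, between types, or between an instance and a type can be made explicit by recording which kind of entity each term denotes. The following is a minimal sketch using the slide examples; the `kind_of` and `signature` helpers are hypothetical names introduced only for illustration.

```python
# Terms taken from the slide examples, sorted by what they denote.
TYPES = {"city", "waterfall", "river", "human", "mammal"}
INSTANCES = {"Toronto", "Ontario"}


def kind_of(term):
    """Say whether a term names a type or an instance."""
    if term in TYPES:
        return "type"
    if term in INSTANCES:
        return "instance"
    raise KeyError(f"unknown term: {term}")


def signature(subject, obj):
    """Return the pair of kinds that a relation assertion connects."""
    return (kind_of(subject), kind_of(obj))


ASSERTIONS = [
    ("Toronto", "instance_of", "city"),
    ("Toronto", "part_of", "Ontario"),
    ("waterfall", "part_of", "river"),
    ("human", "is_a", "mammal"),
]

for subject, relation, obj in ASSERTIONS:
    print(subject, relation, obj, signature(subject, obj))
```

Note that `part_of` appears with two different signatures, which is why a shared definition has to say how the type-level reading relates to the instance-level one.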
Ontology: Cell Ontology (CL)
Scope: cell types from prokaryotes to mammals
URL: obo.sourceforge.net/cgi-bin/detail.cgi?cell
Custodians: Jonathan Bard, Michael Ashburner, Oliver Hofman

Ontology: Chemical Entities of Biological Interest (ChEBI)
Scope: molecular entities
URL: ebi.ac.uk/chebi
Custodians: Paula Dematos, Rafael Alcantara

Ontology: Common Anatomy Reference Ontology (CARO)
Scope: anatomical structures in human and model organisms
URL: (under development)
Custodians: Melissa Haendel, Terry Hayamizu, Cornelius Rosse, David Sutherland

Ontology: Foundational Model of Anatomy (FMA)
Scope: structure of the human body
URL: fma.biostr.washington.edu
Custodians: JLV Mejino Jr., Cornelius Rosse

Ontology: Functional Genomics Investigation Ontology (FuGO)
Scope: design, protocol, data instrumentation, and analysis
URL: fugo.sf.net
Custodians: FuGO Working Group

Ontology: Gene Ontology (GO)
Scope: cellular components, molecular functions, biological processes
URL: www.geneontology.org
Custodians: Gene Ontology Consortium

Ontology: Phenotypic Quality Ontology (PaTO)
Scope: qualities of biomedical entities
URL: obo.sourceforge.net/cgi-bin/detail.cgi?attribute_and_value
Custodians: Michael Ashburner, Suzanna Lewis, Georgios Gkoutos

Ontology: Protein Ontology (PrO)
Scope: protein types and modifications
URL: (under development)
Custodians: Protein Ontology Consortium

Ontology: Relation Ontology (RO)
Scope: relations
URL: obo.sf.net/relationship
Custodians: Barry Smith, Chris Mungall

Ontology: RNA Ontology (RnaO)
Scope: three-dimensional RNA structures
URL: (under development)
Custodians: RNA Ontology Consortium

Ontology: Sequence Ontology (SO)
Scope: properties and features of nucleic sequences
URL: song.sf.net
Custodians: Karen Eilbeck
Foundational Model of Anatomy
[Diagram: fragment of the FMA type hierarchy. Nodes shown: Anatomical Structure, Anatomical Space, Organ Cavity Subdivision, Organ Cavity, Organ, Serous Sac Cavity Subdivision, Serous Sac Cavity, Serous Sac, Organ Part, Organ Component, Organ Subdivision, Tissue, Pleural Sac, Pleural Cavity, Parietal Pleura, Interlobar recess, Mediastinal Pleura, Pleura (Wall of Sac), Visceral Pleura, Mesothelium of Pleura]
Mature OBO Foundry ontologies
now undergoing reform
Cell Ontology (CL)
Chemical Entities of Biological Interest (ChEBI)
Foundational Model of Anatomy (FMA)
Gene Ontology (GO)
Phenotypic Quality Ontology (PaTO)
Relation Ontology (RO)
Sequence Ontology (SO)
Ontologies being built to satisfy Foundry
principles ab initio
Ontology for Clinical Investigations (OCI)
Common Anatomy Reference Ontology
(CARO)
Ontology for Biomedical Investigations (OBI)
Protein Ontology (PRO)
RNA Ontology (RnaO)
Subcellular Anatomy Ontology (SAO)
Ontologies in planning phase
Biobank/Biorepository Ontology (BrO, part of OBI)
Environment Ontology (EnvO)
Immunology Ontology (ImmunO)
Infectious Disease Ontology (IDO)
Mouse Adult Neurogenesis Ontology (MANGO)
OBO Foundry Success Story
Model organism research seeks results valuable for
the understanding of human disease.
This requires the ability to make reliable cross-species comparisons, and for this anatomy is crucial.
But different MOD communities have developed their
anatomy ontologies in uncoordinated fashion.
Ontologies facilitate grouping of annotations
Annotations per term: brain 20, hindbrain 15, rhombomere 10
Query brain without ontology: 20
Query brain with ontology: 45
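The difference between the two query results can be reproduced with a small sketch. The annotation counts are those on the slide; the part_of links between rhombomere, hindbrain and brain are an assumption implied by the figure, and the function names are hypothetical.

```python
# Annotation counts as given on the slide.
ANNOTATIONS = {"brain": 20, "hindbrain": 15, "rhombomere": 10}

# Assumed part_of links (part -> whole); implied by the slide, not stated.
PART_OF = {"hindbrain": "brain", "rhombomere": "hindbrain"}


def query_without_ontology(term):
    """Count only the annotations made directly to the term."""
    return ANNOTATIONS.get(term, 0)


def query_with_ontology(term):
    """Count annotations to the term and, recursively, to all of its parts."""
    total = ANNOTATIONS.get(term, 0)
    for part, whole in PART_OF.items():
        if whole == term:
            total += query_with_ontology(part)
    return total


print(query_without_ontology("brain"))  # 20
print(query_with_ontology("brain"))     # 45
```

The ontology contributes nothing but the `PART_OF` links, yet those links are what allow annotations made at a finer grain to be found by a query at a coarser grain.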
CARO – Common Anatomy
Reference Ontology
for the first time provides guidelines for model
organism researchers who wish to achieve
comparability of annotations
for the first time provides guidelines for those
new to ontology work
See Haendel et al., “CARO: The Common Anatomy Reference Ontology”,
in: Burger (ed.), Anatomy Ontologies for Bioinformatics, Springer, in press.
CARO-conformant ontologies
already in development:
Fish Multi-Species Anatomy Ontology (NSF funding
received)
Ixodidae and Argasidae (Tick) Anatomy Ontology
Mosquito Anatomy Ontology (MAO)
Spider Anatomy Ontology
Xenopus Anatomy Ontology (XAO)
undergoing reform: Drosophila and Zebrafish
Anatomy Ontologies
Part 3
The Hole Story
The Ontology of Environments
Initial hypothesis:
Environments are holes
environment
place
site
niche
habitat
setting
hole
spatial region
interior
location
Places are holes
[OBO Foundry coverage table repeated: ontologies arranged by relation to time and granularity]
No place for environments
A Neglected Major Category in
Ontologies thus far
Things (e.g. organisms)
Qualities / Features
Functions
Processes
Environments = that into which
organisms (etc.) fit
Columns (RELATION TO TIME): CONTINUANT (INDEPENDENT, DEPENDENT); OCCURRENT
Rows (GRANULARITY): ORGAN AND ORGANISM; CELL AND CELLULAR COMPONENT; MOLECULE
Marked on the table: environments are here
CONTINUANT, INDEPENDENT:
  Organism (NCBI Taxonomy)
  Anatomical Entity (FMA, CARO)
  Cell (CL)
  Cellular Component (FMA, GO)
  Molecule (ChEBI, SO, RnaO, PrO)
CONTINUANT, DEPENDENT:
  Organ Function (FMP, CPRO)
  Cellular Function (GO)
  Molecular Function (GO)
  Phenotypic Quality (PaTO)
OCCURRENT:
  Biological Process (GO)
  Molecular Process (GO)
Environments are holes in which
organisms, cells, molecules ... can live
Environments are holes
Double Hole Structure of the
Occupied Niche
Retainer
(a boundary of some
surrounding structure)
Medium
(filling the environing hole)
Tenant
(occupying the central hole)
Tenant, medium and retainer
the medium of the bear’s niche is a
circumscribed body of air
medium might be body of water, cytosol,
nasal mucosa, epithelium, endocardium,
synovial tissue ...
The Empty Niche
Fiat boundary
Physical boundary
Two Types of Boundary
Fiat boundary
Physical boundary
Positive and negative parts
positive part (made of matter)
negative part or hole (not made of matter)
Four Basic Niche Types
(Niche as generalized hole)
1
2
3
4
1: a womb; an egg; a house (better: the interior thereof)
2: a snail’s shell;
3: the niche of a pasturing cow;
4: the niche around a circling buzzard (fiat boundary)
Types of relations for EnvO
in
on (surface of)
surrounds
lives_in
attaches to
realizes
occupies (spatial region)
...
Lexical Semantics
the fruit is in the bowl
the bird is in the nest
the lion is in the cage
the pencil is in the cup
the fish is in the river
the river is in the valley
the water is in the lake
the car is in the garage
the fetus is in the cavity in the uterine lining
the colony of whooping cranes is in its breeding grounds
Double Hole Structure
Retainer
(a boundary of some
surrounding structure)
Medium
(filling the environing hole)
Tenant
(occupying the central hole)
when a tenant leaves its niche the gap
left by the tenant is filled immediately
by the surrounding medium
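The double hole structure can be modeled as a small record with three slots. This is only an illustrative sketch: the `Niche` class and its method names are hypothetical, and the retainer chosen for the bear example ("walls of the den") is an assumption, since the slides name only the medium.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Niche:
    retainer: str           # a boundary of some surrounding structure
    medium: str             # what fills the environing hole
    tenant: Optional[str]   # occupant of the central hole; None if empty

    def central_hole_filled_by(self):
        """The tenant if present; otherwise the surrounding medium."""
        return self.tenant if self.tenant is not None else self.medium

    def vacate(self):
        """Remove the tenant; the medium then fills the gap it leaves."""
        departed, self.tenant = self.tenant, None
        return departed


bear_niche = Niche(
    retainer="walls of the den",          # assumed for illustration
    medium="circumscribed body of air",   # from the slide
    tenant="bear",
)
bear_niche.vacate()
print(bear_niche.central_hole_filled_by())
```

An empty niche in this model is simply a `Niche` whose `tenant` is `None`; retainer and medium remain in place.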
A hole in the ground
Solid physical boundaries at the floor
and walls
but with a fiat lid:
hole
Part 4: Not every hole is an
environment
An environment is a special kind
of (generalized) hole
but what kind?
Elton – niche as role
the ‘niche’ of an animal means
its place in the biotic environment, its
relations to food and enemies. [...]
When an ecologist says ‘there goes a badger’
he should include in his thoughts some
definite idea of the animal’s place in the
community to which it belongs,
just as if he had said ‘there goes the vicar’
(Elton 1927, pp. 63f.)
G.E. Hutchinson: niche as volume
in a functionally defined space
the niche = an n-dimensional hypervolume whose dimensions correspond to
resource gradients over which species are
distributed
G.E. Hutchinson (1957, 1965)
Hypervolume niche = a location
in an attribute space
defined by a specific constellation of
environmental variables such as degree of
slope, exposure to sunlight, soil fertility,
foliage density, salinity...
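A location in an attribute space can be illustrated with a deliberately simplified model in which the hypervolume is an axis-aligned box, that is, one tolerated range per environmental variable. Hutchinson's hypervolume need not be box-shaped; the variable names, units and numeric ranges below are hypothetical.

```python
# Hypothetical tolerated range (low, high) for each environmental variable.
NICHE = {
    "slope_degrees": (0.0, 15.0),
    "sunlight_hours": (4.0, 10.0),
    "salinity_ppt": (0.0, 5.0),
}


def within_niche(niche, site):
    """True if the site's value on every dimension lies within the tolerated range."""
    return all(low <= site[var] <= high for var, (low, high) in niche.items())


# Hypothetical measurements at two sites.
meadow = {"slope_degrees": 5.0, "sunlight_hours": 6.0, "salinity_ppt": 1.0}
salt_flat = {"slope_degrees": 1.0, "sunlight_hours": 9.0, "salinity_ppt": 30.0}

print(within_niche(NICHE, meadow))
print(within_niche(NICHE, salt_flat))
```

Nothing in this model says where the meadow is; it records only the values of its features, which is the sense in which the hypervolume niche is functional rather than spatial.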
Niche Construction
Lewontin: niches normally arise in symbiosis
with the activities of organisms or groups of
organisms (“ecosystem engineering”);
they are not already there, like vacant rooms in a
gigantic evolutionary hotel, awaiting organisms
who would evolve into them. (The Triple Helix:
Gene, Organism, and Environment)
Part Last: Bringing Together the
Spatial and Functional Approaches
to Environment Ontology
The environment is not a location in an
attribute space, but it must have features
which have such a location
Every environment must have
some spatial location
The functional niche presupposes the
spatial-structural niche
Ontology of environment + ontology of
associated environmental features
J. J. Gibson’s Ecological
Psychology
The terrestrial environment is [best]
described in terms of a medium,
substances, and the surfaces that
separate them. (Gibson 1979, p. 16)
Gibson’s theory of surface layout
‘a sort of applied geometry that is
appropriate for the study of perception and
behavior’ (1979, p. 33)
ground, open environment, enclosure,
detached object, attached object, hollow
object, place, sheet, fissure, stick, fiber,
dihedral, etc.
Gibson’s theory of surface layout
as an anatomy of environments
• systems of barriers, doors, pathways to
which the behavior of organisms is
specifically attuned,
• temperature gradients, patterns of
movement of air or water molecules
• water holes, food sources (features)
• apertures (mouths, sphincters ...)
Two sets of issues
Environments, as spatial structures, and
their parts
Environmental attributes (qualities,
functions), determining multidimensional
loci à la Hutchinson
Aim
To define structural properties such as:
open, closed,
connected, compact,
spatial coincidence,
integrity,
aggregate,
boundary
RCC (Region Connection Calculus) plus
extensions
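To give a feel for the kind of relations RCC provides, the sketch below computes the eight base relations of RCC8 (DC, EC, PO, TPP, NTPP, TPPi, NTPPi, EQ) for the special case where regions are closed discs in the plane. Restricting regions to discs, the `Disc` class and the tolerance parameter are all assumptions made for illustration; RCC itself is defined for arbitrary regions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Disc:
    x: float
    y: float
    r: float


def rcc8(a, b, tol=1e-9):
    """Return the RCC8 base relation holding from disc a to disc b."""
    d = math.hypot(a.x - b.x, a.y - b.y)  # distance between centres
    if d <= tol and abs(a.r - b.r) <= tol:
        return "EQ"     # identical regions
    if d > a.r + b.r + tol:
        return "DC"     # disconnected
    if abs(d - (a.r + b.r)) <= tol:
        return "EC"     # externally connected: boundaries touch
    if d + a.r < b.r - tol:
        return "NTPP"   # a strictly inside b
    if abs(d + a.r - b.r) <= tol:
        return "TPP"    # a inside b, touching b's boundary
    if d + b.r < a.r - tol:
        return "NTPPi"  # b strictly inside a
    if abs(d + b.r - a.r) <= tol:
        return "TPPi"   # b inside a, touching a's boundary
    return "PO"         # partial overlap


print(rcc8(Disc(0, 0, 1), Disc(3, 0, 1)))
print(rcc8(Disc(0, 0, 1), Disc(0, 0, 3)))
```

A tenant strictly inside its niche would stand in NTPP to it, while an organism resting against the retainer would stand in TPP.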
Ecological Niche Concepts
niche as particular place or subdivision of an
environment that an organism or
population occupies
vs.
niche as function of an organism or
population within an ecological community
Next steps
Our data needs are to link niche features
with geo-locations
Scale: From geographic to
microbiological
From locations of organisms/samples,
sources of museum artifacts ...
to organism interactions, e.g. on bacterial
infection – how the interior of one
organism or organism part serves as
environment for another organism
Hosts for bacterial infection
(interior of) lung
blood (bacteremia)
erythrocyte - plasmodium inhabits red blood
cells
hepatocyte – plasmodium infects liver cells
macrophage
gut and oral mucosa, nasal mucosa, vaginal
mucosa
kidney
bladder
portion of epithelial tissue
C: bacteria (arrows) adhering to and
penetrating the epithelial cells (×3,000)
D: abscess (Ab) formation in subepithelial
region with a colony of bacteria (arrows)
and a red blood cell (RBC) in it (×2,000)
[OBO Foundry coverage table repeated: ontologies arranged by relation to time and granularity]
Environments, environment parts (features),
environment qualities
Ontologies needed
Environment -- Taxonomy
place, habitat, city, farm, building (interior), oral cavity,
uterine cavity, gut ...
Environment part – Anatomy of
environments (Surface, conduit, entry ...)
city wall, uterine wall, water source, ...
Environment function
protection, supply of food,...
Environment quality – (Phenotypes)
ambient temperature, salinity, ...