slides - Ontology Research Group

Ontology: Not Just for
Philosophers Anymore
Robert Arp, Ph.D.
- The Ontology Research Group (ORG)
www.org.buffalo.edu
- The National Center for Biomedical Ontology (NCBO)
www.bioontology.org
Special thanks to Barry Smith and Werner Ceusters for comments and material from articles, books, and
presentations.
This work was funded by the National Institutes of Health through the NIH Roadmap for Biomedical Research, Grant 1 U 54 HG004028.
Information on the National Centers for Biomedical Computing can be found at: http://nihroadmap.nih.gov/bioinformatics.
Three Parts to the Talk:
I: Meanings of ‘Ontology’
II: Basic Formal Ontology (BFO)
III: The Vision and Mission of the Ontology Research Group (ORG)
Part I: Meanings of ‘Ontology’
(1) Philosophical Ontology
“I can fit wholesale evolution and a creating god into my
ontology without contradiction.”
“Just because it has mental existence doesn’t mean it has
ontological existence.”
(2) Domain Ontology
“I’m working on an ontology for annelids.”
“The Gene Ontology has data on that HOX gene.”
(3) Formal Ontology
“This upper level ontology should help organize these
domains.”
“IEEE just came out with the latest version of SUO that
may solve some of these problems.”
(1) Philosophical Ontology
- Ontos (being, existence) + Logos (word, account, explanation)
- The study of what is, of the kinds and structures of
objects, properties, events, processes, and relations in
every area of reality.
- Theoretical discipline concerned with accurately
describing the taxonomy of all things that exist according
to underlying entities and principles that make things:
A) BE what they are.
B) BE KNOWN AS what they are.
- Synonymous with classical Metaphysics.
PORPHYRIAN TREE
THING
- MATERIAL SUBSTANCE
  - ANIMATE (Living) ENTITY
    - LIVING ENTITY WITH SENSATION (ANIMAL)
      - RATIONAL ANIMAL
        - HUMAN, E.G., Plato, Aristotle, Dr. Sucheston, Dr. Arp
      - NON-RATIONAL ANIMAL
    - LIVING ENTITY W/OUT SENSATION (VEGETATION)
  - NON-ANIMATE ENTITY
- IMMATERIAL SUBSTANCE
Cf. Linnean Taxonomy and The Periodic Table
To a certain extent, all
of us are Philosophical
Ontologists in that we
naturally and
automatically
categorize any and all
things in reality so as
to understand,
explain, control,
dominate, and
navigate reality.
Different Schools / Approaches
to Philosophical Ontology
• We can’t know reality because we can’t get beyond our sensations,
perceptions, and/or ideas of reality (Idealism)
• We can only know the theories, languages, concepts, or systems of
beliefs about reality, and reality is what minds make it (Antirealism)
• We can know reality “out there” as a world beyond our minds, and
reality, ultimately, is in no way (e)affected by our minds (Realism)
• Reality is one kind of thing: all mind (Mental Monism)
• Reality is one kind of thing: all matter (Material Monism)
• Reality is two kinds of things: mind and matter (Dualism)
• Only the Bible accurately depicts reality (Fundamentalism)
• Only science accurately depicts reality (Scientism)
(2) Domain Ontology
- Representation of the entities and relations existing within a
particular domain of reality such as medicine, geography,
ecology, or law, e.g., GO, FMA, EnvO.
- Opposed to ontology in the philosophical sense, which has all
of reality as its subject matter.
- Ideally, provides a controlled, structured vocabulary to
annotate data in order to make it more easily searchable by
human beings and processable by computers.
- Synonymous (for some) with ‘Reference Ontology.’
- ‘Task’ or ‘Application’ Ontology: runs, uses, exploits a domain
ontology.
AN ONTOLOGY (Ontology Research Group):
“a representational artifact, comprising a taxonomy as its
main part, whose representational units are intended to
designate some combination of universals, defined
classes, and certain relations between them.” *
E.G.,
The Gene Ontology (GO)
The Foundational Model of Anatomy Ontology (FMA)
The Environment Ontology (EnvO)
* Smith et al., “Towards a Reference Terminology for Ontology
Research and Development in the Biomedical Domain,” Proc KRMed
2006: http://ontology.buffalo.edu/bfo/Terminology_for_Ontologies.pdf
A REALISM-BASED ONTOLOGY:
“is built out of representational units which are
intended to refer exclusively to (real) universals, and
corresponds to that part of the content of a scientific
theory that is captured by its constituent general
terms and the interrelations between the universals
denoted by these terms.” * (again, ORG definition)
Contrasted with:
- Idealism-Based Ontology
- Antirealism-Based Ontology
* Smith et al., “Towards a Reference Terminology for Ontology Research
and Development in the Biomedical Domain,” Proc KRMed 2006:
http://ontology.buffalo.edu/bfo/Terminology_for_Ontologies.pdf
Informatics:
The science of information
collection, categorization,
management, storage,
processing, retrieval, and
dissemination.
(Arp’s rendition)
Bioinformatics:
“A discipline of quantitative analysis of
information relating to biological macromolecules with the aid of computers.”
Jin Xiong, Essential Bioinformatics (Cambridge University Press, 2006), 3.
“…developed in the space occupied with
mathematical and computational biology,
biometry and biostatistics, computer science,
cybernetics, molecular evolution, genomics and
proteomics, genetics, and molecular and cell
biology.”
Polanski and Kimmel, Bioinformatics (Verlag: Springer, 2007), 2-3.
Domain ontology is contrasted with:
- Database: stores data of ontology or whatever info.
- Rule-based Language (e.g., XSD): tells you how to store,
control, and describe an ontology or whatever info.
- Thesaurus: taxonomy coupled with relations
- Taxonomy: terms and glosses organized into subsumed
hierarchical relations
- Glossary: catalogue of glosses (translations) in a language
- Catalogue: set of terms with meanings
- Inventory: checklist of items, terms, entities
- Axiomatic Theory: formal system with clear rules and
semantics
However, it is arguable that an Ontology can be characterized
as a hybrid of a Taxonomy and an Axiomatic Theory.
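The "hybrid" idea can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the term names are a simplified slice of the Porphyrian tree, and the disjointness axiom and the function names (`lineage`, `violates_axioms`) are hypothetical, not part of any ORG tool.

```python
from typing import Dict, List, Optional, Set, Tuple

# Taxonomy part: each term points to its parent via is_a (None marks the root).
# Simplified, hypothetical slice of the Porphyrian tree.
PARENT: Dict[str, Optional[str]] = {
    "thing": None,
    "material substance": "thing",
    "animate entity": "material substance",
    "animal": "animate entity",
    "vegetation": "animate entity",
    "human": "animal",
}

# Axiomatic part: pairs of terms declared disjoint (assumed for illustration).
DISJOINT: List[Tuple[str, str]] = [("animal", "vegetation")]


def lineage(term: str) -> Set[str]:
    """Return the term itself plus every term above it in the taxonomy."""
    out: Set[str] = set()
    current: Optional[str] = term
    while current is not None:
        out.add(current)
        current = PARENT[current]
    return out


def violates_axioms(asserted_types: Set[str]) -> bool:
    """True if the asserted types entail membership in two disjoint terms."""
    entailed: Set[str] = set()
    for t in asserted_types:
        entailed |= lineage(t)
    return any(a in entailed and b in entailed for a, b in DISJOINT)
```

A bare taxonomy would accept something typed as both "human" and "vegetation"; the added axiom is what lets `violates_axioms({"human", "vegetation"})` flag the clash.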
Example Ontology
BORROWED FROM: http://www.bio.davidson.edu/courses/genomics/2006/martens... 3DN
A Gene Ontology Example
The Information Age: A Sea of Information
- Varying perspectives, methodologies, ideas, and… DATA
- More information than humans can handle
- Extraordinary depth, magnitude, and… CHAOS
- Plenty of human error
RESULT:
- More DOMAINS that are non-interoperable, non-communicative,
isolated, insulated, encapsulated “silos” of information
- Lost at sea? In the sea?
Genetics
Diseases
Ecology
Evolution
Primatology
Cardiology
Informatics Problems that
Contribute to Being Lost at Sea:
- Dumb Beast
- Nonsense-In-Nonsense-Out
- Computer Solipsism
- Human Idiosyncrasy
- Tower of Babel
- Pressures from Insurance Companies
- Legal Pressures
- Human Error: Incorrect Thinking (IT)
IT: Simply Getting the Facts Wrong *
FROM GO, SNOMED, BRIDG, and UMLS
(1) “extracellular region is_a cellular component”
(2) “extrinsic to membrane part_of membrane”
(3) ‘derives from’ confused with ‘develops from’
(4) “both testes is_a testis”
(5) Animal =Def. “A non-person living entity…”
(6) “An ontology is the same thing as a database…”
(7) “An ontology is just a taxonomy…”
* N.B. It may be the case that the examples of IT used in this
presentation have been resolved. No matter, (sadly) there are
legion examples of IT to be found.
IT: Lack of Clear and Coherent Definitions
FROM NCIT, BRIDG, and SNOMED:
(1) Try and Define: Cancer, Gene, Neuropathy, Disease,
Infectious Disease, Bios Itself... admittedly difficult.
(2) Disease Progression =Def. “Cancer that continues to
grow and spread,” and “Increase in size of tumor…,”
and “The worsening of a disease over time”
(3) Person =Def. “Human being”
(4) “European is_a ethnic group”
(5) “Other European in New Zealand is_a ethnic group”
(6) “Mixed ethnic census group is_a ethnic group”
IT: Circular Definitions
FROM GO and BRIDG
(1) Hemolysis of red blood cells
=Def. “The processes by which an
organism effects hemolysis”
Cf. Filtration of kidneys
=Def. “The processes by which an
organism effects filtration (of kidneys)”
(2) Ingredient =Def. “A substance that acts as an ingredient within a
product. Note that ingredients may also have ingredients.”
(3) Protection from natural killer cell mediated cytolysis =Def. “The
process of protecting a cell from cytolysis by natural killer cells”
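Circular definitions like these can be flagged, very roughly, by checking how many words of the defined term reappear in its own definition. The sketch below is a plain word-overlap heuristic assumed for illustration (the stop list, `content_words`, and `circularity` are hypothetical names), not a method used by GO, BRIDG, or the ORG.

```python
import re
from typing import Set

# Hypothetical stop list: words too common to signal circularity.
STOP_WORDS: Set[str] = {
    "a", "an", "the", "of", "by", "from", "to", "in",
    "that", "which", "as", "within",
}


def content_words(text: str) -> Set[str]:
    """Lowercase the text and keep alphabetic words that are not stop words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS}


def circularity(term: str, definition: str) -> float:
    """Fraction of the term's content words that reappear in its definition."""
    term_words = content_words(term)
    if not term_words:
        return 0.0
    return len(term_words & content_words(definition)) / len(term_words)
```

A high score only marks a candidate for human review: a definition can reuse a word legitimately, and a circular definition can hide behind synonyms.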
IT: Examples Instead of Definitions
FROM BRIDG
(1) Adverse Event =Def.
(a) “toxic reaction”…
(b) “…untoward occurrence in a subject
administered a pharmaceutical product…”
(c) “An unfavorable and unintended
reaction, symptom, syndrome, or disease
encountered by a subject on a clinical trial…”
Basic Mistakes in Definitions: 101
See Plato’s Euthyphro.
“Holiness is what I’m doing in prosecuting my father…”
At least one reason why we need Philosophers?
(2) Defeasibility =Def. “a line of communication that
is terminated,” “boundaries for software”
IT: Use-Mention Confusion
FROM BIRN, MeSH, NCIT, and HL7
(1) Mouse =Def. “Name for the species Mus musculus”
(2) “National Socialism is_a MeSH Descriptor”
(3) Conceptual Entities =Def. “An organizational
header for concepts representing mostly
abstract entities”
(4) Animal =Def. “a subtype of Living Subject
representing any animal-of-interest to the
Personnel Management domain”
(5) “living subject is_a code system”
IT: Conception/Perception vs. Reality Confusion
FROM NCIT and UMLS
(1) Living subject =Def. “An object representing an organism”
(2) Class performed activity =Def. “The description of applying,
dispensing or giving agents or medications to subjects”
(3) Adverse Event =Def. “An observation of a change in the
state of a subject that is assessed as being untoward…”
(4) Objective Result =Def. “An act of monitoring, recognizing
and noting reproducible measurement…”
(5) “Individual allele is_a act of observation”
(6) “Cancer documentation is_a cancer”
(7) “Bacterium causes experimental model of disease”
Lost at Sea
Lost in the Sea
Domain Ontologies: e.g., genetics, diseases, ecology, evolution, primatology, cardiology
(3) Formal Ontology… Salvation
- A discipline which assists in making communication
between and among domain ontologies possible by
providing a common language and common formal
framework for reasoning.
“The fundamental role of an ontology is to support
knowledge sharing and reuse.”
J. Domingue and E. Motta, "A knowledge-based news server supporting ontology-driven story enrichment and knowledge retrieval," in Knowledge Acquisition,
Modeling and Management: 11th European Workshop, EKAW 99 Proceedings, ed.
D. Fensel and R. Studer (Berlin: Springer, 1999), 104.
(3) Formal Ontology… Salvation
- Concerns, at least:
(a) adoption of a set of basic categories of objects
(b) discerning what kinds of entities fall within each
of these categories of objects
(c) determining what relationships hold within and
amongst the different categories in the
domain ontology.
- Relies on philosophical ontology (thus, people like
Smith, Ceusters, Goldberg, Arp and others in the
Ontology Research Group doing this work).
(3) Formal Ontology… Salvation
- Synonymous (for some) with ‘upper-level,’ ‘higher-level,’ ‘top-level,’ ‘backbone,’ ‘general,’ or ‘generic’
ontology.
- Applied in bioinformatics, intelligence analysis,
management science, and in other scientific and
business fields, where it serves as a basis for the
improvement of classification, information
organization, and automatic reasoning… helping to
navigate the sea of information.
(3) Formal Ontology… Salvation
EXAMPLES:
(a) SUO
Standard Upper Ontology
(b) DOLCE
Descriptive Ontology for Linguistic and
Cognitive Engineering
(c) BFO
Basic Formal Ontology
Formal Ontology is like a “backbone” or “spine” making
communication, interoperability, and optimal dissemination of
information possible between and among domain ontologies.
From This:
Domain Ontologies: e.g., genetics, diseases, ecology, evolution, primatology, cardiology
To This:
Formal Ontology, e.g., BFO
Domain Ontologies: e.g., genetics, diseases, ecology, evolution, primatology, cardiology
RESULT:
No longer lost at sea or lost in the sea
of information, biomedical or otherwise.
Formal Ontology, e.g., BFO
Domain Ontologies: e.g., genetics, diseases, ecology, evolution, primatology, cardiology
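The "backbone" role of a formal ontology can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the genetics terms and all term-to-category assignments are hypothetical, and `align` is an invented helper, not part of BFO or any ORG software.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical annotations: each domain ontology tags its terms with an
# upper-level category, so unrelated domains share one vocabulary of kinds.
GENETICS: Dict[str, str] = {"gene expression": "Process", "chromosome": "Object"}
CARDIOLOGY: Dict[str, str] = {
    "heart": "Object",
    "ECG test": "Process",
    "pumping blood": "Function",
}


def align(domains: Dict[str, Dict[str, str]]) -> Dict[str, List[Tuple[str, str]]]:
    """Group (domain, term) pairs from all domains under their shared category."""
    grouped: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for domain, terms in domains.items():
        for term, category in terms.items():
            grouped[category].append((domain, term))
    return dict(grouped)
```

Calling `align({"genetics": GENETICS, "cardiology": CARDIOLOGY})` puts "gene expression" and "ECG test" side by side under "Process", which is the kind of cross-domain question the isolated silos cannot answer.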
Part II: Basic Formal Ontology (BFO)
BFO: How Does It Work?
General Preliminaries
- Formal: “applicable to all domains of objects...”
Barry Smith and David Woodruff Smith, The Cambridge Companion to
Husserl, ed. Barry Smith and David Woodruff Smith (Cambridge:
Cambridge University Press, 1995), 28.
- Relevancy
- Perspectivalism
- Granularity
- Fallibility
REALISM-BASED ONTOLOGY
(a) Universals
(1) Real Things or Continuants
SNAP shots of reality
(2) Real Processes or Occurrents
SPAN of time
(b) Relations
(which are also universals of a
different type)
SNAP
(Universal: SNAP; Relation: is_a, shown by indentation)
Continuant
- Independent Continuant
  - Object
  - Object Boundary
  - Object Aggregate
  - Fiat Object Part
  - Site
- Dependent Continuant
  - Generically Dependent Continuant
  - Specifically Dependent Continuant
    - Quality
    - Realizable Entity
      - Disposition
      - Function
      - Role
- Spatial Region
  - Zero Dimensional Region
  - One Dimensional Region
  - Two Dimensional Region
  - Three Dimensional Region
SNAP Example:
Continuant
- Independent Continuant
  - Object: Human Heart
  - Object Boundary: Surface of the Heart
  - Object Aggregate: All Hearts in This Room
  - Fiat Object Part: A Biopsy of the Heart
  - Site: Chest Cavity
- Dependent Continuant
  - Generically Dependent Continuant
  - Specifically Dependent Continuant
    - Quality: Pink, Smooth
    - Realizable Entity
      - Disposition: Stops if No Circulation
      - Function: Pumps Blood
      - Role
- Spatial Region
  - Zero Dimensional Region
  - One Dimensional Region
  - Two Dimensional Region
  - Three Dimensional Region
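The SNAP hierarchy lends itself to a nested structure in which every step down is an is_a link. The Python sketch below assumes the continuant tree as commonly drawn for BFO 1.x; the arrangement of the nodes and the helper name `is_a_chain` are assumptions made for illustration, not an official BFO serialization.

```python
from typing import Dict, List, Optional

Tree = Dict[str, "Tree"]

# Assumed layout of the SNAP (continuant) side; children are is_a their parent.
SNAP: Tree = {
    "Continuant": {
        "Independent Continuant": {
            "Object": {},
            "Object Boundary": {},
            "Object Aggregate": {},
            "Fiat Object Part": {},
            "Site": {},
        },
        "Dependent Continuant": {
            "Generically Dependent Continuant": {},
            "Specifically Dependent Continuant": {
                "Quality": {},
                "Realizable Entity": {
                    "Disposition": {},
                    "Function": {},
                    "Role": {},
                },
            },
        },
        "Spatial Region": {
            "Zero Dimensional Region": {},
            "One Dimensional Region": {},
            "Two Dimensional Region": {},
            "Three Dimensional Region": {},
        },
    }
}


def is_a_chain(tree: Tree, target: str) -> Optional[List[str]]:
    """Depth-first search; returns the chain from the root down to `target`."""
    for name, children in tree.items():
        if name == target:
            return [name]
        below = is_a_chain(children, target)
        if below is not None:
            return [name] + below
    return None
```

For the heart example, `is_a_chain(SNAP, "Function")` walks from Continuant down to Function, which is where "Pumps Blood" would be filed; a SPAN term such as "Process" returns `None` because it does not belong on the continuant side.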
SPAN
(Universal: SPAN; Relation: is_a, shown by indentation)
Occurrent
- Processual Entity
  - Process
  - Process Boundary
  - Process Aggregate
  - Fiat Process Part
  - Processual Context
- Spatiotemporal Region
  - Scattered Spatiotemporal Region
  - Connected Spatiotemporal Region
    - Spatiotemporal Instant
    - Spatiotemporal Interval
- Temporal Region
  - Scattered Temporal Region
  - Connected Temporal Region
    - Temporal Instant
    - Temporal Interval
SPAN Example:
Occurrent
- Processual Entity
  - Process: ECG (EKG) Test
  - Process Boundary: Start/End of ECG
  - Process Aggregate: All ECGs in Clinic
  - Fiat Process Part: 2nd Lead Attached
  - Processual Context: Test Context
- Spatiotemporal Region
  - Scattered Spatiotemporal Region
  - Connected Spatiotemporal Region
    - Spatiotemporal Instant: S/T ECG Began
    - Spatiotemporal Interval: S/T Region of ECG
- Temporal Region
  - Scattered Temporal Region
  - Connected Temporal Region
    - Temporal Instant: Moment ECG Began
    - Temporal Interval: Time Occupied
BFO RESOURCES
IFOMIS BFO Website:
http://www.ifomis.uni-saarland.de/bfo/
Barry Smith’s Website:
http://ontology.buffalo.edu/smith/
Barry Smith’s Articles:
e.g., http://ontology.buffalo.edu/smith/articles/SNAP_SPAN.pdf
CONCRETE STEPS:
(1) Explicitly demarcate the entities of domain ontology
(2) Determine the universals and relations in domain
(3) Concretize information in a representational artifact
(4) Regiment the information to ensure:
a) logical, philosophical, and scientific coherence
b) compatibility with other relevant ontologies
c) human intelligibility
(5) Formalize in a computer tractable language
(6) Implement in some specific computing context
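Step (5) can be pictured with a small Python sketch that writes a term out as an OBO-style `[Term]` stanza, one commonly used computer-tractable format. The `EX:` identifiers and the helper `obo_stanza` are made up for illustration; a real ontology would use its own registered ID space and a dedicated editor or library.

```python
from typing import List, Optional


def obo_stanza(term_id: str, name: str, parent_id: Optional[str] = None) -> str:
    """Render one term as a minimal OBO-style stanza with an optional is_a line."""
    lines: List[str] = ["[Term]", f"id: {term_id}", f"name: {name}"]
    if parent_id is not None:
        lines.append(f"is_a: {parent_id}")
    return "\n".join(lines)


# Hypothetical identifiers, used only to show the shape of the output.
print(obo_stanza("EX:0000001", "heart"))
print()
print(obo_stanza("EX:0000002", "human heart", parent_id="EX:0000001"))
```

The output only becomes useful once steps (1) through (4) have settled which universals and relations the stanzas should state.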
A LOT OF THIS ACTUALLY HAPPENS AT MEETINGS…
The Countless Cs of Computational Categorization:
From Cognizance To Coordination To Comfort
Cognizance of Informatics Problems
Cooperation of Researchers, Doctors…
Conferences, Colloquia, Meetings…
Clarity of Terms and Relations
Cogency: Counter-Example Free?
Coherency of Domain Ontologies
Coordination of Domain Ontologies
Computational Tractability
Communicability of Information
Coding of Information Correctly
Convenience of Accessibility to Information
Care of Humans/Animals (First, Do No Harm)
Comfort of Humans/Animals
COORDINATION OF DOMAIN ONTOLOGIES
Part III: The Vision and Mission of the Ontology Research Group (ORG)
THE ONTOLOGY RESEARCH GROUP
(ORG): www.org.buffalo.edu
The ORG currently has three sub-units:
(1) The Ontology, Logic and
Technology Unit (OLT) is engaged in
foundational ontology research and
content development, especially in
the biomedical domain.
(2) The Referent Tracking Unit
(RTU) carries out applied
research and software
development pertaining to
electronic health records and
other data resources in the
biomedical domain.
(3) The Qualitative Spatiotemporal
Reasoning Unit (QSR) is applying
ontological techniques derived from
qualitative spatiotemporal reasoning and
the field of Geographic Information
Systems in order to improve the
representation of canonical anatomy, as
well as the processing of X-ray, MRI, and
other forms of image and signal data.
ORG
OLT
RTU
QSR
The VISION of the ORG is to assist scientific
researchers, especially biomedical
researchers, in providing a single, cumulative,
and algorithmically processable database of
information in their respective scientific
domains.
The MISSION of the ORG is to realize this
vision by supporting researchers in the
creation and application of high-quality
domain ontologies that enable efficient
translational research and optimal clinical
care.
CONCRETELY, THIS MEANS:
- ORG researchers are playing leading roles in
a number of national and international
ontology research consortia, and they have
organized a wide variety of ontology training
and dissemination events.
- The ORG is constantly involved in organizing, and
participating in, workshops, conferences, colloquia,
and other events all around the world.
See: http://org.buffalo.edu/rarp/Presentations.html
...AS WELL AS PROFFERING OF BFO
COORDINATION OF DOMAIN ONTOLOGIES
A Few ORG Collaborators:
- NCBO http://bioontology.org/
- NCOR http://ncor.us/
- ECOR http://www.ecor.uni-saarland.de/home.html
- OBO Foundry Project http://obofoundry.org/
- UB http://philosophy.buffalo.edu/contrib/graduate/areas_of_study/phd.shtml
- IFOMIS http://www.ifomis.uni-saarland.de/
- RIDE http://www.srdc.metu.edu.tr/webpage/projects/ride/
- Industrial Collaborations
Medtuity, Inc. http://www.org.buffalo.edu/RTU/indcollabs.html
Sigmund Software http://www.sigmundsoftware.com/
Thank You