WüsteriaMar03 - Buffalo Ontology Site

International Standard Bad Philosophy
Barry Smith
http://ontologist.com
Organism
Organ
10⁻¹ m
Tissue
Cell
10⁻⁵ m
Organelle
Protein
DNA
10⁻⁹ m
A new golden age of classification
30,000 genes in human
200,000 proteins
100s of cell types
100,000s of disease types
1,000,000s of biochemical pathways
(including disease pathways)
… legacy of Human Genome Project
“annotations”
controlled vocabularies
How can we overcome incompatibilities
between different scientific index
terms?
immunology
genetics
cell biology
Open Biological Ontologies
Consortium
Gene Ontology
Cell Ontology
Sequence Ontology
Mouse Anatomy Ontology
etc.
http://obo.sourceforge.net/
Unified Medical Language System (UMLS)
UMLS Metathesaurus:
1 million biomedical concepts
2.8 million concept names
from more than 100 controlled vocabularies
and classifications
built by US National Library of Medicine
UMLS
a compendium of source vocabularies including:
SNOMED (Systematized Nomenclature of
Medicine)
ICD (International Classification of Diseases)
MeSH – Medical Subject Headings
Foundational Model of Anatomy
LOINC (Logical Observation Identifiers
Names and Codes)
To reap the benefits of standardization
we need to make ONE SYSTEM out of
many different terminologies
=
UMLS “Semantic Network”
nearest thing to an “ontology” in the UMLS
UMLS SN
134 Semantic Types
54 types of edges (relations)
yielding a graph containing more than 6,000
edges
Fragment of UMLS SN
Axioms
UMLS Semantic Network
entity
is_a
physical
object
organism
conceptual
entity
Fruit
similarTo
Vegetable
NarrowerTerm
Orange
synonymWith
Apfelsine
Graph with labelled edges (similarTo,
Narrower, synonymWith)
Fixed set of edge labels (a.k.a.
relations)
Goble & Shadbolt
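The idea of a graph whose edges carry labels drawn from a fixed set can be sketched as follows. This is a minimal illustration, not the UMLS data model: the class names `Relation` and `LabelledGraph` are hypothetical, and the edge directions (for example Fruit NarrowerTerm Orange) are an assumption read off the flattened diagram above.

```python
from enum import Enum


class Relation(Enum):
    """The fixed set of edge labels; any other label is rejected."""
    SIMILAR_TO = "similarTo"
    NARROWER_TERM = "NarrowerTerm"
    SYNONYM_WITH = "synonymWith"


class LabelledGraph:
    """Hypothetical helper: a set of (source, relation, target) triples."""

    def __init__(self):
        self.edges = set()

    def add_edge(self, source, label, target):
        # Enum lookup by value raises ValueError for a label outside the fixed set.
        relation = Relation(label)
        self.edges.add((source, relation, target))

    def neighbours(self, source, label):
        relation = Relation(label)
        return sorted(t for s, r, t in self.edges if s == source and r is relation)


# Edge directions below are assumed from the diagram, not stated in the slides.
graph = LabelledGraph()
graph.add_edge("Fruit", "similarTo", "Vegetable")
graph.add_edge("Fruit", "NarrowerTerm", "Orange")
graph.add_edge("Orange", "synonymWith", "Apfelsine")
```

Note that the graph accepts any string as a node. Nothing in such a structure says what kind of thing a node stands for, which is the question the following slides press.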
UMLS SN
is_a =def.
If one item ‘is_a’ another item
then the first item is more specific
in meaning than the second item.
(Italics added)
fish is_a vertebrate
copulation is_a biological process
both testes is_a testis
plant parts is_a plant
Fragment of UMLS SN
What are the nodes in this graph?
Almost all nodes are linked to other nodes
by a multiplicity of different types of edges
Compare: swimming is healthy
swimming has 8 letters
Semantic Network Definition:
Concept =def. An abstract concept, such as a
social, religious, or philosophical concept
How can concepts figure as relata
of these relations?
part_of =def. Composes, with one or more
other physical units, some larger whole
causes =def. Brings about a condition or
an effect.
contains =def. Holds or is the receptacle
for fluids or other substances.
How can a concept serve as a
receptacle for fluids or other
substances?
How can concepts stand in relations
such as affects or causes?
connected_to =def.
Directly attached to another
physical unit as tendons are
connected to muscles.
How can a concept be directly attached to
another physical unit?
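The objection can be made mechanical. The sketch below assumes, as one reading of the definitions quoted above, that part_of, contains and connected_to only make sense when both relata are physical; the function name `check_assertion` and the two kind labels are hypothetical and not part of the UMLS.

```python
PHYSICAL = "physical object"
CONCEPTUAL = "conceptual entity"

# Relations whose quoted definitions speak of physical units, receptacles
# or direct attachment (an interpretive assumption, not a UMLS rule).
PHYSICAL_ONLY_RELATIONS = {"part_of", "contains", "connected_to"}


def check_assertion(subject_kind, relation, object_kind):
    """Return a list of complaints; an empty list means the check passes."""
    problems = []
    if relation in PHYSICAL_ONLY_RELATIONS:
        for role, kind in (("subject", subject_kind), ("object", object_kind)):
            if kind != PHYSICAL:
                problems.append(f"{relation}: {role} is a {kind}, not a {PHYSICAL}")
    return problems


# Tendons are connected to muscles: both relata are physical, so no complaint.
print(check_assertion(PHYSICAL, "connected_to", PHYSICAL))

# A concept serving as a receptacle for fluids: flagged.
print(check_assertion(CONCEPTUAL, "contains", PHYSICAL))
```

If every node of the network is a concept, as the Semantic Network's own wording suggests, then every assertion using these relations fails a check of this kind.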
Fragment of UMLS SN
Experimental Model of Disease affects
Fungus
Bacterium causes Experimental Model of
Disease
Biomedical or Dental Material causes
Mental or Behavioral Dysfunction
Manufactured Object causes Disease or
Syndrome
part of the UMLS Semantic Network
UMLS Semantic Network
entity
physical
object
event
conceptual
entity
conceptual entity
Organism Attribute
Finding
Idea or Concept
Occupation or Discipline
Organization
Group
Group Attribute
Intellectual Product
Language
Conceptual Entity
Idea or Concept
Functional Concept
Qualitative Concept
Quantitative Concept
Spatial Concept
Body Location or Region
Body Space or Junction
Geographic Area
Molecular Sequence
Amino Acid Sequence
Carbohydrate Sequence
Nucleotide Sequence
Tonawanda
is an Idea or Concept
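How the conclusion about Tonawanda follows can be sketched by walking the is_a chain. The parent links below are reconstructed from the flattened hierarchy listed above, so the nesting is an assumption; so is the assignment of Tonawanda to Geographic Area, and `ancestors` is a hypothetical helper.

```python
# child -> parent, reconstructed from the flattened listing (nesting assumed)
IS_A = {
    "Idea or Concept": "Conceptual Entity",
    "Functional Concept": "Idea or Concept",
    "Qualitative Concept": "Idea or Concept",
    "Quantitative Concept": "Idea or Concept",
    "Spatial Concept": "Idea or Concept",
    "Body Location or Region": "Spatial Concept",
    "Body Space or Junction": "Spatial Concept",
    "Geographic Area": "Spatial Concept",
    "Molecular Sequence": "Spatial Concept",
    "Amino Acid Sequence": "Molecular Sequence",
    "Carbohydrate Sequence": "Molecular Sequence",
    "Nucleotide Sequence": "Molecular Sequence",
}


def ancestors(semantic_type):
    """Follow is_a links upward and return every type reached, nearest first."""
    chain = []
    while semantic_type in IS_A:
        semantic_type = IS_A[semantic_type]
        chain.append(semantic_type)
    return chain


# Assumption: the town of Tonawanda is typed as a Geographic Area.
print(ancestors("Geographic Area"))
# → ['Spatial Concept', 'Idea or Concept', 'Conceptual Entity']
```

Because is_a is read transitively, anything typed as a Geographic Area inherits the classification Idea or Concept, which is how a town ends up counted as an idea.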
gene part_of cell component
body system conceptual_part_of
fully formed anatomical structure
conceptual
entity
idea or concept
functional concept
body system
But:
Gene or Genome is defined as: “A
specific sequence … of nucleotides
along a molecule of DNA or RNA …”
and
nucleotide sequence is_a conceptual
entity
entity
physical
object
conceptual
entity
idea or concept
confusion of
entity and
concept
functional concept
body system
Functional Concept:
Body system is_a Functional Concept.
but:
Concepts do not perform functions or have
physical parts.
This:
is not a
concept
Problem: Confusion of Is_A and
Has_Role
Physical Entity
Chemical Entity
Chemical
Viewed
Structurally
Chemical
Viewed
Functionally
Chemical
Chemical
Viewed
Structurally
Inorganic Chemical
Organic Chemical
Chemical
Viewed
Functionally
Enzyme
Biomedical or
Dental Material
Chemical Viewed Structurally vs.
Chemical Viewed Functionally
reflects a distinction between types of
classification – not between types of entity
compare a classification of people into:
tall people,
people who play tennis,
people who look like flies from a distance
etc.
The Hydraulic Equation
BP = CO*PVR
arterial blood pressure is directly
proportional to the product of blood flow
(cardiac output, CO) and peripheral
vascular resistance (PVR)
Confusion of Ontology and Epistemology
blood pressure is an Organism Function,
cardiac output is a Laboratory or Test Result
or Diagnostic Procedure
BP = CO*PVR thus asserts that
blood pressure is proportional either to a
laboratory or test result or to a diagnostic
procedure
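The category mistake can be illustrated with a small worked example. In this sketch the units (L/min for cardiac output, mmHg·min/L for resistance) and all class and function names are assumptions introduced for illustration; the slides give only BP = CO*PVR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CardiacOutput:
    """A physiological quantity on the side of the patient (assumed unit: L/min)."""
    litres_per_minute: float


@dataclass(frozen=True)
class VascularResistance:
    """Peripheral vascular resistance (assumed unit: mmHg*min/L)."""
    mmhg_min_per_litre: float


@dataclass(frozen=True)
class LabReport:
    """A record of a measurement: an item of documentation, not a quantity."""
    text: str


def blood_pressure_mmhg(co, pvr):
    """BP = CO * PVR, defined only for the quantities themselves."""
    if not isinstance(co, CardiacOutput) or not isinstance(pvr, VascularResistance):
        raise TypeError("BP = CO*PVR relates physiological quantities, not records")
    return co.litres_per_minute * pvr.mmhg_min_per_litre


print(blood_pressure_mmhg(CardiacOutput(5.0), VascularResistance(18.0)))
# → 90.0
```

Passing a `LabReport` where a `CardiacOutput` is expected raises `TypeError`. The equation multiplies quantities in the organism; a test result or a diagnostic procedure is a different kind of thing and cannot stand in for one of the factors.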
Disease History
is classified by UMLS under Health Care
Activity
This runs together
the history or course of a disease on the
side of the patient (ontology)
with
the act of eliciting that history
(epistemology).
Further Principles
univocity: terms should have the same
meanings (and thus point to the same
referents) on every occasion of use
UMLS-SN:
‘organization’ = body plan
‘organization’ = social organization
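A univocity check is easy to sketch: collect the meanings attached to each term and report any term that has more than one. The function name and the glossary format are hypothetical, and the third glossary entry is invented purely for contrast.

```python
from collections import defaultdict


def univocity_violations(glossary):
    """glossary: iterable of (term, meaning) pairs.

    Returns a dict mapping each term used with more than one meaning
    to the sorted list of those meanings.
    """
    senses = defaultdict(set)
    for term, meaning in glossary:
        senses[term.lower()].add(meaning)
    return {term: sorted(found) for term, found in senses.items() if len(found) > 1}


glossary = [
    ("organization", "body plan"),
    ("organization", "social organization"),
    ("fish", "aquatic vertebrate"),  # hypothetical entry, not from the UMLS-SN
]

print(univocity_violations(glossary))
# → {'organization': ['body plan', 'social organization']}
```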
rules for definitions
intelligibility: the terms used in a definition
should be simpler (more intelligible) than
the term to be defined
otherwise the definition provides no
assistance to the understanding (for
humans)
or is unprocessable (for machines)
UMLS-SN Semantic Relations
Semantic Relation:
functionally_related_to
TUI: T139
Definition: Related by the carrying out of
some function or activity.
Inverse: functionally_related_to
An unintuitive top-level
with unintuitive (or no) rules for classification
and definition
leads to coding errors
difficulties in training of curators
obstacles to alignment with other ontology
and terminology systems
obstacles to harvesting content in automatic
reasoning systems
The UMLS Semantic Network
is ‘an upper-level ontology … in which all
concepts are given a consistent and
semantically coherent representation’.
Alexa McCray, “An upper level ontology for the
biomedical domain”. Comp Functional Genomics
2003; 4: 80-84.
CEN/TC251 ENV 12264:
This ENV is applicable to the description of
the categorial structure of systems of
concepts supporting computer-based
terminological systems, including coding
systems, for health-care.
– concept : “unit of thought constituted through
abstraction on the basis of properties common to
a set of one or more referents”
BUT THEY NEVER IN FACT LOOK AT THE
REFERENTS AT ALL!
ISO/TC215/N142: Health informatics —
Vocabulary of terminology
– The purpose of this International Standard is
to define a set of basic concepts required to
describe and discuss formal representation of
concepts and characteristics, for use
especially in formal computer based concept
representation systems.
– concept: “unit of knowledge created by a
unique combination of characteristics”
THEY ARE ALREADY TWO LEVELS REMOVED
FROM THE REFERENT!
CEN/TC 251
Europe-wide acceptance of the need for a
comprehensive, communicable and
secure pan-European Electronic Health
Record as a prerequisite for high-quality
healthcare.
A problem for terminology
integration
EHRs across Europe need to use equivalent
terms for equivalent disorders.
standardized clinical terminologies now exist
in an abundance of different flavors.
the Unified Medical Language System
(UMLS) contains over 100 systems, with in
all some 3 million medical “concepts”
we need international standards for
terminologies
responsibility of ISO Technical Committee
(TC) 37
ISO TC 37
founded in 1952
by Eugen Wüster (1898-1977)
businessman, saw-manufacturer, and
professor of woodworking machinery in
the Vienna Agricultural College
fan of the Vienna Circle unified science
movement
devotee of Esperanto
Wüster
chaired TC 37 for the first 20 years of its
existence
was principal author of the documents
which have served as the basis for work in
terminology standardization ever since.
astonishing influence due to normative
character of ISO definitions
1935
Wüster’s theory of concept acquisition
All knowledge of concepts starts out from
sensory experience:
The new-born infant finds itself “constantly
amidst a panoply of diverse sensory
impressions”. Soon, it begins to “analyse” this
sensory mosaic and forms the opinion that the
perceived impressions start from objective
constructs which partly belong to its own body
and partly are separated from it.”
The child begins thereupon to mentally subdivide the sensory mosaic into individual
objects (and Wüster stresses repeatedly in
this connection that objects in reality are
constructed by human beings, and that
there is a high degree of arbitrariness and
variability to such construction)
Initially the child deals only with
“individual objects”
Every object “is for the child something unique,
like a particular person.”
But the child can also remember objects, and a
memory that is not associated with sensory
impressions “constitutes a ‘concept’.”
The concept of an individual is an ‘individual
concept’. Examples are: “‘Napoleon’ or the
concept of my fountain pen.” Concepts originate
in memory.
Then the child notices that there
are individual objects which are
“interchangeably alike”
e.g. apples or bricks or cans of paint
– objects which are also given the same
name by older speakers of the language.
Note that the thesis according to which
concepts are acquired via perception of
similars has long since been abandoned by
cognitive scientists [6].
Here general concepts enter the
scene
“The child learns to blend the individual
concepts of such objects in its thinking”
and thus arrives at general concepts,
which are, like individual concepts
“thought (=mental) objects. They exist only
in the heads of people.”
Communication
If “a speaker wishes to draw the attention
of an interlocutor to a particular individual
object, which is visible to both parties or
which he carries with him, he only has to
point to it”.
... Gulliver, The Academicians of Lagado
Recall: the Academicians of Lagado
held that since Words are only Names for
Things, it would be more convenient for all
Men to carry about them, such Things as
were necessary to express the particular
Business they are to discourse on …
which hath only this Inconvenience
attending it, that if a Man’s Business be
very great, and of various kinds, he must
be obliged in Proportion to carry a greater
bundle of Things upon his Back
Otherwise, “the only thing
available is the individual concept
of the object, provided that it is readily accessible in the
heads of both persons.” (Those engaged
in communication about, say, Napoleon,
are thus somehow required to gain access
to the interiors of each other’s heads.)
individual concepts can be grouped
together into general concepts
“Several individual apples, for example,
provide together the general concept ‘apple’.
[This,] together with the concepts ‘pear’,
‘plum’ etc. then yield the superordinate
concept ‘fruit’.”
The formation of concepts at this level, too, is
“highly dependent on human discretion.”
so general concepts can be grouped
together into concepts of higher degrees
of abstraction.
Concepts and their extensions
Wüster hereby runs together individual
object and individual concept, as is
manifested in his notion of the extension of
a concept which he defines as both “the
totality of all subordinated concepts” and
“the totality of all individual objects which
fall under the concept”.
There is a ‘realm’ (Reich) of concepts
Terminology work is designed to provide
clear delineations of the concepts in this
realm, and only when such delineations
have been achieved can terms be
assigned.
Terms materialize concepts
Concepts can also be ‘materialised’
through token individuals, e.g. a chess
piece or a playing card, which are called
“representatives”. But the most important
materializations are signs:
Terms materialize concepts
A proper name such as ‘Napoleon’, for example,
can serve as a “substitute object”, which can be
used to bring the corresponding individual
concept “to the consciousness of the
interlocutor”.
The sign is available as a substitute object “if it is
a concept [sic] which can easily be materialised
at any time; that is, a suitable general concept ...
in the form of a phonetic or graphic sign. For a
phonetic concept (a phoneme or phoneme
combination) and a graphic sign concept can
easily be materialised at any time.”
In this way, objects and
concepts are confused not only
with each other, but also with signs.
For even signs are special kinds of
concepts, in Wüster’s thinking, in spite of
the fact that, as we recall, concepts “exist
only in the heads of people”.
Concepts and Characteristics
before we can assign a term to a concept we
must first “delineate” the concept, which means:
list the totality of characteristics which form what
he calls its content or intension.
The characteristics of the concept bulb are:
lamp
light-emitting stuff
solid stuff
emission of light through electrically
generated heat.
What are characteristics?
In some passages Wüster refers to them as
if they were themselves concepts (so that
characteristics, like concepts, would be in
the heads of people).
In other passages he refers to
characteristics as if they were properties of
objects.
More recent ISO documents
have sought to resolve this conflict
ISO-1087:1990 defines a concept as: A
unit of thought constituted through
abstraction on the basis of properties
common to a set of objects. A
characteristic it defines as: A mental
representation of a property of an object
serving to form and delimit its concept.
ISO in 2000
Concept = A unit of knowledge created by a
unique combination of characteristics.
Characteristic = An abstraction of a property
of an object or of a set of objects.
Object = Anything perceivable or
conceivable (a unicorn being given as a
specific example of the latter).
The problem
The concept-based approach leaves
those involved in the authoring and
maintenance of terminologies unsure as
to whether their task is the representation
of ideas in people’s heads, or of
meanings of words, or of types of entities
and relations in the world.
SNOMED-CT:
“Disorders are concepts in which there is an
explicit or implicit pathological process
causing a state of disease which tends to
exist for a significant length of time under
ordinary circumstances.”
Given SNOMED’s definition of concepts as
“unique units of thought”, this would imply that
all disorders are imagined.
Wüsterianism in Medicine
Wüster: concepts are formed on the basis of
much human discretion and arbitrariness
→ his ideas are well-suited to the area of medical
terminology, which is subject to the constant
coinage of novel terms.
But in medicine we often have to deal with
families of entities which manifest no or very few
characteristics “identifiable in encounters of
similars”, and certainly insufficiently many such
characteristics to allow definitions of
corresponding concepts.
Hence
some 85% of SNOMED-CT’s concepts
remain undefined.
A tumour starts out as (initially
undetectable) mutations in a small number
of cells and then becomes transformed by
degrees into a full-fledged object in its own
right on the scale of coarse anatomy.
processes in medicine
embryological development, aging, the
history of a disease
No way to isolate in perception certain
“essential properties” which could be
identified as characteristics of
corresponding general concepts.
(Vienna Circle idea of logical reduction)
Wüster’s notion of concept
which underlies the terminology standards of
TC 37
has nothing to do with medicine at all.
He was concerned primarily with
standardization in the domain of artefacts,
of manufactured products
The Machine Tool. An Interlingual Dictionary
of Basic Concepts
Artefacts truly are such as to
manifest characteristics identifiable
in encounters of similars
– because they have been manufactured as
such.
Vocabulary itself is treated by Wüster and
his TC 37 followers “as if it could be
standardised in the same way as types of
paint and varnish [TC 35] or aircraft and
space vehicles [TC 20]” [12, p. 12].
Object
ISO/IEC JTC1 SC36 N0579:
an object is anything that can be perceived or
conceived.
Some objects, concrete objects such as a
machine, a diamond, or a river, shall be
considered material; other objects are to be
considered immaterial or abstract, such as each
manifestation of financial planning, gravity,
flowability, or a conversion ratio; still others are
to be considered purely imagined, for example,
a unicorn, a philosopher’s stone or a literary
character.
Are processes objects? Are they concrete or
abstract? Are characteristics objects? Are
concepts objects? Are dispositions,
functions, qualities, limbs, organs, bodily
cavities, blood flow, apoptosis objects? Are
they concrete or abstract? Material or
immaterial? Real or imagined?
Ontology = the task of creating a
coherent, principled framework
in which coherent answers to such
questions can be given. This task is of increasing
importance to the future of medical coding
and of the EHR.
ISO makes clear its position as to
the importance of this task for the
future of terminology research:
In the course of producing a terminology,
philosophical discussions on whether an
object actually exists in reality are beyond
the scope of this standard and are to be
avoided. Objects are assumed to exist and
attention is to be focused on how one
deals with objects for the purposes of
communication.
Unfortunately
ISO’s definitions of terms like ‘object’ and
‘concept’ have been propagated in ever
wider circles through all subsequent
generations of relevant standards because
of ISO’s own rules governing re-use
But they are so vague as to leave the
putative user of the corresponding
standards entirely in the dark.
An Ontological Basis for Coding
Systems and the EHR
European and international efforts towards the standardization of
biomedical terminology and electronic healthcare records have been
stymied by International Standard Bad Philosophy.
True, some critical remarks about certain conceptions in ISO TC 37
documents have been recently advanced, and the proposed alternative
certainly represents an advance on Wüster in its treatment of individual
objects. As concerns what is general, however, it still runs together objects
and concepts, identifying specific kinds or types of phenomena in the world
with the general concepts created by human beings. [14] In this way, like
Wüster, it leaves itself with no benchmark in relation to which given
concepts or concept-systems could be established as correct or incorrect.
Bacteria would still have properties different from those of trees if there
were no humans able to form the corresponding concepts. Now, however, it
is time to do better, and to absorb the best ontological theories and tools
which contemporary philosophy has to offer – and this means above all the
right sort of ontology, an ontology that is able to relate explicitly and unambiguously
to the universal kinds or types in reality as well as to the individual
tokens which are their instances [15]. Such kinds or types are organism,
cell, neurulation, sleep, death.
It is the job of medical terminology
systems
to represent universals (types, kinds), not
concepts in people’s heads
a role must be played in improving
biomedical terminologies and coding
systems by the resolute imposition of a
coherent ontology of universals in place of
the obfuscations of Eugen Wüster.
Electronic Health Records
Our idea is that such an ontology will enable
us to introduce into EHR and coding
systems a coherent representation of the
different categories of entities in reality
and of the relations between them as a
substitute for the confused treatments of
‘object’, ‘concept’ and ‘characteristic’ that
have predominated hitherto.
In this way, it can help us also in ensuring
that the coding systems and terminologies
developed henceforth are compatible with
each other and with the EHRs which they
were designed to support.
Many hold that it will suffice to establish
communication standards for the EHR if
we can only establish a way to refer
unambiguously to “concepts” as units of
knowledge agreed upon by domain
experts and defined in formal ways.
But even under such ideal conditions the
focus on concepts would be misplaced.
To allow clinical data registered in EHRs to
be used for further automated processing,
it should be clear whether entities in the
associated coding system refer to
diseases, or to statements made about
diseases, to acts on the part of physicians,
or to documents in which such acts are
recorded, or to observations of such acts,
or to statements about such observations.
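One way to make this requirement concrete is to tag every code with the kind of entity it refers to. The sketch below is a hypothetical data model: the enum members mirror the list in the preceding slide, while the class names, identifiers and labels are invented for illustration and belong to no real coding system.

```python
from dataclasses import dataclass
from enum import Enum


class ReferentKind(Enum):
    """What a code is about, following the distinctions drawn above."""
    DISEASE = "disease"
    STATEMENT_ABOUT_DISEASE = "statement about a disease"
    PHYSICIAN_ACT = "act on the part of a physician"
    DOCUMENT_RECORDING_ACT = "document recording an act"
    OBSERVATION_OF_ACT = "observation of an act"
    STATEMENT_ABOUT_OBSERVATION = "statement about an observation"


@dataclass(frozen=True)
class Code:
    identifier: str   # hypothetical identifier
    label: str
    kind: ReferentKind


def group_by_kind(codes):
    """Group code identifiers by the kind of entity they refer to."""
    grouped = {}
    for code in codes:
        grouped.setdefault(code.kind, []).append(code.identifier)
    return grouped


# Invented entries for illustration only.
codes = [
    Code("X1", "influenza", ReferentKind.DISEASE),
    Code("X2", "diagnosis: patient has influenza", ReferentKind.STATEMENT_ABOUT_DISEASE),
    Code("X3", "taking of disease history", ReferentKind.PHYSICIAN_ACT),
]

for kind, identifiers in group_by_kind(codes).items():
    print(kind.value, identifiers)
```

With the kind made explicit, an automated process can refuse to count a statement about a disease as a case of that disease.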
Applying a sound realist ontology to coding
systems and to EHR architectures
means in the first place ensuring that the
latter are calibrated not to the denizens of
Wüster’s “realm of concepts” but rather to
those entities in reality – such as particular
patients, diseases, therapies and the
universals which they instantiate – which
form the subject matter of healthcare.
Better documentation
coding systems built with the aid of a
robust realist ontology will be consistent –
not, perhaps, with information models
concocted by database designers from
afar – but rather with those
commonsensical intuitions about the
objects and processes in reality which are
shared by patients and healthcare
providers.
In sum
GO, UMLS, etc.
remain at the level of TERMINOLOGY
What we need is a REFERENCE
ONTOLOGY = a formal theory of the
foundational relations which hold
ONTOLOGIES together
The solution
we need to distinguish clearly between
concepts and universals:
concepts are creatures of cognition
universals are invariants (types, kinds,
universals) out there in reality
NCOR – The future
National Center for Ontological Research
Buffalo
Stanford
(OBO Consortium: Berkeley; Jackson
Labs, ...)
(University of Washington, Seattle)...
(EBI, Cambridge UK; Swiss Bioinformatics
Institute, ...)
Note
we are not claiming that to establish the ontology
of the world of medical universals will be a
simple task. There is, it is clear, no single unified
perspective on which all reasonable persons
must agree if they would only open their eyes.
Hence the popularity of T. S. Kuhn’s ideas on
conflicting paradigms (and of Wüster’s own
ideas on the human-induced arbitrariness
involved in the “construction” of both objects and
concepts).
Against both Kuhn and Wüster
we accept existence of a plurality of
different perspectives on the world
(perspectives corresponding, for example,
to different life science disciplines, or to
different biomedical terminologies, or to
the different axes in SNOMED). But the
world itself is one.
Because of its immense complexity
this one world is accessible to us only in terms of a
wide variety of different sorts of complementary
perspectives. These different perspectives
correspond broadly to the concept-systems of
Wüster and his followers. But the latter’s running
together of concept and object means that they
lack any benchmark in relation to which the
integration of concept-systems could be
effected. For us, in contrast, the world itself is
able to serve also as benchmark for such
integration.
Thus first,
we need to bring down the
International Standards
Organization
The End