IAO-Nov-12-2013 - Buffalo Ontology Site

The Information Artifact
Ontology 1: Roots in BFO
Barry Smith
Standardized information
artifact metadata:
1. hardware
3
IAO Clinical
Continuant
Independent
Continuant
Quality
Occurrent
Dependent
Continuant
Realizable
Dependent
Continuant
Disposition
Role
Process
5
Types and Instances
[Figure: types and instances. Types: Geographic Coordinates Set, Spatial Region, Distance Measurement Result, Geopolitical Entity, Village, Village Name, Well, Latrine. Instances: ‘VT 334 569’, ’16 meters’, ‘Khanabad Village’. Relations shown: instance_of, is_a, designates, has location, measurement_of, located near, located in.]
6
Continuant
Independent
Continuant
Quality
Disposition
Specifically
Dependent
Continuant
Realizable
Dependent
Continuant
Generically
Dependent
Continuant
Gene
Sequence
Information
Artifact
Role
7
Specifically Dependent Continuants
Specifically
Dependent
Continuant
if any bearer ceases to exist,
then the quality or function
ceases to exist
the color of my skin
the function of my heart
Quality, Pattern
Realizable
Dependent
Continuant
8
Generically Dependent Continuants
Generically
Dependent
Continuant
if one bearer ceases to exist, then
the entity can survive, because
there are other bearers
(copyability)
the pdf file on my laptop
Information
Object
Sequence
the DNA (sequence) in this
chromosome
9
Information artifacts
pdf file
email
poem
symphony
algorithm
symbol
- can migrate from one information bearer
to another
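The contrast between specific and generic dependence can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `Bearer`, `Quality` and `InformationArtifact` and the method `copy_to` are hypothetical and are not part of BFO or IAO.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Bearer:
    """An independent continuant that can bear qualities or carry information."""
    name: str
    exists: bool = True


@dataclass
class Quality:
    """Specifically dependent: tied to exactly one bearer."""
    label: str
    bearer: Bearer

    def exists(self) -> bool:
        # If the bearer ceases to exist, the quality ceases to exist.
        return self.bearer.exists


@dataclass
class InformationArtifact:
    """Generically dependent: needs some bearer, but no particular one."""
    label: str
    bearers: List[Bearer] = field(default_factory=list)

    def copy_to(self, bearer: Bearer) -> None:
        # Copyability: the same artifact gains another bearer.
        self.bearers.append(bearer)

    def exists(self) -> bool:
        # Survives as long as at least one bearer survives.
        return any(b.exists for b in self.bearers)


laptop = Bearer("my laptop")
server = Bearer("a backup server")
skin = Bearer("my skin")

pdf = InformationArtifact("the pdf file", [laptop])
color = Quality("the color of my skin", skin)

pdf.copy_to(server)      # the pdf migrates to a second bearer
laptop.exists = False    # one bearer of the pdf is destroyed
skin.exists = False      # the only bearer of the quality is destroyed

pdf_survives = pdf.exists()
color_survives = color.exists()
```

In this sketch the pdf outlives the laptop because a copy remains on the server, while the color has no other bearer to fall back on.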
10
Continuant
Independent
Continuant
Material
Entity
Specifically
Dependent
Continuant
Quality
Generically
Dependent
Continuant
Gene
Sequence
Information
Artifact
Information
Bearing
Entity
11
Continuant
Independent
Continuant
Material
Entity
Information
Bearing
Entity
Specifically
Dependent
Continuant
Quality
depends_on
Information
Quality
Entity
Generically
Dependent
Continuant
Information
Artifact
concretized_by
12
http://bioportal.bioontology.org/ontologies/IAO
13
IAO: information content entity
=def. an entity that is generically
dependent on some artifact and stands in
the relation of aboutness to some entity
Problems
Is a work of fiction about something?
Is a fake cover story for a fake terrorist about
something?
Is an erroneous entry in a database about
something?
14
Generically dependent continuants
such as plans, laws …
are concretized in specifically dependent
continuants
(the plan in your head, the protocol being
realized by your research team, the law
being implemented by this government
agency)
15
War and Peace is an instance
Specifically
Dependent
Continuant
Independent
Continuant
instance_of
This bound
copy of
War and Peace
instance_of
War and Peace
depends_on
quality
Generically
Dependent
Continuant
instance_of
The novel
War and Peace
16
17
Instances vs Copies
The novel War and Peace has many bound
copies
The quality spherical has many instances
But having copies and having instances are two
different things
Information entities exist in a way which makes
them dependent on provenance, and on
processors, in a way in which types are not
18
What is a work of literature?
Is War and Peace a type or an
instance?
• If War and Peace were a type, and the
copies of War and Peace in my library and
in your library were instances, then
there would be many War(s) and
Peaces.
Hence War and Peace is an instance.
19
There are not two Declarations of
Independence
There can be two copies of the US Declaration
of Independence
There cannot be two US Declarations of
Independence
There cannot be subtypes of the US
Declaration of Independence
Hence the US Declaration of Independence is an
instance and not a type.
20
Rule for types
Their names are pluralizable
There can be three people
There cannot be three Michelle Obamas.
Information Content Entities are GDCs =
entities which can exist in many copies
21
Generically dependent continuants
are distinct from types
they have a different kind of
provenance
◦ Aspirin as product of Bayer GmbH
◦ aspirin as molecular structure
◦ This Financial Report is submitted to the
SEC
22
Information content entity
prior
intent to
be
directed
directedne
ss
rulefor
denotation
governed communicatio as part
ness
n
informs
directedne
ss
soldier          no    yes   yes   no    yes
normal science   yes   yes   yes   yes   yes
doodle           no    no    no    no    no
fake message     no    yes   yes   no    yes
geez louise      yes   no    no    yes   no
googoogoo        no    no    no    yes   no
Passport vs. Boarding Pass
Your passport can be copied, but a
copy of your passport cannot be your
passport
Boarding pass can be copied, and a
copy of your boarding pass can be
your boarding pass
24
Terminology of types and
tokens, vs. terminology of
types and instances
25
Generically dependent continuants
are concretized in specifically dependent
continuants
Beethoven’s 9th Symphony is concretized in the
pattern of ink marks which make up this score in
my hand – this is an information quality entity: a
BFO:quality of the material (information bearing
entity) that is the score
26
Generically dependent continuants
(GDCs)
can be concretized in multiple different
media (paper, silicon, neuron …)
27
Information Content Entity (science)
protocol
database
theory
ontology
gene list
publication
result
...
29
Information Content Entity
(labeling)
serial number
batch number
grant number
person number
name
address
email address
URL
...
30
Information Content Entity
(Finance)
• Financial Report
• Financial Report in XBRL for submission to GAAP
• Business Report
31
Type or instance
Continuant
Independent
Continuant
human being,
protocol
document
Dependent
Continuant
pattern of
ink marks
Occurrent
(Process)
Applying
the protocol
Side-Effect …
32
Continuant
Independent
Continuant
Occurrent
Dependent
Continuant
Information
Content
Entity
Action
creating a datum
33
Generically dependent continuants
do not require specific media (paper,
silicon, neuron …)
34
Generically Dependent Continuants
Generically
Dependent
Continuant
Information
Content Entity
.pdf file
Gene
Sequence
.doc file
instances
35
Generically dependent continuants
are concretized in specifically
dependent continuants
Beethoven’s 9th Symphony is
concretized in the pattern of ink
marks which make up this score in
my hand
36
Purpose of an Information Artifact
Descriptive purpose
=def. the purpose of describing some portion of
reality
Examples: scientific paper, newspaper article,
diary, experimenter log notebook
Prescriptive purpose
=def. the purpose of prescribing or permitting or
allowing some activity
Examples: a legal code, a license
37
Purpose of an Information Artifact
Directive purpose
=def. the purpose of specifying a plan or method
for achieving something
Examples: instruction, manual, recipe, protocol
Designative purpose
=def. the purpose of uniquely designating some
entity or the members of some class of entities
Examples: a registry of members of an
organization, a phone book, a database linking
proper names of persons with their social
security numbers.
38
40
Steps towards an email ontology
• message has_part header section and body section
• header section has_part a collection of header fields
• header field contains a header name and a header body
• header body may have additional structure based on the header in question
• body may have nested structure and attachments based on MIME
• the body may contain a text version, an HTML version, or both
• the body may contain attachments (files such as images, documents, other emails, etc.)
• header fields may use MIME to include header information in other languages/charsets
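The part structure listed above can be inspected with Python's standard `email` package. This is a minimal sketch, not an email ontology: the sample message, its addresses and the boundary string `XYZ` are made up for illustration.

```python
from email import message_from_string

# A made-up message with a header section and a two-part MIME body.
RAW = (
    "From: alice@example.org\n"
    "To: bob@example.org\n"
    "Subject: Test\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "Hello in plain text.\n"
    "--XYZ\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>Hello in HTML.</p>\n"
    "--XYZ--\n"
)

msg = message_from_string(RAW)

# Header section: a collection of header fields, each a (name, body) pair.
header_fields = msg.items()

# Body section: walk the nested MIME structure and keep the leaf parts.
body_parts = [p.get_content_type() for p in msg.walk() if not p.is_multipart()]

for name, body in header_fields:
    print(f"header field: {name!r} -> {body!r}")
print("body parts:", body_parts)
```

Here the body contains both a text version and an HTML version, matching the case described in the list.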
Steps towards an email ontology
email may have_status draft, sent
email addressee may be in to:
field, cc: field, bcc: field
email may be forwarded
email may be read, unread
email may have priority label
…
42
E-mail Header
43
Email Address Field
A field is an information structure entity
(comparable to cell, margin, space between words, period, comma, etc.)
This means it is not about anything.
Nearly all information content entities have fields as parts
Address field is an information content entity which has a field as part
But address field is about (in some very attenuated sense) the type: address
Similarly the field in a spreadsheet where you fill in the measurement unit
used is an ICE, because it is about (in this same attenuated sense) the type:
measurement unit.
When you fill in the actual address then the resultant field is an ICE which is
about that actual address
BS
44
Information Artifact Ontology 2:
Aboutness
45
Shimon Edelman’s
Riddle of Representation
two humans, a monkey, and a robot
are looking at a piece of cheese;
what is common to the
representational processes in their
visual systems?
46
Answer:
The cheese, of course
47
The real cheese
48
the arrow of intentionality
49
± simple
mental process
content
(putative) target
content of presentation
presenting act
object of presentation
“apple”
judgment-content
judging act
“the apple over there is
ripe”
evaluating act
emotional act
appraisal
…
“it is good that the apple
over there is ripe”
state of affairs
fact
?
± relational intentionality
mental process
content
target
you see an apple
“apple”
an apple
• you are in physical contact with target
― cf. Russell’s knowledge by acquaintance;
J. J. Gibson’s ecological theory of perception
± perceptually filled
mental process
content
(putative) target
sensory content
object of presentation
presenting act
object
present
object
absent
ordinary perception
object
exists
object does
not exist
perceptually filled does not imply
veridical
mental process
content
(putative) target
sensory content
object of presentation
presenting act
object
present
object
exists
object
absent
hallucination
object does
not exist
the evolutionarily most basic case
mental process
content
presenting act
content of presentation
(putative) target
object of presentation
object
present
object
absent
“apple” + sensation
originating causally at
target
ordinary perception
object
exists
object does
not exist
relational implies veridical
mental process
content
presenting act
content of presentation
(putative) target
object of presentation
object
present
object
absent
“apple” + sensation
originating causally at
target
ordinary perception
object
exists
object does
not exist
veridical does not imply relational
mental process
content
presenting act
content of presentation
(putative) target
object of presentation
“apple”
object
present
object
absent
veridical thinking about
object
exists
object does
not exist
± content match
mental process
content
(putative) target
presenting act
content of presentation
object of presentation
object
present
object
absent
“apple”
object exists
content match
“apple”
content match
“food”
veridical does not imply content match
mental process
content
(putative) target
presenting act
content of presentation
object of presentation
object
present
object
absent
“apple”
object exists
content mismatch
“poison”
content mismatch
“apple”
still poison
content here not just a matter of language
± linguistically mediated
mental process
content
target
you see an apple
“apple”
an apple
A cat can see a king
A cat can see a mass spectrometer
non-veridical intentionality is an untidy
collection of non-canonical cases
mental process
content
presenting act
content of presentation
underlying false belief
“apple”
there is no target
the presenting act is dependent on an
underlying belief or attitude of one or
other deviant types
non-veridical intentionality
type 1. ontological error
mental process
content
presenting act
content of presentation
object
present
object
absent
(putative) target
“apple”
hallucination, deception, …
the presenting act is dependent on a
false underlying belief
non-veridical intentionality
type 2. fiction
mental process
content
presenting act
content of presentation
object
present
object
absent
(putative) target
“apple”
thinking-about-Macbeth = the
presenting act is not dependent on
an underlying false belief
“The Substitution Theory of Art”, Grazer Philosophische Studien, 25/26 (1986)
the primacy of language (Sellars …)
mental experiences are about objects because
words have meaning
word / meaning
68
the primacy of the intentional (Brentano,
Husserl, …):
linguistic expressions have meanings because there
are (‘animating’) mental experiences which have
aboutness
69
dimension of content / belief prior to
dimension of language
language comes later than mental
aboutness
71
How to annotate this?
72
or this?
73
or this?
74
Mental Functioning Ontology (Draft)
with thanks to Janna Hastings and Kevin Mulligan
(Swiss Center for Affective Sciences)
Basic Formal Ontology
BFO:Entity
BFO
BFO:Continuant
BFO:Independent
Continuant
BFO:Dependent
Continuant
BFO:Occurrent
BFO:Process
BFO:Disposition
77
Basic Formal Ontology
and Mental Functioning Ontology (MFO)
BFO:Entity
BFO
BFO:Continuant
BFO:Independent
Continuant
BFO:Occurrent
BFO:Dependent
Continuant
MFO
BFO:Process
Bodily Process
Organism
BFO:Disposition
BFO:Quality
Mental Functioning
Related Anatomical
Structure
Cognitive
Representation
Mental Process
Behaviour
inducing state
Affective
Representation
78
Functions vs. Functionings
Continuants vs. Occurrents
BFO:Entity
BFO
BFO:Continuant
BFO:Independent
Continuant
Organism
BFO:Occurrent
BFO:Dependent
Continuant
MFO
BFO:Process
Bodily Process
BFO:Disposition
BFO:Quality
Cognitive
Representation
Mental Process
Mental
Function
Mental
Functioning
79
Aboutness (‘Intentionality’)
BFO:Entity
BFO
BFO:Continuant
BFO:Independent
Continuant
Organism
BFO:Occurrent
MFO
BFO:Process
BFO:Dependent
Continuant
Bodily Process
BFO:Disposition
BFO:Quality
Cognitive
Representation
Mental
Function
Mental Process
Mental
Functioning
does all mental functioning involve cognitive
representation (aboutness)?
what is aboutness?
80
Extending the MFO
• to linguistic competence and performance
81
Linguistic Functioning Ontology
(1. Speech and hearing)
BFO:Entity
BFO
BFO:Continuant
BFO:Independent
Continuant
BFO:Occurrent
BFO:Dependent
Continuant
MFO
BFO:Process
Bodily Process
BFO:Disposition
BFO:Quality
Linguistic
competence
Behaviour
inducing state
Speech competence of a
population
= a [spoken] language
Speech competence of
an individual
Cognitive
Representation
Speech
process
Speech-mediated
cognitive
representation
Hearing
(registering)
process
82
Linguistic Functioning Ontology
(2. Reading and writing)
BFO:Entity
BFO
BFO:Continuant
BFO:Independent
Continuant
BFO:Occurrent
BFO:Dependent
Continuant
MFO
BFO:Process
Bodily Process
BFO:Disposition
BFO:Quality
Linguistic
competence
Behaviour
inducing state
Written linguistic
competence of a
population
= a [written] language
Written linguistic
competence of an
individual
Cognitive
Representation
Writing
process
Written-language-mediated
cognitive
representation
Reading
(registering)
process
83
Linguistic Functioning Ontology
(the whole thing)
BFO:Entity
BFO
BFO:Continuant
BFO:Independent
Continuant
BFO:Occurrent
BFO:Dependent
Continuant
MFO
BFO:Process
Bodily Process
BFO:Disposition
BFO:Quality
Linguistic
competence
Behaviour
inducing state
Linguistic competence
of a population
= a language
Linguistic competence
of an individual
Cognitive
Representation
Writing
Language-mediated
cognitive
representation
Speaking
Reading
84
non-veridical intentionality
type 3. planning
mental process
content
presenting act
content of presentation
object
present
object
absent
“apple”
Christmas present lists
(putative) target
non-veridical intentionality
type 4. daydreaming
mental process
content
presenting act
content of presentation
object
present
object
absent
“apple”
(putative) target
Mental Functioning Ontology (MF)
brain
endocrine
gland
88
brain
retina
ENVIRONMENT
Aboutness
89
mental act about a
real-world object
relational
(~ perception)
content
match
content
mismatch
veridical
non-relational
(~ linguistic)
content
match
content
mismatch
non-veridical
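The classification tree above can be read as a small decision procedure. The sketch below assumes three yes/no inputs; the function name `classify` and its parameter names are hypothetical, and the rule that a relational act must have an existing target follows the earlier slide "relational implies veridical".

```python
def classify(target_exists: bool, in_contact: bool, content_matches: bool) -> str:
    """Place a mental act in the veridical / relational / content-match tree."""
    if in_contact and not target_exists:
        # Relational implies veridical: no contact with a non-existent target.
        raise ValueError("a relational act requires an existing target")
    if not target_exists:
        return "non-veridical"
    kind = "relational" if in_contact else "non-relational"
    match = "content match" if content_matches else "content mismatch"
    return f"veridical, {kind}, {match}"


# Seeing an apple as an apple: ordinary perception.
seeing = classify(target_exists=True, in_contact=True, content_matches=True)

# Thinking about an existing apple as "poison", without perceptual contact.
thinking = classify(target_exists=True, in_contact=False, content_matches=False)

# Hallucinating an apple: there is no target.
hallucinating = classify(target_exists=False, in_contact=False, content_matches=False)
```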
90
Veridical intentionality
mental process
content
presenting act
content of presentation
(putative) target
object of presentation
“apple”
target
present
target
absent
object
exists
object does
not exist
ordinary perception
evolutionarily most basic case
91
92
what is a language?
something analogous to a biological
species (a population of competences)
BFO:Entity
BFO
BFO:Continuant
BFO:Independent
Continuant
BFO:Occurrent
BFO:Dependent
Continuant
MFO
BFO:Process
Bodily Process
BFO:Disposition
BFO:Quality
Linguistic
competence
Behaviour
inducing state
Linguistic competence
of a population
= a language
Linguistic competence
of an individual
Cognitive
Representation
Writing
Language-mediated
cognitive
representation
Speaking
Reading
93
The Information Artifact
Ontology 3: Dublin Core
Barry Smith
The problem
• Keeping track of data; finding data
• Information artefacts = carriers of
data/information, for example reports
• Data have metadata – date created, author …
• To solve the problem of keeping track of data we
need to address
– 1. what are the data about → data topics
– 2. how the data are packaged (collected, presented,
formatted, stored) → resources, information artifacts
RDF = Resource Description Framework
What is a ‘resource’?
Dublin Core
Elements & Uses
http://dublincore.org/
15 metadata elements for the
description of resources… especially
digital resources.
Jody DeRidder, Digital Libraries IS 565, Spring 2007
1) What’s a “resource”?
• A resource is anything that has identity. Familiar examples include an
electronic document, an image, a service (e.g., "today's weather report for
Los Angeles"), and a collection of other resources.
2) How do “elements” apply to “resources”?
• An Element is a characteristic that a resource may
“have”, such as a Title, Publisher, or Subject.
3) What if I have more than one
version of this resource?
The same resource can be instantiated in different ways
Language: A language of the resource. Recommended best
practice is to use a controlled vocabulary such as ISO 639-2.
Example: “eng” for English.
Date: A date associated with the creation or availability of the
resource. Recommended best practice is defined in a profile
of ISO 8601 that includes (among others) dates of the forms
YYYY and YYYY-MM-DD.
Format: The file format, physical medium, or dimensions of
the resource. Examples of dimensions include size and
duration. Recommended best practice is to use a controlled
vocabulary such as the list of Internet Media Types [MIME].
Example: image/jpeg.
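As an illustration of the Date recommendation, the following sketch checks only the two date forms named above, YYYY and YYYY-MM-DD. The function name `is_dc_date` is hypothetical, and the other forms permitted by the ISO 8601 profile are deliberately not covered.

```python
from datetime import datetime


def is_dc_date(value: str) -> bool:
    """Return True if value has the form YYYY or YYYY-MM-DD and is a real date."""
    for fmt, length in (("%Y", 4), ("%Y-%m-%d", 10)):
        # The length check rejects unpadded input such as "1997-1-4".
        if len(value) == length:
            try:
                datetime.strptime(value, fmt)
                return True
            except ValueError:
                return False
    return False


for candidate in ("1997", "1997-12-04", "1997-13-04", "04/12/1997"):
    print(candidate, is_dc_date(candidate))
```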
http://www.referenttracking.com/RTU/?page=ceusters_vita
http://www.referenttracking.com/RTU/?page=ceusters_vita
2 seconds later
what describes the content / topic / subject matter?
Title: The name given to the resource.
Description: An account of the content of the resource.
Description may include but is not limited to: an abstract,
table of contents, reference to a graphical representation of
content or a free-text account of the content.
Subject: The topic of the content of the resource. Typically, a
subject will be expressed as keywords or key phrases or
classification codes that describe the topic of the resource.
Source: A reference to a resource from which the present
resource is derived. The present resource may be derived
from the Source resource in whole or part.
Type: The nature or genre of the content of the resource.
Type includes terms describing general categories,
functions, genres, or aggregation levels for content.
what describes who made it?
Creator: An entity primarily responsible for making the
content of the resource. Examples of a Creator include a
person, an organization, or a service.
Contributor: An entity responsible for making contributions to
the content of the resource. Examples of a Contributor include
a person, an organization or a service. Typically, the name of a
Contributor should be used to indicate the entity.
Publisher: The entity responsible for making the resource
available. Examples of a Publisher include a person, an
organization, or a service. Typically, the name of a Publisher
should be used to indicate the entity.
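A Simple Dublin Core description can be written down as a flat set of element/value pairs. The sketch below serializes a few elements as XML with Python's standard library, using the Dublin Core element namespace `http://purl.org/dc/elements/1.1/`. The wrapper element name `record`, the helper `make_record`, and the sample values for date and format are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def make_record(fields: dict) -> ET.Element:
    """Build a hypothetical <record> wrapper with one dc:* child per element."""
    record = ET.Element("record")
    for name, value in fields.items():
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


record = make_record({
    "title": "The Information Artifact Ontology 1: Roots in BFO",
    "creator": "Barry Smith",
    "language": "eng",            # ISO 639-2 code, as recommended above
    "date": "2013",               # illustrative value in the YYYY form
    "format": "application/pdf",  # illustrative Internet Media Type
})

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Each child element answers one of the questions on the slides: title says what the resource is called, creator says who made it, and language, date and format describe how it is instantiated.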
All 15 elements of Simple Dublin Core
Content: Title, Description, Coverage, Relation, Source, Subject, Type
Intellectual Property: Contributor, Creator, Publisher, Rights
Instantiation: Date, Format, Identifier, Language
Some example qualifiers…
Type of Qualifier / Element / Example Qualifiers
Element Refinement
  Description: Abstract, tableOfContents
  Coverage: Spatial, Temporal
  Date: Available, Created, dateCopyrighted, dateAccepted, dateSubmitted
  Relation: hasPart, hasVersion, isPartOf, isReferencedBy, isReplacedBy, isVersionOf
Encoding Schemes
  Subject: DDC (Dewey Decimal Classification), LCC (Library of Congress Classification), LCSH (Library of Congress Subject Headings), MESH (Medical Subject Headings)…
  Language: ISO639-2 (such as eng, for English), RFC1766 (such as en-us for US English)
  Date: W3CDTF (such as 1997-12-04 for 4 Dec. 1997)
  Type: DCMIType, such as: Collection, Dataset, Event, Image, InteractiveResource, MovingImage, PhysicalObject, Service, Software, Sound, StillImage, Text.
Example online representation
• http://dublincore.org/documents/2012/06/14/dcmi-terms/#terms-abstract
Dublin Core (~
OWL)
Properties
Terms
Classes
BFO
Relations
Types
Instances
from Bill Mandrick
Geographic
Coordinates
Set
instance_of
has
location
designates
Spatial
Region
Distance
Measurement
Result
Geopolitical
Entity
has
location
is_a
Village
Name
designates
Village
Well
Latrine
instance_of
instance_of
instance_of
instance_of
instance_of
’16 meters’
‘VT 334 569’
measurement_of
located
near
‘Khanabad
Village’
located
in