The Information Artifact Ontology 1: Roots in BFO
Barry Smith

Standardized information artifact metadata: 1. hardware

IAO Clinical
Diagram labels: Continuant; Independent Continuant; Quality; Occurrent; Dependent Continuant; Realizable Dependent Continuant; Disposition; Role; Process

Types and Instances
Diagram nodes: Geographic Coordinates Set; Spatial Region; Distance Measurement Result; Geopolitical Entity; Village; Well; Latrine; Village Name; ’16 meters’; ‘VT 334 569’; ‘Khanabad Village’
Diagram relations: designates; instance_of; has location; is_a; measurement_of; located near; located in

Diagram labels: Continuant; Independent Continuant; Quality; Disposition; Specifically Dependent Continuant; Realizable Dependent Continuant; Generically Dependent Continuant; Gene Sequence; Information Artifact; Role

Specifically Dependent Continuants
Specifically Dependent Continuant: if any bearer ceases to exist, then the quality or function ceases to exist
• Quality, Pattern: the color of my skin
• Realizable Dependent Continuant: the function of my heart

Generically Dependent Continuants
Generically Dependent Continuant: if one bearer ceases to exist, then the entity can survive, because there are other bearers (copyability)
• Information Object: the pdf file on my laptop
• Sequence: the DNA (sequence) in this chromosome

Information artifacts
• pdf file
• email
• poem
• symphony
• algorithm
• symbol
These can migrate from one information bearer to another

Diagram labels: Continuant; Independent Continuant; Material Entity; Specifically Dependent Continuant; Quality; Generically Dependent Continuant; Gene Sequence; Information Artifact; Information Bearing Entity

Diagram labels: Continuant; Independent Continuant; Material Entity; Information Bearing Entity; Specifically Dependent Continuant; Quality; Information Quality Entity; Generically Dependent Continuant; Information Artifact
Diagram relations: depends_on; concretized_by

http://bioportal.bioontology.org/ontologies/IAO

IAO: information content entity =def.
an entity that is generically dependent on some artifact and stands in the relation of aboutness to some entity
Problems
• Is a work of fiction about something?
• Is a fake cover story for a fake terrorist about something?
• Is an erroneous entry in a database about something?

Generically dependent continuants such as plans, laws … are concretized in specifically dependent continuants (the plan in your head, the protocol being realized by your research team, the law being implemented by this government agency)

War and Peace is an instance
Diagram labels: Specifically Dependent Continuant; Independent Continuant; This bound copy of War and Peace; War and Peace; quality; Generically Dependent Continuant; The novel War and Peace
Diagram relations: instance_of; depends_on

Instances vs Copies
• The novel War and Peace has many bound copies
• The quality spherical has many instances
• But having copies and having instances are two different things
• Information entities exist in a way which makes them dependent on provenance, and on processors, in a way in which types are not

What is a work of literature?
Is War and Peace a type or an instance?
• If War and Peace were a type, and the copies of War and Peace in my library and in your library were instances, then there would be many War(s) and Peaces. Hence War and Peace is an instance.

There are not two Declarations of Independence
• There can be two copies of the US Declaration of Independence
• There cannot be two US Declarations of Independence
• There cannot be subtypes of the US Declaration of Independence
• Hence the US Declaration of Independence is an instance and not a type.

Rule for types
Their names are pluralizable
• There can be three people
• There cannot be three Michelle Obamas.
Information Content Entities are GDCs = entities which can exist in many copies

Generically dependent continuants are distinct from types
they have a different kind of provenance
◦ Aspirin as product of Bayer GmbH
◦ aspirin as molecular structure
◦ This Financial Report is submitted to the SEC

Information content entity
Column headings (five columns, garbled in extraction): prior intent to be directed directedness rulefor denotation governed communication as part ness informs directedness
soldier          no   yes  yes  no   yes
normal science   yes  yes  yes  yes  yes
doodle           no   no   no   no   no
fake message     no   yes  yes  no   yes
geez louise      yes  no   no   yes  no
googoogoo        no   no   no   yes  no

Passport vs. Boarding Pass
• Your passport can be copied, but a copy of your passport cannot be your passport
• Your boarding pass can be copied, and a copy of your boarding pass can be your boarding pass

Terminology of types and tokens, vs. terminology of types and instances

Generically dependent continuants are concretized in specifically dependent continuants
Beethoven’s 9th Symphony is concretized in the pattern of ink marks which make up this score in my hand. This pattern is an information quality entity: a BFO:quality of the material (information bearing entity) that is the score

Generically dependent continuants (GDCs) can be concretized in multiple different media (paper, silicon, neuron …)

Information Content Entity (science)
• protocol
• database
• theory
• ontology
• gene list
• publication
• result
• ...

Information Content Entity (labeling)
• serial number
• batch number
• grant number
• person number
• name
• address
• email address
• URL
• ...

Information Content Entity (Finance)
• Financial Report
• Financial Report in XBRL for submission to GAAP
• Business Report

Type or instance
Continuant
  Independent Continuant: human being, protocol document
  Dependent Continuant: pattern of ink marks
Occurrent (Process): Applying the protocol, Side-Effect …
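
The contrast drawn above between generic and specific dependence (a GDC survives the loss of one bearer as long as some other bearer concretizes it) can be sketched in a few lines of Python. This is only an illustrative toy model: the class names `InformationBearingEntity` and `InformationContentEntity` and the methods `concretize_in`, `lose_bearer` and `exists` are hypothetical and are not taken from IAO or BFO.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False keeps identity-based hashing, so bearers can live in a set
class InformationBearingEntity:
    """A material bearer, e.g. a bound copy, a paper score, a hard drive."""
    label: str


@dataclass(eq=False)
class InformationContentEntity:
    """A generically dependent continuant (hypothetical toy model, not IAO code)."""
    name: str
    bearers: set = field(default_factory=set)

    def concretize_in(self, bearer: InformationBearingEntity) -> None:
        self.bearers.add(bearer)

    def lose_bearer(self, bearer: InformationBearingEntity) -> None:
        self.bearers.discard(bearer)

    def exists(self) -> bool:
        # Generic dependence: some bearer is needed, but no particular one.
        return bool(self.bearers)


symphony = InformationContentEntity("Beethoven's 9th Symphony")
paper_score = InformationBearingEntity("this score in my hand")
disk_copy = InformationBearingEntity("a file on silicon")
symphony.concretize_in(paper_score)
symphony.concretize_in(disk_copy)

symphony.lose_bearer(paper_score)
survives_one_loss = symphony.exists()    # still borne by disk_copy
symphony.lose_bearer(disk_copy)
survives_all_losses = symphony.exists()  # no bearer left
```

A specifically dependent continuant (the pattern of ink marks on one particular score) would instead be modeled with exactly one bearer and would cease to exist with it.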
Diagram labels: Continuant; Independent Continuant; Occurrent; Dependent Continuant; Information Content Entity; Action; creating a datum

Generically dependent continuants do not require specific media (paper, silicon, neuron …)

Generically Dependent Continuants
Diagram labels: Generically Dependent Continuant; Information Content Entity; Gene Sequence; .pdf file; .doc file; instances

Generically dependent continuants are concretized in specifically dependent continuants
Beethoven’s 9th Symphony is concretized in the pattern of ink marks which make up this score in my hand

Purpose of an Information Artifact
Descriptive purpose =def. the purpose of describing some portion of reality
Examples: scientific paper, newspaper article, diary, experimenter log notebook
Prescriptive purpose =def. the purpose of prescribing or permitting or allowing some activity
Examples: a legal code, a license

Purpose of an Information Artifact
Directive purpose =def. the purpose of specifying a plan or method for achieving something
Examples: instruction, manual, recipe, protocol
Designative purpose =def. the purpose of uniquely designating some entity or the members of some class of entities
Examples: a registry of members of an organization, a phone book, a database linking proper names of persons with their social security numbers.
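
The email ontology steps on the following slides (message has_part header section and body section; header fields with a name and a body; MIME-structured body with text and HTML versions; addressees in to:, cc: and bcc: fields) can be illustrated with Python's standard `email` package. This is a rough sketch, not an ontology implementation: the sample message, the addresses, and the function name `describe` are made up for illustration, and the dictionary keys merely echo the slide wording.

```python
from email import message_from_string
from email.policy import default

# Hypothetical sample message with a text version and an HTML version.
RAW = """\
From: alice@example.org
To: bob@example.org
Cc: carol@example.org
Subject: Meeting notes
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="XYZ"

--XYZ
Content-Type: text/plain; charset="utf-8"

Plain text version.
--XYZ
Content-Type: text/html; charset="utf-8"

<p>HTML version.</p>
--XYZ--
"""


def describe(raw: str) -> dict:
    """Split a raw message into the parts named on the slides."""
    msg = message_from_string(raw, policy=default)
    return {
        # header section has_part a collection of header fields (name, body)
        "header_section": [(name, str(value)) for name, value in msg.items()],
        # body section: content types of the leaf MIME parts
        "body_section": [
            part.get_content_type()
            for part in msg.walk()
            if not part.is_multipart()
        ],
        # addressee may be in the to:, cc: or bcc: field
        "addressees": {
            name: [str(v) for v in msg.get_all(name, [])]
            for name in ("To", "Cc", "Bcc")
        },
    }


parts = describe(RAW)
```

Status properties such as draft, sent, read or unread are not part of the message text itself; they would have to be recorded separately by the mail system.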
Steps towards an email ontology
• message has_part header section and body section
• header section has_part a collection of header fields
• header field contains a header name and a header body
• header body may have additional structure based on the header in question
• body may have nested structure and attachments based on MIME
• the body may contain a text version, an HTML version, or both
• the body may contain attachments (files such as images, documents, other emails, etc.)
• header fields may use MIME to include header information in other languages/charsets

Steps towards an email ontology
• email may have_status draft, sent
• email addressee may be in to: field, cc: field, bcc: field
• email may be forwarded
• email may be read, unread
• email may have priority label
• …

E-mail Header

Email Address Field
• A field is an information structure entity (comparable to cell, margin, space between words, period, comma, etc.). This means it is not about anything.
• Nearly all information content entities have fields as parts.
• Address field is an information content entity which has a field as part.
• But address field is about (in some very attenuated sense) the type: address.
• Similarly the field in a spreadsheet where you fill in the measurement unit used is an ICE, because it is about (in this same attenuated sense) the type: measurement unit.
• When you fill in the actual address then the resultant field is an ICE which is about that actual address.
BS

Information Artifact Ontology 2: Aboutness

Shimon Edelman’s Riddle of Representation
two humans, a monkey, and a robot are looking at a piece of cheese; what is common to the representational processes in their visual systems?
Answer: The cheese, of course

The real cheese

the arrow of intentionality

± simple
mental process | content | (putative) target
presenting act | content of presentation: “apple” | object of presentation
judging act | judgment-content: “the apple over there is ripe” | state of affairs, fact
evaluating act, emotional act | appraisal …: “it is good that the apple over there is ripe” | ?

± relational intentionality
mental process | content | target
you see an apple | “apple” | an apple
• you are in physical contact with target (cf. Russell’s knowledge by acquaintance; J. J. Gibson’s ecological theory of perception)

± perceptually filled
Diagram labels: mental process; content; (putative) target; sensory content; object of presentation; presenting act; object present; object absent; ordinary perception; object exists; object does not exist

perceptually filled does not imply veridical
Diagram labels: mental process; content; (putative) target; sensory content; object of presentation; presenting act; object present; object exists; object absent; hallucination; object does not exist

the evolutionarily most basic case
Diagram labels: mental process; content; presenting act; content of presentation; (putative) target; object of presentation; object present; object absent; “apple” + sensation originating causally at target; ordinary perception; object exists; object does not exist

relational implies veridical
Diagram labels: mental process; content; presenting act; content of presentation; (putative) target; object of presentation; object present; object absent; “apple” + sensation originating causally at target; ordinary perception; object exists; object does not exist

veridical does not imply relational
Diagram labels: mental process; content; presenting act; content of presentation; (putative) target; object of presentation; “apple”; object present; object absent; veridical thinking about; object exists; object does not exist

± content match
Diagram labels: mental process; content; (putative) target; presenting act; content of presentation; object of presentation; object present; object absent; “apple”; object exists
content match “apple”
content match “food”

veridical does not imply content match
Diagram labels: mental process; content; (putative) target; presenting act; content of presentation; object of presentation; object present; object absent; “apple”; object exists
content mismatch “poison”
content mismatch “apple” still poison

content here not just a matter of language

± linguistically mediated
mental process | content | target
you see an apple | “apple” | an apple
• A cat can see a king
• A cat can see a mass spectrometer

non-veridical intentionality is an untidy collection of non-canonical cases
Diagram labels: mental process; content; presenting act; content of presentation; underlying false belief; “apple”; there is no target
the presenting act is dependent on an underlying belief or attitude of one or other deviant types

non-veridical intentionality type 1. ontological error
Diagram labels: mental process; content; presenting act; content of presentation; object present; object absent; (putative) target; “apple”
hallucination, deception, …
the presenting act is dependent on a false underlying belief

non-veridical intentionality type 2. fiction
Diagram labels: mental process; content; presenting act; content of presentation; object present; object absent; (putative) target; “apple”
thinking-about-Macbeth = the presenting act is not dependent on an underlying false belief
“The Substitution Theory of Art”, Grazer Philosophische Studien, 25/26 (1986)

the primacy of language (Sellars …)
mental experiences are about objects because words have meaning
word / meaning

the primacy of the intentional (Brentano, Husserl, …): linguistic expressions have meanings because there are (‘animating’) mental experiences which have aboutness

dimension of content / belief prior to dimension of language
language comes later than mental aboutness

How to annotate this

or this?

or this?
Mental Functioning Ontology (Draft)
with thanks to Janna Hastings and Kevin Mulligan (Swiss Center for Affective Sciences)

Basic Formal Ontology
Diagram labels: BFO:Entity; BFO; BFO:Continuant; BFO:Independent Continuant; BFO:Dependent Continuant; BFO:Occurrent; BFO:Process; BFO:Disposition

Basic Formal Ontology and Mental Functioning Ontology (MFO)
Diagram labels: BFO:Entity; BFO; BFO:Continuant; BFO:Independent Continuant; BFO:Occurrent; BFO:Dependent Continuant; MFO; BFO:Process; Bodily Process; Organism; BFO:Disposition; BFO:Quality; Mental Functioning Related Anatomical Structure; Cognitive Representation; Mental Process; Behaviour inducing state; Affective Representation

Functions vs. Functionings
Continuants vs. Occurrents
Diagram labels: BFO:Entity; BFO; BFO:Continuant; BFO:Independent Continuant; Organism; BFO:Occurrent; BFO:Dependent Continuant; MFO; BFO:Process; Bodily Process; BFO:Disposition; BFO:Quality; Cognitive Representation; Mental Process; Mental Function; Mental Functioning

Aboutness (‘Intentionality’)
Diagram labels: BFO:Entity; BFO; BFO:Continuant; BFO:Independent Continuant; Organism; BFO:Occurrent; MFO; BFO:Process; BFO:Dependent Continuant; Bodily Process; BFO:Disposition; BFO:Quality; Cognitive Representation; Mental Function; Mental Process; Mental Functioning
• does all mental functioning involve cognitive representation (aboutness)?
• what is aboutness?

Extending the MFO
• to linguistic competence and performance

Linguistic Functioning Ontology (1. Speech and hearing)
Diagram labels: BFO:Entity; BFO; BFO:Continuant; BFO:Independent Continuant; BFO:Occurrent; BFO:Dependent Continuant; MFO; BFO:Process; Bodily Process; BFO:Disposition; BFO:Quality; Linguistic competence; Behaviour inducing state; Speech competence of a population = a [spoken] language; Speech competence of an individual; Cognitive Representation; Speech process; Speech-mediated cognitive representation; Hearing (registering) process

Linguistic Functioning Ontology (2.
Reading and writing)
Diagram labels: BFO:Entity; BFO; BFO:Continuant; BFO:Independent Continuant; BFO:Occurrent; BFO:Dependent Continuant; MFO; BFO:Process; Bodily Process; BFO:Disposition; BFO:Quality; Linguistic competence; Behaviour inducing state; Written linguistic competence of a population = a [written] language; Written linguistic competence of an individual; Cognitive Representation; Writing process; Written-language-mediated cognitive representation; Reading (registering) process

Linguistic Functioning Ontology (the whole thing)
Diagram labels: BFO:Entity; BFO; BFO:Continuant; BFO:Independent Continuant; BFO:Occurrent; BFO:Dependent Continuant; MFO; BFO:Process; Bodily Process; BFO:Disposition; BFO:Quality; Linguistic competence; Behaviour inducing state; Linguistic competence of a population = a language; Linguistic competence of an individual; Cognitive Representation; Writing; Language-mediated cognitive representation; Speaking; Reading

non-veridical intentionality type 3. planning
Diagram labels: mental process; content; presenting act; content of presentation; object present; object absent; “apple”; (putative) target
Christmas present lists

non-veridical intentionality type 4. daydreaming
Diagram labels: mental process; content; presenting act; content of presentation; object present; object absent; “apple”; (putative) target

Mental Functioning Ontology (MF)
brainin endocrine gland

Diagram labels: brain; retina; ENVIRONMENT; Aboutness

Diagram labels: mental act about a real-world object; relational (~ perception); content match; content mismatch; veridical; non-relational (~ linguistic); content match; content mismatch; non-veridical

Veridical intentionality
Diagram labels: mental process; content; presenting act; content of presentation; (putative) target; object of presentation; “apple”; target present; target absent; object exists; object does not exist; ordinary perception; evolutionarily most basic case

what is a language?
something analogous to a biological species (a population of competences)
Diagram labels: BFO:Entity; BFO; BFO:Continuant; BFO:Independent Continuant; BFO:Occurrent; BFO:Dependent Continuant; MFO; BFO:Process; Bodily Process; BFO:Disposition; BFO:Quality; Linguistic competence; Behaviour inducing state; Linguistic competence of a population = a language; Linguistic competence of an individual; Cognitive Representation; Writing; Language-mediated cognitive representation; Speaking; Reading

The Information Artifact Ontology 3: Dublin Core
Barry Smith

The problem
• Keeping track of data; finding data
• Information artefacts = carriers of data/information, for example reports
• Data have metadata: date created, author …
• To solve the problem of keeping track of data we need to address
  1. what the data are about: data topics
  2. how the data are packaged (collected, presented, formatted, stored): resources, information artifacts

RDF = Resource Description Framework
What is a ‘resource’?

Dublin Core Elements & Uses
http://dublincore.org/
15 metadata elements for the description of resources… especially digital resources.
Jody DeRidder, Digital Libraries IS 565, Spring 2007

1) What’s a “resource”?
A resource is anything that has identity. Familiar examples include an electronic document, an image, a service (e.g., "today's weather report for Los Angeles"), and a collection of other resources.

2) How do “elements” apply to “resources”?
An Element is a characteristic that a resource may “have”, such as a Title, Publisher, or Subject.

3) What if I have more than one version of this resource?
The same resource can be instantiated in different ways

Language: A language of the resource. Recommended best practice is to use a controlled vocabulary such as ISO 639-2. Example: “eng” for English.
Date: A date associated with the creation or availability of the resource. Recommended best practice is defined in a profile of ISO 8601 that includes (among others) dates of the forms YYYY and YYYY-MM-DD.
Format: The file format, physical medium, or dimensions of the resource. Examples of dimensions include size and duration. Recommended best practice is to use a controlled vocabulary such as the list of Internet Media Types [MIME]. Example: image/jpeg.

http://www.referenttracking.com/RTU/?page=ceusters_vita

http://www.referenttracking.com/RTU/?page=ceusters_vita
2 seconds later

what describes the content / topic / subject matter?
Title: The name given to the resource.
Description: An account of the content of the resource. Description may include but is not limited to: an abstract, table of contents, reference to a graphical representation of content or a free-text account of the content.
Subject: The topic of the content of the resource. Typically, a subject will be expressed as keywords or key phrases or classification codes that describe the topic of the resource.
Source: A reference to a resource from which the present resource is derived. The present resource may be derived from the Source resource in whole or part.
Type: The nature or genre of the content of the resource. Type includes terms describing general categories, functions, genres, or aggregation levels for content.

what describes who made it?
Creator: An entity primarily responsible for making the content of the resource. Examples of a Creator include a person, an organization, or a service.
Contributor: An entity responsible for making contributions to the content of the resource. Examples of a Contributor include a person, an organization or a service. Typically, the name of a Contributor should be used to indicate the entity.
Publisher: The entity responsible for making the resource available. Examples of a Publisher include a person, an organization, or a service. Typically, the name of a Publisher should be used to indicate the entity.
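
The element definitions above can be made concrete with a small record check in Python. This is a minimal sketch under stated assumptions: the record is a plain dictionary keyed by lowercase element names, the helper names `valid_dc_date` and `check_record` are hypothetical, and the date check accepts only the two forms named in the Date definition (YYYY and YYYY-MM-DD), although the ISO 8601 profile allows others. The title and creator values are invented for illustration.

```python
import re
from datetime import date

# The 15 elements of Simple Dublin Core, as lowercase keys.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def valid_dc_date(value: str) -> bool:
    """Accept YYYY or YYYY-MM-DD only (a simplification of the profile)."""
    if re.fullmatch(r"\d{4}", value):
        return True
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        try:
            date.fromisoformat(value)  # rejects impossible dates such as 1997-13-40
        except ValueError:
            return False
        return True
    return False


def check_record(record: dict) -> list:
    """Return a list of problems found in a Dublin Core style record."""
    problems = [
        f"unknown element: {name}" for name in record if name not in DC_ELEMENTS
    ]
    if "date" in record and not valid_dc_date(record["date"]):
        problems.append(f"unrecognized date form: {record['date']}")
    return problems


# Hypothetical record; "eng" and "image/jpeg" are the examples given above.
record = {
    "title": "Example report",
    "creator": "Example Organization",
    "date": "1997-12-04",
    "language": "eng",
    "format": "image/jpeg",
}
problems = check_record(record)
```

Controlled vocabularies such as ISO 639-2 for Language or Internet Media Types for Format could be checked the same way, given a list of permitted values.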
All 15 elements of Simple Dublin Core
Instantiation: Date, Format, Identifier, Language
Content: Title, Relation, Description, Coverage, Source, Subject, Type
Intellectual Property: Contributor, Creator, Publisher, Rights

Some example qualifiers…
Type of Qualifier: Element Refinement
  Element       Example Qualifiers
  Description   Abstract, tableOfContents
  Coverage      Spatial, Temporal
  Date          Available, Created, dateCopyrighted, dateAccepted, dateSubmitted
  Relation      hasPart, hasVersion, isPartOf, isReferencedBy, isReplacedBy, isVersionOf
Type of Qualifier: Encoding Schemes
  Element       Example Qualifiers
  Subject       DDC (Dewey Decimal Classification), LCC (Library of Congress Classification), LCSH (Library of Congress Subject Headings), MESH (Medical Subject Headings)…
  Language      ISO639-2 (such as eng, for English), RFC1766 (such as en-us for US English)
  Date          W3CDTF (such as 1997-12-04 for 4 Dec. 1997)
  Type          DCMIType, such as: Collection, Dataset, Event, Image, InteractiveResource, MovingImage, PhysicalObject, Service, Software, Sound, StillImage, Text.

Example online representation
• http://dublincore.org/documents/2012/06/14/dcmi-terms/#terms-abstract

Dublin Core (~ OWL): Properties, Terms, Classes
BFO: Relations, Types, Instances

from Bill Mandrick
Diagram nodes: Geographic Coordinates Set; Spatial Region; Distance Measurement Result; Geopolitical Entity; Village Name; Village; Well; Latrine; ’16 meters’; ‘VT 334 569’; ‘Khanabad Village’
Diagram relations: instance_of; has location; designates; is_a; measurement_of; located near; located in