Introduction to Ontology Development and Tools
Part I: First Steps in Ontology Development
ICBO 2011, Buffalo, July 26, 2011
Mathias Brochhausen (1) & Amanda Hicks (2)
(1) UAMS, Little Rock, AR
(2) University at Buffalo, Buffalo, NY

Contents
I. Introduction
II. Terminological considerations
III. Creating the hierarchy of classes
IV. Creating relations
V. (Instances)
VI. Creating restrictions on classes

I. Introduction
What is Protégé?
We are concerned here with the “Protégé-OWL editor”.
“The Protégé-OWL editor enables users to build ontologies for the Semantic Web, in particular in the W3C's Web Ontology Language (OWL)”.

Keep in mind that Protégé is...
...not a programming language.
...only a tool. It will not prevent you from making mistakes.
Note that this and the following hands-on exercise refer to Protégé 4.1.

II. Terminological considerations
a. OWL/Protégé terminology
b. Preferable terminology

II.a OWL/Protégé terminology
Classes: “OWL classes are interpreted as sets.”
Primitive Classes: “Classes that only have necessary conditions”. Examples: Animal, Cat…
Defined Classes: “A class that has at least one necessary and sufficient condition”. Example: All things that are biped and lack feathers.

Properties: “binary relation between individuals”
Object Properties. Examples: is_part_of, is_kissing
Datatype Properties. Examples: has_DateValue, has_StringValue

Individuals: “represent objects in the domain we are interested in”. Examples: me, the Pentagon in Washington, Jane Doe's lung.
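The vocabulary of II.a can be made concrete with a small set-based model. The following is only an illustrative Python sketch, not OWL syntax or Protégé output. The individual names (felix, tweety, plato) and the properties has_leg_count and has_feathers are hypothetical. Classes are modelled as sets of individuals, an object property as a set of ordered pairs, and a defined class as the set of everything that meets its necessary and sufficient condition.

```python
# Individuals: plain strings standing in for objects in the domain (hypothetical names).
individuals = {"felix", "tweety", "plato"}

# Primitive classes, interpreted as sets of individuals.
animal = {"felix", "tweety", "plato"}
cat = {"felix"}

# Object property: a binary relation between individuals, stored as ordered pairs.
is_kissing = {("plato", "felix")}

# Datatype properties: relate an individual to a data value (dicts used for brevity).
has_leg_count = {"felix": 4, "tweety": 2, "plato": 2}
has_feathers = {"felix": False, "tweety": True, "plato": False}


def featherless_biped(universe):
    """Defined class: everything that is biped and lacks feathers."""
    return {x for x in universe if has_leg_count[x] == 2 and not has_feathers[x]}


defined = featherless_biped(individuals)
print(sorted(defined))   # ['plato']
print(cat <= animal)     # True: every cat in this toy model is an animal
```

Membership in the primitive classes is simply asserted, whereas membership in the defined class is computed from the condition. That is the practical difference between the two kinds of class.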
II.b Preferable terminology
a) Instance: An individual or particular which instantiates a universal. Examples: me, the Pentagon in Washington, Jane Doe's lung.
b) Universal: Something that is shared in common by all those particulars which are its instances. Examples: Animal, Human being, Building, Human lung.
c) Attributive Collection: A collection (a set) of particulars which share a common property. Examples: All patients suffering from breast cancer in the Mayo Clinic in 2009, all humans who have tested positive for HIV, all bacteria in the petri dish over there.
d) Relation: Relations hold between universals and universals, between universals and particulars, and between particulars and particulars. Examples: is_subtype_of, is_part_of, is_instance_of, is_kissing.

However,...
Throughout the presentation I will say “classes” to keep the creation of the hierarchy free from possible ontological issues about universals vs. attributive collections.

III. Creating the hierarchy of classes
a) Why work with an Upper Ontology?
   Importing another ontology (e.g. an Upper Ontology)
   Basic Formal Ontology: a primer
b) What is represented by the hierarchy?
c) How to create a class hierarchy?
d) Changing OWL class names
e) Disjointness
f) Exhaustiveness
g) Rigidity

III.a Why work with an Upper Ontology
Upper Ontologies…
...support consistency.
...foster harmonization and modularization.
...help you to get into the right frame of mind.

Some Upper Ontologies
Basic Formal Ontology (BFO)
Descriptive Ontology for Linguistic and Cognitive Engineering (DOLCE)
Suggested Upper Merged Ontology (SUMO)

How do we import Upper Ontologies (or other pre-existing ontologies) into our ontology project?

Importing another ontology (e.g. an Upper Ontology)

BFO - Entities
Continuant - a heart, the color of a tomato, the mass of a cloud
Occurrent - the life of an organism, a surgical operation, a conversation

BFO - Continuants
Dependent Continuant - the color of a tomato, the mass of a cloud
Independent Continuant - a heart, a chair, the Northern Hemisphere of the Earth
Spatial Region - regions of zero to three dimensions

BFO - Dependent Continuants
Generically Dependent Continuant - a PDF file, a musical score
Specifically Dependent Continuant - the color of a tomato, the disposition of fish to decay, the role of being a doctor

BFO - Specifically Dependent Continuants
Quality - the color of a tomato, the mass of a cloud
Realizable Entity
  Disposition - fragility, solubility
  Function - of the heart, to pump blood; of a computer, to compute
  Role - being a doctor, being a pet, being a student

BFO - Independent Continuants
Material Entity - a heart, a table, a collection of stones
Object Boundary - end points of a line, the surface of the skin
Site - a particular room, Maria's nostril

BFO - Material Entities
Fiat Object Part - the upper lobe of the left lung, the Northern Hemisphere
Object - a heart, a chair, a lung, an apple
Object Aggregate - a collection of bacteria, a collection of stones

BFO - Occurrents
Processual Entity - a conversation, the life of an organism
Spatiotemporal Region - any part of space-time
Temporal Region - any interval of time

BFO - Processual Entities
Fiat Process Part - the worst part of a rainstorm, the middle of a meal
Process - sleeping, cell division
Process Aggregate - chewing gum and walking at the same time
Process Boundary - death
Processual Context - a war is the context of battles

BFO - Spatiotemporal Regions
Connected Spatiotemporal Region - the spatiotemporal region of the life of an organism
  Spatiotemporal Instant - the spatiotemporal location of an instant of an organism's life
  Spatiotemporal Interval - the spatiotemporal region of the first year of an infant's life
Scattered Spatiotemporal Region - the spatiotemporal region occupied by all games of the World Cup
Temporal Regions have a parallel hierarchy.

III.b What is represented by the hierarchy?
Chordate
  Mammal
    Dog
      Labrador

is_a relation
The Protégé OWL Editor is based around an is_a hierarchy of the entities in a given domain.
Start by thinking about your domain and its is_a hierarchy, and draw a sketch.
Make sure to use formal is_a relations (subclass relations) exclusively to build your hierarchy.

Incorrect usage of is_a, Example 1
Travel
  Car Rental
  Cruise Liner
  Flight

Incorrect usage of is_a, Example 2
Organism
  Bacteria
  Animal
  Other Organism Groupings

Subclass Relation (in OWL): A ⊆ B ⇔ (∀x ∈ A) x ∈ B
Proper Subclass (in Set Theory): A ⊂ B ⇔ A ⊆ B ∧ A ≠ B

III.c How to create a class hierarchy?
A shortcut to create a hierarchy from a list
You might find yourself in a situation where you have a list of terms referring to classes to be represented in your ontology.
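The idea of turning a term list into an is_a hierarchy can be sketched in a few lines of Python. This is a minimal illustration under one assumption: the list is written one class per line, with tab indentation marking subclass depth. The helpers parse_hierarchy and is_subclass_of are hypothetical names and are not part of Protégé. The input reuses the Chordate example from III.b.

```python
def parse_hierarchy(text, indent="\t"):
    """Return {term: parent} from an indented term list (parent is None for roots)."""
    parents = {}
    stack = []  # stack[d] holds the most recent term seen at depth d
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        stripped = line.lstrip(indent)
        depth = len(line) - len(stripped)  # number of leading indent characters
        term = stripped.strip()
        del stack[depth:]  # drop terms at this depth or deeper
        parents[term] = stack[-1] if stack else None
        stack.append(term)
    return parents


def is_subclass_of(parents, sub, sup):
    """Follow is_a links upward; every class counts as a subclass of itself (A ⊆ A)."""
    current = sub
    while current is not None:
        if current == sup:
            return True
        current = parents.get(current)
    return False


terms = "Chordate\n\tMammal\n\t\tDog\n\t\t\tLabrador"
hierarchy = parse_hierarchy(terms)
print(hierarchy["Labrador"])                               # Dog
print(is_subclass_of(hierarchy, "Labrador", "Chordate"))   # True
print(is_subclass_of(hierarchy, "Mammal", "Dog"))          # False
```

Because the subclass relation is read as set inclusion, it is transitive. This is why Labrador counts as a subclass of Chordate even though only the direct parent is stored.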
III.d Changing OWL class names
Basically, there are two things that we need to keep separate:
URI/IRI
Class label

URI/IRI (Uniform Resource Identifier/Internationalized Resource Identifier)
The full URI consists of a locator and a name (e.g. the class name).
Every URI is unique (so no two classes with the same URI are allowed within one ontology).
Every class has exactly one URI.

rdfs:label
String value
It is possible (though not advisable) to annotate two classes in one ontology with the same label.
One class can have any number of labels (synonyms, terms in multiple languages).

III.e Disjointness
No member of A is a member of B.
A ∩ B = ∅ ⇔ (∀x) (x ∈ A ⇒ x ∉ B)
For universals or types: No instance of A is an instance of B.
When building a taxonomy in OWL, disjointness needs to be explicitly stated (Open World Assumption)!

III.f Exhaustiveness
At times we want to express that a certain level of a hierarchy is exhaustive, viz. that every member of the superclass is a member of at least one of the represented subclasses.
B & C are exhaustive of A ⇔ (∀x) (x ∈ A ⇒ (x ∈ B ∨ x ∈ C))
When building a taxonomy in OWL, exhaustiveness needs to be explicitly stated (Open World Assumption)!

III.g Rigidity
A universal is rigid if it is essential to its instances.
An essential universal is one that necessarily holds for all of its instances.
“Cat” is rigid; “Pet” is not.

Types vs. Roles
Types are rigid sortals.
Roles are non-rigid sortals.
Sortals describe what sort of thing a concept represents.
– e.g., “cat”, “milk”, and “doctor” are sortals.
– e.g., “red”, “heavy”, and “singing” are not.
– Sortals usually correspond to nouns.
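The disjointness and exhaustiveness conditions from III.e and III.f can be checked directly on small finite sets. The following Python sketch is only an illustration: the class extensions and individual names (rex, tom, nemo) are hypothetical, and are_disjoint and are_exhaustive are helper names introduced here. Unlike OWL, this toy model is closed, so both properties can be computed instead of having to be asserted.

```python
def are_disjoint(a, b):
    """A ∩ B = ∅: no member of a is a member of b."""
    return a.isdisjoint(b)


def are_exhaustive(superclass, *subclasses):
    """Every member of superclass belongs to at least one of the subclasses."""
    covered = set().union(*subclasses)
    return superclass <= covered


# Hypothetical extensions of a few classes.
animal = {"rex", "tom", "nemo"}
dog = {"rex"}
cat = {"tom"}
fish = {"nemo"}
pet = {"rex", "tom"}

print(are_disjoint(dog, cat))               # True
print(are_disjoint(dog, pet))               # False: rex is both a dog and a pet
print(are_exhaustive(animal, dog, cat))     # False: nemo is not covered
print(are_exhaustive(animal, dog, cat, fish))  # True
```

In an OWL ontology the reasoner cannot infer either result from the absence of counterexamples. Under the Open World Assumption, a disjointness axiom or a covering axiom has to be stated explicitly.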
Rigidity tagging
portion of matter
-R drug
+R antibiotic
+R chemical compound
+R oil
-R nutriment (a source of material to nourish the body)

Resulting Hierarchies

IV. Creating relations
Not only universals, but also relations are represented in a hierarchy.
a) Adding relations (object properties)
b) Domains & ranges
c) Characteristics of relations

IV.a Adding relations (object properties)

IV.b Domains & ranges
In OWL all relations are binary and are sets of ordered pairs.
So in “aRb”, “a” is in the domain and “b” is in the range.

IV.c Characteristics of relations
Functional
Inverse functional
Transitive
Symmetric
Asymmetric
Reflexive
Irreflexive

Functional
For each x there is at most one y such that xRy.
Formula: (a,b) ∈ R & (a,c) ∈ R ⇒ b = c
Example: Matt hasBiologicalMother Bridget

Inverse functional
For each y there is at most one x such that xRy.
Formula: (a,b) ∈ R & (c,b) ∈ R ⇒ a = c
Example: Bridget isBiologicalMotherOf Matt

Transitive
If xRy and yRz, then xRz.
Formula: (a,b) ∈ R & (b,c) ∈ R ⇒ (a,c) ∈ R
Example: Matt hasAncestor Rudolph & Rudolph hasAncestor Francis ⇒ Matt hasAncestor Francis

Symmetric
If xRy, then yRx.
Formula: (a,b) ∈ R ⇒ (b,a) ∈ R
Example: Matt hasSibling Chris ⇒ Chris hasSibling Matt

Asymmetric
If xRy, then not (yRx).
Formula: (a,b) ∈ R ⇒ (b,a) ∉ R
Example: Chris isChildOf Bridget ⇒ not (Bridget isChildOf Chris)

Reflexive
R relates every x to itself.
Formula: (∀a) (a,a) ∈ R
Example: Matt knows Matt

Irreflexive
aRb holds only if a ≠ b.
Formula: (a,b) ∈ R ⇒ a ≠ b
Example: Chris isChildOf Bridget is possible; Chris isChildOf Chris is not.

V. (Instances)
In general, instances should not be represented by a domain ontology.
However, Protégé enables representing instances (named individuals).
For specific ontologies (e.g. application ontologies) it can be necessary to specify individuals.

VI. Creating restrictions on classes
a) General remarks
b) Necessary vs. Necessary and Sufficient
c) Types of restrictions

VI.a General remarks
All restrictions we put on a universal or attributive collection A are universal statements, viz. all instances of A are in the relation the restriction specifies.
(∀x) (x ∈ A ⇒ (x, y) ∈ P)
(The quantifier for y is to be specified by the formulation of the restriction.)

So, if I put the restriction “has_sibling human being” on the class “human being”, I am stating that ALL human beings have siblings. Which, of course, is wrong.

VI.b Necessary vs. Necessary and Sufficient
Practical OWL formulation:
Necessary: All members of the OWL class in question fulfill the condition specified by the restriction.
Necessary and Sufficient: In addition, any entity in the domain that fulfills the condition specified by the restriction is a member of the OWL class in question.

Examples:
Necessary: If building dams is a necessary condition for an organism to be a beaver, all instances of beaver need to be building dams.
Necessary and Sufficient: If being biped and featherless is a necessary and sufficient condition for being human, every animal that is biped and featherless is human.

VI.c Types of restrictions
Existential Restriction
All members of A stand in the relation R to at least one member of B.
(∀x) (x ∈ A ⇒ (∃y) (y ∈ B & (x, y) ∈ R))

Universal Restriction
All members of A stand in the relation R only with members of B.
(∀x) (∀y) ((x ∈ A & (x, y) ∈ R) ⇒ y ∈ B)

Cardinality Restrictions
Three types: exact cardinality, min cardinality, max cardinality
They allow us to specify that:
All x ∈ A are related to exactly n (y ∈ B)
All x ∈ A are related to at least n (y ∈ B)
All x ∈ A are related to at most n (y ∈ B)

Sources
Bechhofer S, van Harmelen F, Hendler J, et al. (2004) OWL Web Ontology Language Reference. http://www.w3.org/TR/owl-ref/
Guarino N, Welty C (2004) An Overview of OntoClean. In: Staab S, Studer R (eds.) Handbook on Ontologies, pp. 151-172.
Hazewinkel M (2002) Encyclopedia of Mathematics. Berlin. http://eom.springer.de/default.htm
Horridge M (2009) A Practical Guide To Building OWL Ontologies Using Protégé 4 and CO-ODE Tools, Edition 1.2. University of Manchester, Manchester, UK. http://owl.cs.manchester.ac.uk/tutorials/protegeowltutorial/
Smith B, Kusnierczyk W, Schober D, Ceusters W (2006) Towards a reference terminology for ontology research and development in the biomedical domain. Proc. of KR-MED 2006. http://ontology.buffalo.edu/bfo/Terminology_for_Ontologies.pdf
Spear AD (2006) Ontology for the Twenty First Century: An Introduction with Recommendations. http://www.ifomis.org/bfo/documents/manual.pdf