Mohammed Alshayeb
March 3rd, 2011

Outline
• Theoretical Foundations of Ontologies
• Principles for the Design of Ontologies
• Ontology Language
• Selection of Ontology Projects

What is Ontology?
• Ontology: the branch of philosophy that deals with the nature and the organization of reality.
• [Musen 1992, Gruber 1993]: Sharing a common understanding of the structure of information among people or software agents.
• [A. Gomez-Perez et al. 1999]: An ontology is a formal conceptualization of a domain that is shared and reused across domains, tasks and groups of people. Conceptualization refers to an abstract model of some phenomenon in the world, obtained by identifying the relevant concepts of that phenomenon.
• An ontology is similar to a dictionary or glossary, but with greater detail and structure that enables computers to process its content. An ontology consists of a set of concepts, axioms, and relationships that describe a domain of interest [SUO WG, IEEE].
– Ontology tries to answer two questions:
  • What is being?
  • What are the features common to all beings?

What is Ontology?
• In 1995, Guarino and Giaretta collected and analyzed the following seven definitions:
1. Ontology as a philosophical discipline.
2. Ontology as an informal conceptual system.
3. Ontology as a formal semantic account.
4. Ontology as a specification of a conceptualization.
5. Ontology as a representation of a conceptual system via a logical theory:
   5.1. Characterized by specific formal properties.
   5.2. Characterized only by its specific purposes.
6. Ontology as the vocabulary used by a logical theory.
7. Ontology as a meta-level specification of a logical theory.

Why Ontology?
• To share a common understanding of the structure of information among people or software agents.
• To enable reuse of domain knowledge.
• To make domain assumptions explicit.
• To separate domain knowledge from operational knowledge.
• To analyze domain knowledge.
Source: Natalya F. Noy and Deborah L. McGuinness, Stanford University, http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.html

Categorization of Ontologies [Gomez-Perez]
• Domain ontologies: model a specific domain. They represent the particular meanings of terms as they apply to that domain (e.g. Gene Ontology, Web Semantic). For example, the word "key" has a different meaning in:
  • An ontology about the domain of network security.
  • An ontology about the domain of buildings and rooms.
• Upper ontologies (or foundation ontologies): model the common objects that are generally applicable across a wide range of domain ontologies.
• Knowledge Representation (KR) ontologies: capture the representation primitives used to formalize knowledge under a given KR paradigm.
• General ontologies: represent common-sense knowledge reusable across domains, e.g. mereology ontology, ex. OWL.

Main Components of an Ontology
• Classes: represent concepts, which are taken in a broad sense.
• Attributes: describe the classes in the ontology, e.g. Student has name.
• Relationships: make explicit the links between classes in the same domain. R ⊂ C1 x C2 x C3 x ... x Cn
• Functions: a special case of relations in which the n-th element is unique for the preceding n-1 elements. F: C1 x C2 x ... x Cn-1 → Cn, e.g. pays.

Formal Definition
• Ontology O = {C, R, A}
– C is a set whose elements are called concepts.
– R ⊆ C x C is a set whose elements are called relations. For r = (c1, c2) ∈ R, one may write r(c1) = c2.
– A is a set of axioms on O.
• A lexicon for the ontology is L = {Lc, Lr, F, G}
• Lc is a set of elements called lexical entries of concepts.
• Lr is a set of elements called lexical entries of relations.
• F ⊂ Lc x C is a reference for concepts: F(lc) = {c ∈ C | (lc, c) ∈ F} for all lc ∈ Lc
• G ⊂ Lr x R is a reference for relations: G(lr) = {r ∈ R | (lr, r) ∈ G} for all lr ∈ Lr

Example of Formal Ontology
• Suppose A = ∅, C = {c1, c2}, Lc = {'Mouse', 'input_device'}, Lr = {'is_a'}
– F('Mouse') = c1, F('input_device') = c2 and G('is_a') = r
[Figure: graphical depiction of an instantiated ontology, showing O with C (c1, c2) and R (r), and L with Lc ('Mouse', 'input_device') and Lr ('is_a'), linked by F and G]

Example of Formal Ontology
• Conceptual Graphs (CG): a logical formalism that includes classes, relations, individuals and quantifiers, e.g. "cat on mat".
[Figure: simple conceptual graph in the graphical representation, Display Form (DF): Cat, On, Mat]
• Using the textual notation Linear Form (LF), this sentence would be written as
  [Cat]-(On)-[Mat]
• The formal language is CG Interchange Form (CGIF). In this language the sentence would be expressed as
  [Cat: *x] [Mat: *y] (On ?x ?y)
  where *x is a variable definition and ?x is a reference to the defined variable. Using syntactical shortcuts, the same sentence could also be written in the same language as
  (On [Cat] [Mat])
• The conversion between the three languages is defined, as well as direct conversion between CGIF and KIF (Knowledge Interchange Format). In the KIF language this example would be expressed as
  (exists ((?x Cat) (?y Mat)) (On ?x ?y))
• All these forms have the same semantics in predicate logic:
  ∃ x,y: Cat(x) ∧ Mat(y) ∧ On(x,y)

Categorization of Ontology
[Figure: top-level ontology, domain ontology, task ontology, application ontology. Guarino (1998) categorization]
• Top-level ontologies describe very general concepts like space, time, etc., which are independent of a particular problem or domain.
• Domain ontologies and task ontologies describe, respectively, the vocabulary related to a generic domain (like medicine or automobiles) or a generic task or activity (like selling) by specializing the terms introduced in the top-level ontology.
• Application ontologies describe concepts depending both on a particular domain and task, which are often specializations of both the related ontologies.

Categorization of Ontology
[Figure: controlled vocabularies, terms/glossary, narrower term relation, informal is-a, formal is-a, formal instance, frames, value restrictions, general logical constraints, disjointness, inverse. Lassila and McGuinness (2001) categorization]
• Lassila and McGuinness (2001) classified different types of lightweight and heavyweight ontologies.
• Controlled vocabularies: a finite list of terms.
• Glossaries: a list of terms with their meanings specified as natural language statements.
• Thesauri: provide some additional semantics between terms.
• Informal is-a hierarchies: taken from specifications of term hierarchies.
• Formal is-a hierarchies: if B is a subclass of A and an object is an instance of B, then the object is an instance of A.
• Formal is-a hierarchies that include instances of the domain.
• Frames: the ontology includes classes and their properties.
• Value restrictions: place restrictions on the values that can fill a property, e.g. Date = arrival date.
• General logical constraints: use first-order logic constraints between terms, specified in the ontology language.

Top Level Ontologies
[Figure: hierarchy of top-level categories of a top-level ontology: UM-Thing, Configuration, Element, Sequence, Object, Event, property, ALL]
[Figure: Sowa's lattice of categories]

Knowledge Representation Ontology
[Figure: Sowa KR components: logical, philosophical and computational (labels: Logic, Computation)]

Ontological Commitments
• Ontological commitments are agreements to use the shared vocabulary in a coherent and consistent manner. [Guarino 1998]
• Ontological commitments guarantee consistency but not completeness of an ontology.
[Figure: the term "Apple" under different commitments: Apple Co., Apple Fruit, Apple Tree. All agree on the shape of an apple.]
Example, Description Logics (DL):
  Define-class Apple(?x) "Apple Company"
  :axiom-def: (and (Subclass-of MacPro )) (Template-Facet-Value Serial# )

Ontologies and Intended Meaning
[Figure: conceptualization C; commitment K = <C,R>; language L; models MD(L); intended models IK(L); ontology; ontology models. January 2006]

Ontology Quality
• Good: high precision, max coverage
• Less good: low precision, max coverage
• Bad: max precision, limited coverage
• Worse: low precision, limited coverage

Ontology Quality
[Figure: MD(L), IA(L), IB(L); area of false agreement]

Principles for the Design of Ontologies
1. Clarity: An ontology should effectively communicate the intended meaning of defined terms. Definitions should be objective.
2. Coherence: An ontology should be coherent; that is, it should sanction inferences that are consistent with the definitions.
3. Extendibility: An ontology should be designed to anticipate the uses of the shared vocabulary.
4. Minimal encoding bias: The conceptualization should be specified at the knowledge level without depending on a particular symbol-level encoding.
5. Minimal ontological commitment: An ontology should require the minimal ontological commitment sufficient to support the intended knowledge sharing activities.

Ontology Language
• Ontology languages boomed in the early 1990s. The first ontology language ever created is CycL. http://www.cyc.com/
– KIF: a general knowledge interchange format language.
– Ontolingua: a standard ontology language of the 1990s, created on top of KIF.
– LOOM: a language targeted at general knowledge bases.
– OCML: a language built on top of Ontolingua with added executable abilities.
– FLogic: a language that combines frames and first-order logic.

Ontology Language
• The Resource Description Framework (RDF): a language for representing information about resources in the World Wide Web.
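To make the RDF bullet concrete, the sketch below models RDF-style statements as (subject, predicate, object) triples in plain Python. It is only an illustration of the triple idea, not an RDF library or serialization format. The `http://example.org/` URIs, the `EX` prefix and the `objects` helper are assumptions introduced here, and the Mouse/InputDevice pair is borrowed from the formal ontology example earlier in the deck.

```python
# Minimal sketch: RDF-style statements as (subject, predicate, object) triples.
# The example.org URIs and the helper below are hypothetical illustrations.
EX = "http://example.org/"

triples = {
    (EX + "Mouse", "rdf:type", "rdfs:Class"),
    (EX + "InputDevice", "rdf:type", "rdfs:Class"),
    (EX + "Mouse", "rdfs:subClassOf", EX + "InputDevice"),
}


def objects(subject, predicate):
    """Return every object o such that (subject, predicate, o) is stated."""
    return {o for (s, p, o) in triples if s == subject and p == predicate}


if __name__ == "__main__":
    # Ask what Mouse is declared to be a subclass of.
    print(objects(EX + "Mouse", "rdfs:subClassOf"))
```

Each triple states one fact about a resource identified by a URI, so an is_a link between two concepts becomes a single `rdfs:subClassOf` statement.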
Ontology Language
• Knowledge Interchange Format (KIF): a language designed for the exchange of knowledge between different systems.
• For example, a KIF definition expressing that a rail vehicle is a vehicle designed to move on railways is written as:
  (subclass RailVehicle LandVehicle)
  (documentation RailVehicle "A Vehicle designed to move on &%Railways.")
  (=> (instance ?X RailVehicle)
      (hasPurpose ?X
        (exists (?EV ?SURF)
          (and (instance ?RAIL Railway)
               (instance ?EV Transportation)
               (holdsDuring (WhenFn ?EV) (meetsSpatially ?X ?RAIL))))))

Ontology Language
• Description logics (DL) are logics serving primarily for the formal description of concepts and roles (relations).
  (defrelation Pays
    :is (:function (?room ?discount)
          (- (Price ?room) (/ (* (Price ?room) ?discount) 100)))
    :domains (Room Number)
    :range Number)
  DL presented in LOOM

Methodologies for Building Ontologies
• Ontology development process: it is advisable to carry out three categories of activities:
– Management activities: scheduling, control and quality assurance.
– Development-oriented activities: pre-development, development, and post-development.
– Support activities: a series of activities performed at the same time as the development-oriented activities.

Methodologies for Building Ontologies
• The Cyc method: its language (CycL) is a hybrid language that combines frames with predicate calculus.
• Uschold and King's method: identify purpose; building (capture, coding, integrating); evaluation; and documentation.
• Gruninger and Fox's methodology: identify motivating scenarios, elaborate informal competency questions, specify the terminology using first-order logic, write competency questions in a formal way, specify completeness theorems.

Methodologies for Building Ontologies
On-To-Knowledge processes (Staab et al., 2001) (© 2001 IEEE)
• Feasibility study: identify problem and opportunity areas, select the most promising focus area and target solution.
• Ontology kickoff: requirement specification, analyze input sources, develop baseline taxonomy.
• Refinement: concept elicitation with domain experts, develop baseline taxonomy, conceptualize and formalize, add relations and axioms.
• Evaluation: identify problem and opportunity areas, select the most promising focus area and target solution.
• Maintenance: manage the organizational maintenance process.

Selection of Ontology Projects
• Protégé: a suite of tools to construct domain models and knowledge-based applications with ontologies. [http://protege.stanford.edu/]
• SEMANTIC MINING (FP6 - NoE): Semantic Interoperability and Data Mining in Biomedicine.
• The Gene Ontology (GO) project: a collaborative effort to address the need for consistent descriptions of gene products in different databases.

Selection of Ontology Projects
• The Gene Ontology (GO) project has developed three structured controlled ontologies:
– associated biological processes
– cellular components
– molecular functions
• The ontology covers three domains:
– cellular component
– molecular function
– biological process

Selection of Ontology Projects
• GO ontology relations
– The GO ontologies are structured as a graph, with terms as nodes and the relations between the terms as edges. The relations comprise is a (is a subtype of); part of; and regulates, negatively regulates and positively regulates.
– If A is a B, and B is part of C, we can infer that A is part of C. The formal notation of this inference is:
  is a ∘ part of → part of

Selection of Ontology Projects
• The is a relation
– The is a relation in GO is very simple: if we say A is a B, we mean that node A is a subtype of node B.
• Reasoning over is a
– is a ∘ is a → is a
– The is a relation is transitive, which means that if A is a B, and B is a C, we can infer that A is a C.

Selection of Ontology Projects
• The part of relation
– The relation part of is used to represent part-whole relationships in the Gene Ontology.
part of has a specific meaning in GO, and a part of relation would only be added between A and B if B is necessarily part of A: wherever B exists, it is as part of A, and the presence of B implies the presence of A.
http://www.geneontology.org/GO.ontology.relations.shtml

Questions?

References
• Ontologies – Introduction, J. Johannes Pretorious, Vrije University.
• Formal Ontology and Information Systems, Nicola Guarino.
• Ontological Engineering, Asuncion Gomez-Perez, Mariano Fernandez-Lopez, Oscar Corcho.
• http://www.geneontology.org/GO.doc.shtml