Ontologies Angela Maduko Acknowledgements Ontology Development 101: A Guide to Creating Your First Ontology Borrowed slides from – http://www.ifnet.it/elag2002/papers/pap9.ppt – http://www.ccs.neu.edu/home/kenb/pub/2003 /02/public.ppt Outline Background on ontologies Ontology development – – – – Requirements and Analysis Design and Implementation Testing and Validation Maintenance What is an ontology? Most of available definitions are of little help to the layman: “An ontology is an explicit specification of a conceptualization” Thomas R. Gruber “An ontology is a logical theory accounting for the intended meaning of a formal vocabulary” Nicola Guarino “A theory concerning the kinds of entities and specifically the kinds of abstract entities that are to be admitted to a language system” Webster’s Third New International Dictionary What is an Ontology? Some definitions sound a bit more familiar: “An ontology is a formal explicit description of concepts in a domain of discourse (classes […]), properties of each concept […] (slots […]), and restrictions on slots (facets […])” Natalya F. Noy & Deborah L. McGuinness “An ontology is a formal definition of a body of knowledge” James Hendler “An ontology is a catalog of the types of things that are assumed to exist in a domain of interest D from the perspective of a person who uses a language L for the purpose of talking about D” John F. Sowa What general features can we draw from all of these definitions? A given field/domain of interest within the boundaries of which “things” are assumed to exist, to “be” and about which we wish to “talk” in a formalized, explicit, and semantically consistent way To talk about what is: Ontologies Defines a common vocabulary for researchers who need to share information in a domain. It includes machine-interpretable definitions of basic concepts in the domain and relations among them. Ontologies Ontology: What exists in a domain and how they relate with each other. Formal ontology: Formal treatment of the concepts and relationships in a domain. Simple Example: – Employees work for Companies – Employees report to Employees Statements George is an employee. George works for Sony. George reports to Adam. Fred works for a company. An object is an instance of the Employee class. An object in the Employee class is linked with an object in the Company class via the works_for relationship. An object in the Employee class is linked with another object in the same class via the reports_to relationship. Fred reports to two other employees. Fred must report to another employee. George says that Fred works for Toyota. Purposes of Ontologies Basis for communication – Between people (may be informal) – Between agents (formal ontologies) Some Applications – Representing and storing data (e.g., DB schema) – Knowledge sharing within and between domains – Search and retrieval – Classification and organization of data resources Logic: Open versus Closed Open (monotonic) logic gives different answers to queries than a closed logic. works for Em ployee 0..* Com pany 1 Suppose that Fred is just an employee: – Closed world: violates constraint – Open world: Fred works for a company, but the company is not known. Logic: Open versus Closed works for Em ployee 0..* Com pany 1 Suppose that Fred works for both Sony and IBM: – Closed world: violates constraint – Open world: Sony and IBM are the same company! Outline Background on ontologies Ontology development – – – – Requirements and Analysis Design and Implementation Testing and Validation Maintenance Why Develop an Ontology To share common understanding of the structure of information among people or software agents To enable reuse of domain knowledge To make domain assumptions explicit To separate domain knowledge from the operational knowledge To analyze domain knowledge Ontology Development Phases of Ontology Development – Requirements and Analysis Determine domain and scope of ontology What is the domain that the ontology will cover? For what are we going to use the ontology? To what types of questions will the information in the ontology provide answers? – Design and Implementation – Testing and Validation – Maintenance Who will use and maintain the ontology? Requirements and Analysis Least understood of all phases Direct involvement by stakeholders is essential, but how? Specifying the scope is important, but not all languages support it. Point of view is also relevant. These phases offer significant opportunities for new methodologies and processes. Ontology Design Tasks involved in designing an ontology include – Defining classes in the ontology – Arranging the classes in a taxonomic (subclass–superclass) hierarchy – Defining slots (relationships) and describing allowed values for these slots – Filling in the values for slots for instances. Fundamental Rules in Ontology Design There is no one correct way to model a domain— there are always viable alternatives. The best solution almost always depends on the application that you have in mind and the extensions that you anticipate. Ontology development is necessarily an iterative process. Concepts in the ontology should be close to objects (physical or logical) and relationships in your domain of interest. These are most likely to be nouns (objects) or verbs (relationships) in sentences that describe your domain. Example Which wine characteristics should I consider when choosing a wine? Is Bordeaux a red or white wine? Does Cabernet Sauvignon go well with seafood? What is the best choice of wine for grilled meat? Which characteristics of a wine affect its appropriateness for a dish? Does a bouquet or body of a specific wine change with vintage year? What were good vintages for Napa Zinfandel? Step 3. Enumerate important terms in the ontology What are the terms we would like to talk about? What properties do those terms have? What would we like to say about those terms? Example – – – – wine, grape, winery, location, a wine’s color, body, flavor and sugar content; different types of food, such as fish and red meat; subtypes of wine such as white wine, and so Step 4. Define the classes and the class hierarchy Methods – Top-down – Bottom-up – Combination Step 4. Define the classes and the class hierarchy Hierarchical arrangements of concepts If a class A is a superclass of class B, then every instance of B is also an instance of A This implies “In other words, the class B represents a concept that is a “kind of” A.” Step 4. Define the classes and the class hierarchy Top-down – Food and Wine -- followed by White, Blush and Red Bottom-up – define specific wine class first and then work your way up Combination Step 5. Define the properties of classes— slots Types of properties – “intrinsic” properties such as the flavor of a wine; – “extrinsic” properties such as a wine’s name, and area it comes from; – parts, if the object is structured; these can be both physical and abstract “parts” (e.g., the courses of a meal) – relationships to other individuals; these are the relationships between individual members of the class and other items (e.g., the maker of a wine, representing a relationship between a wine and a winery, and the grape the wine is made from.) Step 5. Define the properties of classes— slots Examples – a wine’s color, body, flavor sugar content – location of a winery. Step 6. Define the facets of the slots Slot cardinality – defines how many values a slot can have. Slot-value type – what types of values can fill in the slot. – common value types: String Number Boolean Enumerated Step 6. Define the facets of the slots value type allowed values the number of the values (cardinality) other features of the values the slot can take. Step 6 - rules slot Domain and Ranges When defining a domain or a range for a slot, find the most general classes or class that can be respectively the domain or the range for the slots . On the other hand, do not define a domain and range that is overly general: all the classes in the domain of a slot should be described by the slot and instances of all the classes in the range of a slot should be potential fillers for the slot. Do not choose an overly general class for range (i.e., one would not want to make the range THING) but one would want to choose a class that will cover all fillers Step 6 - rules slot Domain and Ranges If a list of classes defining a range or a domain of a slot – includes a class and its subclass, remove the subclass. – contains all subclasses of a class A, but not the class A itself, the range should contain only the class A and not the subclasses. – contains all but a few subclasses of a class A, consider if the class A would make a more appropriate range definition. Step 7 - Creating Instances Body: Light Color: Red Flavor: Delicate Tannin level: Low Grape: Gamay (instance of the Wine grape class) Maker: Chateau-Morgon (instance of the Winery class) Region: Beaujolais (instance of the Wine-Region class) Sugar: Dry Step 8 - Defining classes and a class hierarchy Ensuring that the class hierarchy is correct – An “is-a” relation A subclass of a class represents a concept that is a “kind of” the concept that the superclass represents. – A single wine is not a subclass of all wines – Transitivity of the hierarchical relations Step 8 - Defining classes and a class hierarchy Evolution of a class hierarchy – distinction between classes and their names – hence synonyms of concept name do not represent different classes – Avoid class hierarchy cycles – All the siblings in the hierarchy (except for the ones at the root) must be at the same level of generality. Step 8 - Defining classes and a class hierarchy How many is too many and how few are too few? – If a class has only one direct subclass there may be a modeling problem or the ontology is not complete. – If there are more than a dozen subclasses for a given class then additional intermediate categories may be necessary. (may not always be possible) Step 8 - Defining classes and a class hierarchy Multiple Inheritance When do you introduce a new class – Subclasses of a class usually (1) have additional properties that the superclass does not have, or (2) restrictions different from those of the superclass, or (3) participate in different relationships than the superclasses Step 8 - Defining classes and a class hierarchy A new class or a property value? – Class White Wine or simply property of class Wine that takes value White – Depends on how important a concept White Wine is in the domain An instance or a class? – starts with deciding what is the lowest level of granularity in the representation. – Individual instances are the most specific concepts represented in a knowledge base. Step 8 - Limiting the scope – The ontology should not contain all the possible information about the domain: you do not need to specialize (or generalize) more than you need for your application (at most one extra level each way). – The ontology should not contain all the possible properties of and distinctions among classes in the hierarchy. Design and Implementation: Patterns and Reuse Reuse existing ontologies if they exist. Transform another ontology Alignment: inter-ontological relationships Composition: Transform and combine many smaller ontologies Combined Ontology Ontology1 Ontology2 Common Features Design and Implementation: Refactoring Move methods from one class to another (for example, from subclasses to superclasses). Reification and unreification: changing relationships into classes or vice versa. Refactoring is useful at the design, implementation and maintenance phases. Example of Refactoring Example of Reification Testing and Validation Validation of requirements Consistency checking – incompatible domain and range definitions for transitive, symmetric, or inverse properties – cardinality properties – requirements on property values can conflict with domain and range restrictions In order to better understand what an ontology is, let’s make one. What is the specific goal of our ontology? To describe the process of making tea, in order to have it automated? To evaluate the quality of a specific cup of tea? To evaluate the skills of an apprentice tea-maker? To choose the best type of cup of tea according to time and meal? To allow tea-houses to share differently structured data? &c… The scope of an ontology must be precisely defined What is required in order to make a cup of tea? Ingredients: tea, water, milk or lemon, sugar… Utensils: kettle, tea-pot, tea strainer, cup, spoon… Actions: boiling the water, pouring the boiling water into the tea-pot, pouring the water out of the teapot, simmering some fresh water… Measurements: temperature of water, duration of the infusing process… &c… All of these are the “things” assumed to exist in our domain of interest: “the making of a cup of tea” Is there a hierarchy among these “things”? Ingredients Utensils Actions Measurements Boiling Duration of Required Optional Tea-pot Temperature Kettle the water the infusing ingredients ingredients etc. of water etc. process Tea Water Milk Lemon Sugar Are we sure all of us understand these “things” the same way? What is an “ingredient”? Please define: … What is a “required ingredient”? Please define: … What is an “optional ingredient”? Please define: … How can we describe each of these “things”? origin: beet? maple? cane? Sugar colour: brown? white? form: caster? lumps? liquid? candy? granulated? … What about fructose and Aspartam? Let’s create new classes and sub-classes: Sweeteners Natural sweeteners Sugar origin: … colour: … form: … Diet sweeteners Fructose Artificial sweeteners Aspartam The development of an ontology is an iterative process: an ontology needs to be evaluated, revised, evaluated, revised, evaluated… Do we really need all these refinements about sweeteners? Not necessarily: depends on our purposes Perhaps we might be just as happy with: Sweeteners type: cane sugar, beet sugar, fructose, Aspartam… form: lumps, granulated… “weightwatchingcally correct?”: Yes or No “There is no one correct way to model a domain — there are always viable alternatives” (N. F. Noy & D. L. McGuinness) How are these “things” interrelated? Water is passive participant in Boiling has passive participant the water has as tool is tool of Kettle Are we sure all of us understand these relationships the same way? • What does “is passive participant in / has passive participant” mean? Please define: … • What does “has as tool / is tool of” mean? Please define: … Did you know you knew so many things about a simple cup of tea? Ontologies are related to the concept of knowledge representation