Introduction to Ontology - Department of Computing and Software

Mohammed Alshayeb
March 3rd, 2011
Outline
• Theoretical Foundations of Ontologies
• Principles for the Design of Ontologies
• Ontology Language
• Selection of Ontology Projects
What is Ontology?
• Ontology: the branch of philosophy that deals with the nature and the organization of reality.
• [Musen 1992, Gruber 1993]: Sharing a common understanding of the structure of information among people or software agents.
• [A. Gomez-Perez et al. 1999]: An ontology is a formal conceptualization of a domain that is shared and reused across domains, tasks and groups of people. Conceptualization refers to an abstract model of some phenomenon in the world, obtained by identifying the relevant concepts of that phenomenon.
• An ontology is similar to a dictionary or glossary, but with greater detail and structure that enables computers to process its content. An ontology consists of a set of concepts, axioms, and relationships that describe a domain of interest [SUO WG, IEEE].
– Ontology tries to answer two questions:
  • What is being?
  • What are the features common to all beings?
What is ontology?
• In 1995, Guarino and Giaretta collected and analyzed the following seven definitions:
1. Ontology as a philosophical discipline.
2. Ontology as an informal conceptual system.
3. Ontology as a formal semantic account.
4. Ontology as a specification of a conceptualization.
5. Ontology as a representation of a conceptual system via a logical theory:
   1. Characterized by specific formal properties.
   2. Characterized only by its specific purposes.
6. Ontology as the vocabulary used by a logical theory.
7. Ontology as a meta-level specification of a logical theory.
Why ontology?
• To share common understanding of the
structure of information among people or
software agents.
• To enable reuse of domain knowledge.
• To make domain assumptions explicit.
• To separate domain knowledge from the
operational knowledge.
• To analyze domain knowledge.
http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.html
Natalya F. Noy and Deborah L. McGuinness, Stanford University
Categorization of Ontologies
[Gomez-Perez]
• Domain ontologies: model a specific domain and represent the particular meanings of terms as they apply to that domain (e.g. Gene Ontology, Semantic Web). For example, the word "key" has a different meaning in:
  – an ontology about the domain of network security;
  – an ontology about the domain of buildings and rooms.
• Upper ontologies (or foundation ontologies): model the common objects that are generally applicable across a wide range of domain ontologies.
• Knowledge Representation (KR) ontologies: capture the representation primitives used to formalize knowledge under a given KR paradigm, e.g. OWL.
• General ontologies: represent common-sense knowledge reusable across domains, e.g. a mereology ontology.
Main Components of an Ontology
• Classes: represent concepts, which are taken in a broad sense.
• Attributes: describe the classes in the ontology, e.g. Student has name.
• Relationships: make explicit the links between classes in the same domain. R ⊂ C1 × C2 × C3 × ... × Cn
• Functions: a special case of relations in which the n-th element is uniquely determined by the preceding n-1 elements. F: C1 × C2 × ... × Cn-1 → Cn, e.g. pays.
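The difference between a relation and a function can be illustrated with a minimal Python sketch, assuming finite classes represented as sets. The class contents, the amounts and the tuples of `pays` are hypothetical illustration data, not taken from the slides. A relation is a set of tuples drawn from the Cartesian product of its classes; it is a function when the first n-1 elements of each tuple determine the n-th.

```python
from itertools import product


def is_relation(tuples, *classes):
    """True if every tuple lies in the Cartesian product C1 x ... x Cn."""
    universe = set(product(*classes))
    return set(tuples) <= universe


def is_function(tuples):
    """True if the first n-1 elements of each tuple determine the n-th."""
    seen = {}
    for *args, value in tuples:
        key = tuple(args)
        if key in seen and seen[key] != value:
            return False
        seen[key] = value
    return True


# Hypothetical classes: rooms, discount percentages, amounts paid.
rooms = {"single", "double"}
discounts = {0, 10}
amounts = {45, 50, 72, 80}

# pays(room, discount) -> amount, written as a set of 3-tuples.
pays = {
    ("single", 0, 50), ("single", 10, 45),
    ("double", 0, 80), ("double", 10, 72),
}

print(is_relation(pays, rooms, discounts, amounts))  # True
print(is_function(pays))                             # True
# Adding a second amount for the same (room, discount) breaks uniqueness.
print(is_function(pays | {("single", 0, 45)}))       # False
```

Storing a function as a set of tuples keeps it in the same shape as any other relation, so the same membership check applies to both.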
Formal Definition
• Ontology O = {C, R, A}
– C is a set whose elements are called concepts.
– R ⊆ C × C is a set whose elements are called relations. For r = (c1, c2) ∈ R, one may write r(c1) = c2.
– A is a set of axioms on O.
• A lexicon for the ontology is a structure L = {Lc, Lr, F, G}
– Lc is a set whose elements are called lexical entries of concepts.
– Lr is a set whose elements are called lexical entries of relations.
– F ⊆ Lc × C is a reference for concepts: F(lc) = {c ∈ C | (lc, c) ∈ F} for all lc ∈ Lc.
– G ⊆ Lr × R is a reference for relations: G(lr) = {r ∈ R | (lr, r) ∈ G} for all lr ∈ Lr.
Example of Formal Ontology
• Suppose A = ∅, C = {c1, c2}, Lc = {'Mouse', 'input_device'}, Lr = {'is_a'}
– F('Mouse') = c1, F('input_device') = c2 and G('is_a') = r
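[Figure: Graphical depiction of an instantiated ontology, showing O with the concepts C1 and C2 in C and the relation r in R, and the lexicon L with 'Mouse' and 'input_device' in Lc and 'is_a' in Lr, linked by F]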
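The instantiated ontology of this example can be written down as a small Python sketch. The class and method names (`Ontology`, `Lexicon`, `concepts_for`, `relations_for`) are hypothetical, and the sketch assumes that r is the pair (c1, c2), which the figure suggests but the text does not state.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    concepts: set                              # C
    relations: set                             # R, a subset of C x C
    axioms: set = field(default_factory=set)   # A


@dataclass
class Lexicon:
    concept_entries: set    # Lc
    relation_entries: set   # Lr
    F: set                  # subset of Lc x C
    G: set                  # subset of Lr x R

    def concepts_for(self, lc):
        """F(lc): all concepts that the lexical entry lc refers to."""
        return {c for (entry, c) in self.F if entry == lc}

    def relations_for(self, lr):
        """G(lr): all relations that the lexical entry lr refers to."""
        return {r for (entry, r) in self.G if entry == lr}


r = ("c1", "c2")  # assumption: r links c1 to c2
onto = Ontology(concepts={"c1", "c2"}, relations={r})
lex = Lexicon(
    concept_entries={"Mouse", "input_device"},
    relation_entries={"is_a"},
    F={("Mouse", "c1"), ("input_device", "c2")},
    G={("is_a", r)},
)

print(lex.concepts_for("Mouse"))   # {'c1'}
print(lex.relations_for("is_a"))   # {('c1', 'c2')}
```

Keeping F and G as sets of pairs, rather than dictionaries, allows one lexical entry to refer to several concepts, which is how ambiguous words would be modelled.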
Example of Formal Ontology
• Conceptual Graphs (CG): a logical formalism that includes classes, relations, individuals and quantifiers, e.g. "a cat is on a mat".
[Figure: Simple conceptual graph in the graphical representation, Display Form (DF): Cat, On, Mat]
Using the textual notation Linear Form (LF), this sentence would be written as
[Cat]-(On)-[Mat]
In the formal language CG Interchange Form (CGIF), the sentence would be expressed as
[Cat: *x] [Mat: *y] (On ?x ?y)
where *x is a variable definition and ?x is a reference to the defined variable. Using syntactical shortcuts, the same sentence could also be written in the same language as
(On [Cat] [Mat])
Conversion between the three languages is defined, as is direct conversion between CGIF and KIF (Knowledge Interchange Format). In the KIF language this example would be expressed as
(exists ((?x Cat) (?y Mat)) (On ?x ?y))
All these forms have the same semantics in predicate logic:
∃ x,y: Cat(x) ∧ Mat(y) ∧ On(x,y)
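Because all of these notations carry the same content, a single binary conceptual graph can be rendered into each of them mechanically. The following Python sketch shows this for graphs of the shape used in the example (two concepts joined by one relation). The function names are hypothetical and the output strings simply mirror the forms quoted above; this is not a general CG serializer.

```python
def to_linear_form(relation, source, target):
    """Linear Form (LF), in the style used on this slide."""
    return f"[{source}]-({relation})-[{target}]"


def to_cgif(relation, source, target):
    """CGIF with explicit variable definitions (*x) and references (?x)."""
    return f"[{source}: *x] [{target}: *y] ({relation} ?x ?y)"


def to_kif(relation, source, target):
    """KIF: existentially quantified, typed variables."""
    return f"(exists ((?x {source}) (?y {target})) ({relation} ?x ?y))"


def to_predicate_logic(relation, source, target):
    """Predicate logic: one variable per concept node."""
    return f"∃ x,y: {source}(x) ∧ {target}(y) ∧ {relation}(x,y)"


graph = ("On", "Cat", "Mat")
for render in (to_linear_form, to_cgif, to_kif, to_predicate_logic):
    print(render(*graph))
```

Each concept node gets its own variable (x for the cat, y for the mat), and the relation is the only place where the two variables meet.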
Categorization of Ontology
Guarino (1998) categorization
[Figure: Top-level ontology, Domain ontology, Task ontology, Application ontology]
• Top-level ontologies describe very general concepts like space, time, etc., which are independent of a particular problem or domain.
• Domain ontologies and task ontologies describe, respectively, the vocabulary related to a generic domain (like medicine or automobiles) or a generic task or activity (like selling) by specializing the terms introduced in the top-level ontology.
• Application ontologies describe concepts depending on both a particular domain and a particular task, which are often specializations of both the related ontologies.
Categorization of ontology
[Figure: Spectrum of ontology types: controlled vocabularies; terms/glossary; narrower-term relation; informal is-a; formal is-a; formal instance; frames; value restrictions; general logical constraints; disjointness, inverse]
Lassila and McGuinness (2001) categorization
• Lassila and McGuinness (2001) classified different types of lightweight and heavyweight ontologies:
• Controlled vocabularies: a finite list of terms.
• Glossaries: a list of terms with their meanings specified as natural language statements.
• Thesauri: provide some additional semantics between terms.
• Informal is-a hierarchies: taken from specifications of term hierarchies.
• Formal is-a hierarchies: if B is a subclass of A and an object is an instance of B, then the object is also an instance of A.
• Formal is-a hierarchies that include instances of the domain.
• Frames: the ontology includes classes and their properties.
• Value restrictions: place restrictions on the values that can fill a property, e.g. Date = arrival date.
• General logical constraints: first-order logic constraints between terms, expressed in the ontology language.
Top Level ontologies
[Figure: Hierarchy of top-level categories, with node labels including ALL, UM-Thing, Object, Event, Sequence, Element, Property and Configuration]
[Figure: Sowa's lattice of categories]
Knowledge Representation
[Figure: Sowa's KR components: logical (Logic), philosophical (Ontology) and computational (Computation)]
Ontological Commitments
• An ontological commitment is an agreement to use the shared vocabulary in a coherent and consistent manner. [Guarino 1998]
• Ontological commitments guarantee consistency, but not completeness, of an ontology.
[Figure: The term "Apple" can denote Apple Co., the apple fruit, or the apple tree; all agree on the shape of an apple]
Example in Description Logics (DL):
Define-class Apple(?x)
  "Apple Company"
  :axiom-def:
  (and (Subclass-of MacPro ))
  (Template-Facet-Value Serial# )
Ontologies and intended meaning
[Figure: Conceptualization C; Commitment K = <C, R>; Language L; Models MD(L); Intended models IK(L); Ontology; Ontology models. Figure dated Gennaio (January) 2006.]
Ontology Quality
• Good: high precision, max coverage
• Less good: low precision, max coverage
• Bad: max precision, limited coverage
• Worse: low precision, limited coverage
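The slides name precision and coverage without defining them. One possible reading, stated here as an assumption rather than as the definition used by the author, treats the intended models and the models admitted by the ontology as finite sets: precision is the share of admitted models that are intended, and coverage is the share of intended models that are admitted. The model names below are hypothetical placeholders.

```python
def precision(intended, admitted):
    """Share of the models admitted by the ontology that are intended."""
    return len(intended & admitted) / len(admitted)


def coverage(intended, admitted):
    """Share of the intended models that the ontology admits."""
    return len(intended & admitted) / len(intended)


# Hypothetical intended models of the conceptualization.
intended = {"m1", "m2", "m3", "m4"}

# Hypothetical sets of models admitted by four different ontologies.
cases = {
    "good": set(intended),                             # exactly the intended models
    "less good": intended | {"m5", "m6", "m7", "m8"},  # admits too much
    "bad": {"m1", "m2"},                               # admits too little
    "worse": {"m1", "m5", "m6", "m7"},                 # mostly wrong models
}

for name, admitted in cases.items():
    print(name, precision(intended, admitted), coverage(intended, admitted))
```

Under this reading the four cases come out as (1.0, 1.0), (0.5, 1.0), (1.0, 0.5) and (0.25, 0.25), which matches the ordering good, less good, bad, worse.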
Ontology Quality
[Figure: Models MD(L) containing the intended-model sets IA(L) and IB(L), with a region labeled "Area of false agreement!"]
Principles for the Design of Ontologies
1. Clarity: An ontology should effectively communicate the intended meaning of defined terms. Definitions should be objective.
2. Coherence: An ontology should be coherent; that is, it should sanction inferences that are consistent with the definitions.
3. Extendibility: An ontology should be designed to anticipate the uses of the shared vocabulary.
4. Minimal encoding bias: The conceptualization should be specified at the knowledge level without depending on a particular symbol-level encoding.
5. Minimal ontological commitment: An ontology should require the minimal ontological commitment sufficient to support the intended knowledge sharing activities.
Ontology Language
• Ontology languages boomed in the early 1990s. The first ontology language created was CycL. http://www.cyc.com/
– KIF: a general knowledge interchange format language.
– Ontolingua: a standard ontology language of the 1990s, created on top of KIF.
– LOOM: a language targeted at general knowledge bases.
– OCML: a language built on top of Ontolingua with added executable abilities.
– FLogic: a language that combines frames and first-order logic.
Ontology Language
• The Resource Description Framework (RDF): a language for representing information about resources on the World Wide Web.
Ontology Language
• Knowledge Interchange Format (KIF): a language designed for the exchange of knowledge between different systems.
• For example, a KIF definition expressing that a rail vehicle is a vehicle designed to move on railways is written as:
(subclass RailVehicle LandVehicle)
(documentation RailVehicle
  "A Vehicle designed to move on &%Railways.")
(=> (instance ?X RailVehicle)
    (hasPurpose ?X
      (exists (?EV ?RAIL)
        (and (instance ?RAIL Railway)
             (instance ?EV Transportation)
             (holdsDuring (WhenFn ?EV)
               (meetsSpatially ?X ?RAIL))))))
Ontology Language
• Description logics (DL) are logics serving primarily for the formal description of concepts and roles (relations).
(defrelation Pays
  :is (:function (?room ?discount)
        (- (Price ?room)
           (/ (* (Price ?room) ?discount) 100)))
  :domains (Room Number)
  :range Number)
DL presented in LOOM
Methodologies for Building Ontologies
• Ontology Development Process: it is advisable to carry out three categories of activities:
– Management activities: scheduling, control and quality assurance.
– Development-oriented activities: pre-development, development, and post-development.
– Support activities: a series of activities performed at the same time as the development-oriented activities.
Methodologies for Building Ontologies
• The Cyc method: based on a hybrid language that combines frames with predicate calculus.
• Uschold and King's method: identify purpose; building (capture, coding, integrating); evaluation; and documentation.
• Grüninger and Fox's methodology: identify motivating scenarios, elaborate informal competency questions, specify the terminology using first-order logic, write competency questions in a formal way, specify completeness theorems.
Methodologies for Building Ontologies
On-To-Knowledge processes (Staab et al., 2001) (© 2001 IEEE)
• Feasibility study: identify problem and opportunity areas, select the most promising focus area and target solution.
• Ontology kickoff: requirement specification, analyze input sources, develop baseline taxonomy.
• Refinement: concept elicitation with domain experts, develop baseline taxonomy, conceptualize and formalize, add relations and axioms.
• Evaluation: revision and expansion based on feedback, analyze usage patterns, analyze competency questions.
• Maintenance: Manage organizational maintenance process.
Selection of Ontology Projects
• Protégé: a suite of tools to construct domain models and knowledge-based applications with ontologies. [http://protege.stanford.edu/]
• SEMANTIC MINING (FP6 - NoE): Semantic Interoperability and Data Mining in Biomedicine.
• The Gene Ontology (GO) project: a collaborative effort to address the need for consistent descriptions of gene products in different databases.
Selection of Ontology Projects
• The Gene Ontology (GO) project has developed three structured, controlled ontologies, covering three domains:
– biological process
– cellular component
– molecular function
Selection of Ontology Projects
• GO Ontology relations
– The GO ontologies are structured as a graph, with terms as nodes and the relations between the terms as edges. The relations comprise is a (is a subtype of); part of; and regulates, negatively regulates and positively regulates.
– Example: if A is a B, and B is part of C, we can infer that A is part of C.
The formal notation of this inference would be:
is a ∘ part of → part of
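The inference rule on this slide can be sketched in Python as a lookup in a composition table. This is a minimal illustration, not the Gene Ontology tooling: the names `COMPOSE` and `infer` are hypothetical, edges are plain (source, relation, target) triples, and only the single rule stated above is encoded.

```python
# Composition table: (first relation, second relation) -> inferred relation.
# Only the rule from the slide is listed; further rules could be added.
COMPOSE = {
    ("is_a", "part_of"): "part_of",
}


def infer(edge1, edge2):
    """Compose A -r1-> B and B -r2-> C into A -r-> C, if a rule applies."""
    a, r1, b = edge1
    b2, r2, c = edge2
    if b != b2:
        return None  # the two edges do not meet in a common node
    r = COMPOSE.get((r1, r2))
    return (a, r, c) if r else None


print(infer(("A", "is_a", "B"), ("B", "part_of", "C")))
# ('A', 'part_of', 'C')
```

Keeping the rules in a table separates the graph traversal from the semantics of each relation, so adding a rule does not require changing `infer`.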
Selection of Ontology Projects
• The is a relation
– The is a relation in GO is very simple: if we say A is a B, we mean that node A is a subtype of node B.
• Reasoning over is a
– is a ∘ is a → is a
– The is a relation is transitive, which means that if A is a B, and B is a C, we can infer that A is a C.
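Transitivity means that every ancestor reachable through a chain of is a links is itself an is a ancestor. A small Python sketch, assuming the hierarchy is given as a set of (child, parent) pairs with hypothetical node names, computes all implied pairs by repeating the rule until nothing new is found.

```python
def is_a_closure(edges):
    """Return all (child, ancestor) pairs implied by transitivity of is_a."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                # A is_a B and B is_a D together imply A is_a D.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


asserted = {("A", "B"), ("B", "C"), ("C", "D")}
inferred = is_a_closure(asserted) - asserted
print(sorted(inferred))  # [('A', 'C'), ('A', 'D'), ('B', 'D')]
```

The nested loop is quadratic in the number of pairs per pass, which is fine for an illustration; a production reasoner would use a graph traversal instead.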
Selection of Ontology Projects
• The part of relation
– The relation part of is used to represent part-whole
relationships in the Gene Ontology. part of has a specific
meaning in GO, and a part of relation would only be added
between A and B if B is necessarily part of A: wherever B
exists, it is as part of A, and the presence of the B implies
the presence of A
http://www.geneontology.org/GO.ontology.relations.shtml
Questions?
References
• Ontologies – Introduction, J. Johannes Pretorius, Vrije University.
• Formal Ontology and Information Systems, Nicola Guarino.
• Ontological Engineering, Asuncion Gomez-Perez, Mariano Fernandez-Lopez, Oscar Corcho.
• http://www.geneontology.org/GO.doc.shtml