+ Ontology Engineering in UML: Elisa Kendall

advertisement
+
Ontology Engineering in UML:
Best Practices & Lessons Learned of a Working Ontologist
Elisa Kendall
Thematix Partners LLC
OMG Technical Meeting, 19 March, 2013
+ Content management
Mars was photographed by the Hubble Space Telescope in August 2003 as
the planet passed closer to Earth than it had in nearly 60,000 years.
Image Credit: NASA, J. Bell (Cornell U.) and M. Wolff (SSI)
2
A sunset on Mars creates a glow due to the
presence of tiny dust particles in the
atmosphere. This photo is a combination of
four images taken by Mars Pathfinder, which
landed on Mars in 1997. Image credit:
NASA/JPL
Recent images from instruments on board the
Mars Reconnaissance Orbiter take much more
detailed, narrower views of specific features of the
Martian surface. Image credit: NASA/JPL
The Planetary Data Store (PDS) is a distributed repository of 40+ years’ imagery & data
taken by a range of instruments on many diverse missions, available for scientific
research.
Copyright © 2013 Thematix Partners LLC
2
+ Smart search
3
Provenance/sources for tracking family members in the 19th
century include early census data (often error prone), military
records, passenger & immigration lists, online documents (e.g.,
county histories, church histories, etc.)
• Historical/forensic research requires cross-domain search of a wide variety of resources
within a given geo-spatial/temporal context
• Similar capabilities are essential for business intelligence, law enforcement, government
applications – all require terminology reconciliation
Copyright © 2013 Thematix Partners LLC
3
+ Merchandising
4
“ComfortOriented”
Flight
Copyright © 2013 Thematix Partners LLC
Attribute 1
Attribute 2
Leg Room
Attribute 4
Attribute 5
Attribute 5
On-Off Access
Attribute 7
Attribute 8
Attribute 9
Media Options
Attribute 11
Attribute 12
Meal Options
Attribute 14
Attribute 15
Time of Day
Attribute 17
Attribute 18
Distance to Gate
Attribute 20
Attribute 21
Flight Characteristics
Attribute N
…
4
+ Search Engine Optimization
5

Use search engines as a front door to transaction

Dramatically increase relevance by inclusion of more
meaningful, synonymic tags

Enrich search results, adding ratings, video, prices,
avails, amenities, locality etc.
Copyright © 2013 Thematix Partners LLC
5
+ Tutorial Outline
6

Introduction to Knowledge Representation

OWL Basics & Using UML /ODM for Modeling OWL
Ontologies

Putting it all together
Copyright © 2013 Thematix Partners LLC
6
+ Part 1: Introduction to Knowledge Representation
7

A little background and a few definitions

Layers of abstraction & conceptual modeling

Classifying ontologies

A little methodology
Copyright © 2013 Thematix Partners LLC
7
+ Historical Context

Knowledge Representation

Cross-disciplinary field with historical roots in philosophy, linguistics,
computer science, and cognitive science

Goal is to represent the meaning of knowledge unambiguously, so that it can
be understood, shared, and used by computational agents acting on behalf of
people to accomplish some task

Plato and Aristotle at the School of
Athens, by Raphael
Copyright © 2013 Thematix Partners LLC
Philosophical origins

Socrates questioning, Plato’s studies of epistemology –
the nature of knowledge

Aristotle’s shift to terminology, development of logic as
a precise method for reasoning about knowledge

Arguments for the existence of God dating back to
Anselm of Canterbury

Medieval theories of reference and of mental language,
Scholastic logic
8
8
+ Historical Context

“Brain Cells for Grandmother”






Neuroscientists continue to debate today how we store
memories
One theory, documented as recently as this month in a
Scientific American article, suggests that single neurons
hold our memories, as concept cells
“Each concept – each person or thing in everyday
experience – may have a set of corresponding neurons
assigned to it”
 Rodrigo Quian Quiroga, Itzhak Fried and Christof
Koch, February 2013 issue, Scientific American
“…a relatively few neurons, numbering in the thousands
or perhaps even less, constitute a “sparse”
representation of an image.”
“ Our brain may use a small number of concept cells to
represent many instances of one thing as a unique
concept – a sparse and invariant representation … What
is important is to grasp the gist of particular situations
involving persons and concepts that are relevant to us,
rather than remembering an overwhelming myriad of
meaningless detail.”
“The full recollection of a single memory episode
requires links between different but associated concepts
… If two concepts are related, some of the neurons
encoding one concept may also fire to the other one.”
Copyright © 2013 Thematix Partners LLC
9
Visual perception - Neural coding
from the University of Leicester,
Department of BioEngineering
9
+ Definitions

An ontology is a specification of a conceptualization.
– Tom Gruber

Knowledge engineering is the application of logic and ontology to the task of
building computable models of some domain for some purpose. – John Sowa

Artificial Intelligence can be viewed as the study of intelligent behavior
achieved through computational means. Knowledge Representation then is
the part of AI that is concerned with how an agent uses what it knows in
deciding what to do. – Brachman and Levesque, KR&R

Knowledge representation means that knowledge is formalized in a symbolic
form, that is, to find a symbolic expression that can be interpreted. – Klein and
Methlie

The task of classifying all the words of language, or what's the same thing, all
the ideas that seek expression, is the most stupendous of logical tasks.
Anybody but the most accomplished logician must break down in it utterly;
and even for the strongest man, it is the severest possible tax on the logical
equipment and faculty. – Charles Sanders Peirce, letter to editor B. E. Smith of
the Century Dictionary
Copyright © 2013 Thematix Partners LLC
10 10
+ What is an Ontology?
An ontology specifies a rich description of the

Terminology, concepts, nomenclature

Properties explicitly defining concepts

Relations among concepts (hierarchical and lattice)

Rules distinguishing concepts, refining definitions and relations (constraints,
restrictions, regular expressions)
relevant to a particular domain or area of interest.
Copyright © 2013 Thematix Partners LLC
11 11
+ Logic and Ontology

Predicate logic is harder to read than the original English, but is
more precise:
12 12
Every semi-trailer truck has at least 3 axles.
(∀ x)(((SemiTrailerTruck(x) ∧ (∃ y)(SemiTrailer(y) ∧ (hasPart (x,y))) ∧
(SemiTrailerTruck(x) ∧ (∃ z)(TractorUnit(z) ∧ (hasPart (x,z))))
⊃ (∃ s)(set(s) ∧ (count(s,(≥3))
∧ (∀ w)(member(w,s) ⊃ (Axle(w) ∧ hasPart(x,w))) )).

Logic is a simple language with few basic symbols.

The level of detail depends on the choice of predicates – these
predicates represent an ontology of the relevant concepts in the
domain.

Different choices of predicates represent different ontological
commitments.
* Derived from Knowledge Representation: Logical, Philosophical, and Computational Foundations,
John F. Sowa, Brooks/Cole, Pacific Grove, CA, 2000.
Copyright © 2013 Thematix Partners LLC
+ Ontology-based Technologies

Ontologies provide a common vocabulary for use by independently
developed resources, processes, services

Agreements among organizations sharing common services can be
made with regard to their usage; the meaning of relevant concepts can
be expressed unambiguously

By composing / mapping ontologies and mediating terminology across
participating events, resources and services, independently-developed
services can work together to share information and processes
consistently, accurately, and completely

Ontologies also ensure

Valid conversations among agents to collect, process, fuse, and
exchange information

Accurate searching by ensuring context using concept definitions
and relations instead of/in addition to statistical relevance of
keywords
Copyright © 2013 Thematix Partners LLC
13 13
+ Features of KR languages

Vocabulary – a collection of symbols







Domain-independent logical symbols (e.g., ∀ or ⊃)
Domain-dependent constants, identifying individuals, properties, or
relations in the application domain or universe of discourse
Variables, whose range is governed by quantifiers
Punctuation that separates or groups other symbols
Syntax –

formation rules that determine how symbols can be combined in wellformed expressions

rules may be stated in a linear grammar, graph grammar, or independent
abstract syntax
Semantics –

a theory of reference that determines how the constants and variables
are associated with things in the universe of discourse

a theory of truth that distinguishes true statements from false statements
Rules of Inference –

rules that determine how one pattern can be inferred from another

if the logic is sound, the rules of inference must preserve truth as
determined by the semantics
Copyright © 2013 Thematix Partners LLC
14 14
+ Logic
Logics vary from classical FOL along six dimensions:






Syntax
Subsets limit permissible operators or combinations, e.g., propositional logic
(without quantifiers), Horn-clause (excludes disjunctions in conclusions, such as
Prolog), terminological or definitional logics (containing additional restrictions,
e.g., description logics)
Proof Theory restricts or extends permissible proofs:
•
Intuitionistic logic and relevance logic rule out certain extraneous information
•
Non-monotonic logics allow introduction of default assumptions
•
Access-limited logic restricts the number of times a proposition can be used in a proof;
Linear logic allows a proposition to be used only once
•
Modal logic incorporates modal auxiliaries (◊p means p is possibly true; 
p means p is
necessarily true), temporal logic extends model logic to include always, sometimes
•
Intensional logics express concepts such as need, ought, hope, fear, wish, believe, know,
expect, and intend
Model Theory modifies the truth value of statements in terms of some model of the
world: classical FOL is two-valued; a three-valued logic introduces unknowns;
fuzzy logic uses the same notation as FOL but with an infinite range of certainty
factors (1.0 to 0.0)
Ontology – frameworks may include support for built-in components, such as set
theory or time
Meta-language – language for encoding information about objects
Copyright © 2013 Thematix Partners LLC
15 15
+ Description logic

A family of logic-based Knowledge Representation formalisms



Distinguished by



Descendants of semantic networks and KL-ONE
Describe domain in terms of concepts (classes), roles (relationships), and
individuals (instances)
Formal semantics
• Decidable fragments of FOL
• Closely related to Propositional, Modal, and Dynamic Logics
Provision of inference services
• Sound and complete decision procedures for key problems
• Implemented systems (highly optimized)
Applications include

Configuration – product configurators, consistency checking, constraint
propagation, first significant industrial application (e.g., CLASSIC)

Ontologies – ontology engineering (design, maintenance, integration),
reasoning with ontology-based mark-up, service description and discovery

Databases – consistency of conceptual schemata, schema integration, query
subsumption (w.r.t. conceptual schemata)
Copyright © 2013 Thematix Partners LLC
16 16
+ Knowledge bases, databases, & ontology

An ontology is a conceptual model of some aspect of a particular
universe of discourse (or of a domain of discourse)

Typically, ontologies contain only “rarified” or “special” individuals,
metadata, representing elemental concepts critical to the domain

A knowledge base is a persistent repository for

Ontology & metadata representing individuals, facts, & rules about how they
can be combined or relate to one another

Metadata, facts & rules only – in some applications and frameworks the
ontology is separately maintained

Most inference engines require in-memory deductive databases for
efficient reasoning (including commercially available reasoners)

A knowledge base may be implemented in a physical, external
database, such as a relational database, but reasoning is typically
done on a subset (partition) of that knowledge base in memory
Copyright © 2013 Thematix Partners LLC
17 17
+ Reasoning & Truth Maintenance
18 18

Reasoning is the mechanism by which the assertions made in an
ontology and related knowledge base are evaluated by an inference
engine.

In classical logic, the validity of a particular conclusion is retained
even if new information is received.

This may change if some of the preconditions are actually
hypothetical assumptions invalidated by the new information.

The same idea applies for arbitrary actions – new information can
make preconditions invalid.

Generally, there are two issues that a reasoner must address:



If some conclusion is invalid, which other conclusions are also invalid?
If some action cannot be performed, which others are at risk?
The “housekeeping” associated with tracking the threads that support
answering these questions is called truth maintenance.
Copyright © 2013 Thematix Partners LLC
+ Negation

If all new information is “positive”, then all prior conclusions should
remain valid.

Problems arise if new information negates a prior assumption,
causing it to be withdrawn.

What does it mean to negate (withdraw) an assumption?




Conclusive information is not available?
The assumption cannot be proven?
The assumption is not provable using certain methods?
The assumption is not provable given a fixed quantity of time?

The answers to these questions can result in different definitions of
negation and differing interpretations by non-monotonic reasoners.

Solutions include chronological and “intelligent” backtracking
algorithms, heuristics, circumscription algorithms, justification or
assumption based retraction, and so forth, depending on the reasoner
and methods used for truth maintenance.

Reasoning efficiency is dependent, in part, on the algorithms applied
for truth maintenance.
Copyright © 2013 Thematix Partners LLC
19 19
+ Explanations and Proofs

When a reasoner draws a particular conclusion, many users and
applications want to understand why?

Primary motivations include interoperability, reuse, and trust

Understanding the provenance of the information and results is
crucial, especially when web-based information is involved


What information sources were used (source)

How recently they were updated (currency)

How reliable these sources are (authoritativeness)

Was the information directly available or derived, and if derived, how
(method of reasoning)
Methods used to explain why a reasoner reached a particular
conclusion include explanation generation and proof specification
Copyright © 2013 Thematix Partners LLC
20 20
+ Analysis approaches

Definitions range from high-level mind
mapping and brainstorming … to detailed
collaboration, dialog, and information
modeling to support knowledge sharing

Tools are equally diverse, from
inexpensive brainstorming tools to
sophisticated ontology and software model
development environments

Common capabilities include

“drawing a picture” that includes concepts
and relationships between them

producing sharable artifacts, that vary
depending on the tool – often including web
sharable drawings
Copyright © 2013 Thematix Partners LLC
21 21
+ Abstraction Layers
Contextual
• Identify subject areas
Ontology
Conceptual
Logical
Physical
Definition
Instance
• Define the meaning of things in the organization
• Describe the logical representation of properties
Ontology, Conceptual ER, Conceptual
Business Process …
• Describe the physical means by which data is stored
ER, Relational, XML Schema
• Represent the coding language on a specific development platform
XML, source code, scripting languages,
stored procedures…
• Hold the values of the properties applied to the data in a schema
Physical KBs,
databases, asset repositories…
*Layering diagram courtesy Kenn Hussey

Knowledge Representation / Management for Large Scale Applications
 Provide broad metadata, process, service & asset management facilities (including
feedback/lessons learned…)
 Enable rich cross-domain, cross-process, cross organizational modeling supported by
mapping & transformation services to provide maximum flexibility, interoperability
 Leverage standards and best practices in information architecture, metadata modeling,
management, registration, and governance, and asset management & registration
 Provide incremental reasoning capabilities for model validation, transformation services

Repeatable, reusable, interoperable
Copyright © 2013 Thematix Partners LLC
22 22
+ Hypothetical Conceptual Model “EU-Rent”
23 23
Produced using Embarcadero EA/Studio Business Modeler Edition, courtesy Kenn Hussey
Copyright © 2013 Thematix Partners LLC
+ Hypothetical Logical Model “EU-Rent”
24 24
Produced using Embarcadero ER/Studio, courtesy Kenn Hussey
Copyright © 2013 Thematix Partners LLC
+ Hypothetical Physical Model “EU-Rent”
25 25
Produced using Embarcadero ER/Studio, courtesy Kenn Hussey
Copyright © 2013 Thematix Partners LLC
+ Classifying ontologies
26 26
Classification techniques are as diverse
as conceptual models; and generally
include understanding
Level of Expressivity

Level of Complexity / Structure

Granularity

Target Usage, Relevance

Amount of Automation, Reasoning Requirements

Prescriptive vs. Descriptive / Reliability / Level
of Authoritativeness

Design Methodology

Governance

Vocabulary Management, Metrics
KR System
OO Software Model
Entity – Relationship
Model
Concept Map
Topic Map
Level of Expressivity

Database Schema
XML Schema
Hierarchical Taxonomy
Simple Taxonomy
Glossary
Level of Complexity
Copyright © 2013 Thematix Partners LLC
+ Framework of dimensions


27 27
Semantic Dimensions

Expressiveness: represents how well a KR language addresses
increasingly complex semantics

Structure: represents how well an ontology encodes semantics, with
the same or less expressivity than the KR language

Granularity: represents the level of detail specified in an ontology
Pragmatic Dimensions

Intended use: the original use case(es), or purpose for developing a
particular ontology

Automated reasoning: the extent to which the ontology is designed
to be used for automated reasoning

Prescriptive vs. Descriptive: the extent to which an ontology was
intended to be used for descriptive purposes vs. normative
prescriptive use (i.e., with high degree of concern for correctness)
Reference: http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2007
Copyright © 2013 Thematix Partners LLC
+ Considerations

Intended use of ontologies, including domain requirements (e.g.,
scientific and engineering apps require formulas, units of measure,
computations that may be challenging to represent)

Intended use of KRSs that implement them, including reasoning
requirements, questions to be answered

For distributed environments, the number and kinds of resources,
processes, services requiring ontologies – how distributed, how
unique, developed collaboratively or independently, dynamic
community participation or static

What kinds of transformations are required among processes,
resources, services to support semantic mediation

Ontology and KRS alignment / de-confliction / ambiguity resolution
requirements

Ontology and KRS composition requirements, dynamic vs. static
composition, in what environment and under what constraints

Performance, sizing, timing requirements of target environment
Copyright © 2013 Thematix Partners LLC
28 28
+ A little methodology …

Requirements, domain & use case analysis are critical




Need to understand and communicate



Architectural trade-offs, cost & technical benefits
The nature of the information & kinds of questions that need to be
answered drive the architecture, approach, and ontology scoping
and design
Use discipline from formal domain analysis and use case
development in UML to




Develop initial source/reference material
Focus on system or application requirements
Iterative development starting with a “thread” that covers basic
capabilities can ground the work and prioritize decisions
document and explain requirements
identify requisite information sources and ontologies needed
limit scope creep
Reuse standards and well-tested, available ontologies
whenever possible
Copyright © 2013 Thematix Partners LLC
29 29
+ What to look for

A controlled vocabulary
FAA & IATA airport codes, ACRISS car codes, …

A hierarchical or taxonomic structure (for query expansion)
Vehicle, Ground-based Vehicle, Wheeled Vehicle, Powered Vehicle, Automobile,
Sedan…

Knowledge supporting structured queries
Find all available hybrid SUVs or hybrid sedans that can seat four adults within a
reasonable taxi ride of Planet Hollywood, Las Vegas for 3-4 days the week of
June 20th

Efficient inference (i.e., limited expressive power) vs. increased
expressivity (potentially expensive or resource bounded
computation)

Custom reasoning for temporal relations, geospatial, dynamic
algorithm / equation evaluation, process-specific, conditional
operations

Computational tractability
Copyright © 2013 Thematix Partners LLC
30 30
+ Using use cases to gather requirements

A good summary for every use case should include:







A description of the basic business requirement / need the use case is intended to support
Primary goals
Scope – identify any known boundaries as a starting point
Pre-conditions and post conditions – any assumptions you know about the state of the
“system”/world before and after
Actors and interfaces – identify primary actors, information sources, interfaces to existing or
new systems
Triggers – what kicks off the use case, any particular series of events, situation, activity, etc., and
any that affect the flow
Performance requirements – including any sizing or timing constraints, “ilities” , etc.

Outline the major process steps for both the “normal”, or primary scenario, and alternative
flows, such as if things don’t go well

Use case and activity diagrams – typically done in UML, but could be visio, power point, or
whatever tools your team is comfortable with

Usage scenarios – you should have at least two narrative “stories” that describe how one of
the main actors would experience the use case, with the intent of identifying additional
requirements

Competency Questions – identify as many of the questions you want the ontology /
knowledge base to answer as possible

Resources – describe any known contributing knowledge bases, other external resources
that will be participating in the use case to the degree possible
Copyright © 2013 Thematix Partners LLC
31 31
+ Start with canonical definitions

General concepts as well as domain-specific knowledge

Basic starting point – cross-domain definitions



Namespace definitions, metadata, naming conventions, governance policies
Commonly used structures & vocabularies, such as domain-specific
vocabulary & messaging standards, international country & language codes
(ISO), national postal addressing or other government standards, industry
best practices
Common metadata for ontology & schema management (e.g., Dublin Core,
for documents & models, ISO 1087 for synonyms & similar relations, MIME
media types for images & multimedia, etc.)

Domain vocabularies must be prioritized, selected based on business
requirements, clear ROI

Common early targets include





Smart search (pull); customer experience & cross-sell / upsell (push)
Richer interoperability among trading partners
Service registration, description, discovery & management
Asset/artifact repository search & retrieval
Automated verification
Copyright © 2013 Thematix Partners LLC
32 32
+ Capturing definitions



Layout a high-level architecture for key ontologies and ontology elements
Identify the relationships among elements – roles, domain, interface,
process, utility
Define an approach for gathering content from subject matter experts,
possibly based on IDEF5 (Integrated Definition Methods) Ontology
Capture Method Analysis, that includes




For each ontology element






Understanding and documenting source materials
An interview template
Traceability back to your use cases
Describe its domain and scope, how it will be used
Identify example questions and anticipated/sample answers for the application(s)
it will support
Identify key stakeholders, ownership, maintenance, resources for instance
knowledge
Describe anticipated reuse/evolution path
Identify critical standards, resources that it must interoperate with, dependencies
Resources


http://www.idef.com/IDEF5/html
http://www.kbsi.com/technology/methods/sbont.htm
Copyright © 2013 Thematix Partners LLC
33 33
+ Terminology analysis





ISO 704, Principles & Methods for Terminology Work,
provides a methodology for describing concepts & terms

Uses ISO 1087 for terminology

Uses ISO 860 for terminology “harmonization” (alignment)
methods

Basis for typical methods used for taxonomy development today
Describes how to flesh out definitions
Recommendations strategies for relating terms to one
another using standard vocabulary
ISO 1087 – great resource for language to describe kinds of
relationships, acronyms & other designations, preferred vs.
deprecated terms, etc.
ISO 860 augments this with recommendations for vocabulary
comparison
Copyright © 2013 Thematix Partners LLC
34 34
+ Modularity


Guidance on modularity is limited

When is any model “too big”?

Can individual parts evolve independently, and if so, shouldn’t that dictate
module boundaries?

Can individual parts be used independently, and if so, shouldn’t that also
dictate module boundaries?

For ontologies, particularly OWL ontologies, individuals should be
managed separately from “schema” to facilitate reasoning during
development (i.e., if you add an individual that creates a logical
inconsistency, most reasoners won’t load the ontology at all)
Current practices in the semantic web community address
modularity either statically or dynamically

Static approaches include adherence to “DL-Safe” rules and suggestions
in papers by Alan Rector (University of Manchester)

Dynamic approaches include use of tools that can assist in determining
whether or not an ontology is appropriately modularized, using reasoning
to perform rewriting with respect to soundness and completeness for a
given ontology
Copyright © 2013 Thematix Partners LLC
35 35
+ Separation of concerns
36 36

Critical dimensions aid in determining module
boundaries –




separate business-related content from technical detail
aspects of the business content, such as branding, from others
describing distribution rights
separate disciplines into independent modules
Other considerations include separation based on






back-end store / source repositories
application boundaries, system interfaces
distributed resources
the need to reason over some parts of the knowledge base
but not others to answer sets of critical questions
performance requirements, for reasoning, query answering,
etc.
asserted vs. inferred content
Copyright © 2013 Thematix Partners LLC
+ Key methodology issues

Naming conventions and versioning policies are critical for –every–
organization






Namespace definitions for ontologies are critical, along with policies for their
management that are well understood
http://<authority>/<subdomain>/<topic>/<date, in YYYYMMDD
form>/filename.extension is common practice in some organizations, with
use of GRDDL to support content negotiation to the latest version
Levels of hierarchy may be added for large organizations or modularized
ontologies
Use “id.name.ext” or “ontology.name.ext” vs. “www.name.ext,” for the
authority component of the URL is increasing in some communities
Namespace prefixes (abbreviations) for individual modules, especially where
there are multiple modules, can be important
For model elements –


These vary by “content community” – data modelers often use underscores at
word boundaries, spaces in names – which semantic web tools may not handle
well, semantic web practitioners use camel case
Guidelines should be provided to development teams for naming classes or
class-like elements (SBVR concepts), properties or property-like elements
(SBVR roles, for example), individuals (objects in UML) so that conceptual
models developed in UML or a UML profile can be exported consistently for
reuse
Copyright © 2013 Thematix Partners LLC
37 37
+ Best practices in namespace development

Availability – people should be able to retrieve a description about the
resource identified by the URI from the network (albeit internally)

Understandability – there should be no confusion between identifiers for
networked documents and identifiers for other resources



URIs should be unambiguous
URIs are meant to identify only one of them, so one URI can't stand for both a
networked document and a real-world object
Separation of concerns in modeling “subjects” or “topics” and the objects in the
real world they characterize is critical, and has serious implications for designing
and reasoning about resources

Simplicity – short, mnemonic URIs will not break as easily when shared for
collaborative purposes, and are typically easier to remember

Persistence – once a URI has been established to identify a particular
resource, it should be stable and persist as long as possible


Exclude references to implementation strategies, as technologies change over
time (e.g., do not use ‘.php’ or ‘.asp’ as part of the URI scheme), and organization
lifetime may be significantly shorter than that of the resource
Manageability – given that URIs are intended to persist, administration
issues should be limited to the degree possible


Some strategies include inserting the current year or date in the path so that URI
schemes can evolve over time without breaking older URIs
Create an internal organization responsible for issuing and managing URIs, and
corresponding namespace prefixes
Copyright © 2013 Thematix Partners LLC
38 38
+ Best practices in publishing vocabularies

Provide readable documentation, about the vocabulary or model,
including the terms, their definitions, proper usage patterns and
scenarios, and clear examples

Articulate your maintenance policies, so that those depending on the
information model can understand how stable it is, how to provide
feedback to its stewards, and so forth

Identify versions and allocate URIs to the unique information models so
that they can be referenced


Every version of a defined or imported XML schema module, information
model, document, etc., aside from locally defined, internal modules, must have
a unique namespace.
This allows us to align schema versioning with namespace versioning, as
appropriate for some applications.

Publish the formal schema (or ontology)

Provide a means for your community of users to provide feedback
Copyright © 2013 Thematix Partners LLC
39 39
+ Model element naming

For naming entities in an ontology – there are some rules of
thumb that vary by community of practice




data modelers often use underscores at word boundaries, spaces
in names – which semantic web tools may not handle well (ok in
labels)
some name properties <domain><predicate><range>, others
<predicate><range>, & others <predicate>, but not necessarily
consistently
semantic web practitioners typically use camel case; property
naming varies by domain & community of practice
for very large ontologies, some use unique identifiers to name
concepts and properties, with human readable names in labels
only – depends on tooling that helps people interpret the content

The key is to establish & consistently use guidelines tailored to your
organization

Guidelines for classes/concepts (e.g., OWL classes, CL names, SBVR
concepts), properties (e.g., OWL properties, CL relations, functions,
SBVR roles), individuals (objects in UML) – so that conceptual
models developed in one tool can be exported consistently to other
tools & formats essential
Copyright © 2013 Thematix Partners LLC
40 40
+ Metadata for ontologies & elements

Metadata should be standardized at both the model level and the
element level for every model (not just ontologies)

Model level metadata can reuse properties from the Dublin Core Metadata
Terms, from the Simple Knowledge Organization System (SKOS), ISO 11179
Metadata Registry standard with ISO 1087 Terminology support, emerging
W3C work on provenance and others

Most annotations should be optional at the element level, but a minimal set,
including names, labels, and formal, text definitions, is important for reusability
& collaboration

Consistent use of the same annotations (properties, tags) improves readability,
facilitates automated documentation generation, and enables better search
over ontology repositories

Model level metadata may reuse organization-specific taxonomies to enable
better search through RDFa tagging, for example

Latest version of an OMG architecture board recommended vocabulary for
specification metadata for use in OMG standards is available to members at
http://www.omg.org/cgi-bin/doc?ab/2013-02-02

Specifications up for review at the OMG March Technical Meeting, in Reston,
including the Information Exchange Framework (IEF) Packaging Policy
Vocabulary (IEPPV) and Financial Industry Business Ontology (FIBO), use this
Copyright © 2013 Thematix Partners LLC
41 41
+ Change management

Change management and traceability is not well defined at the
element level for ontologies, still a research topic

Approach to element-level versioning to support reasoning is to track two
parallel streams: one for additions to the ontology, one for retractions; often
managed as two separate files

MOF versioning suggests linking to a workspace; use of a default workspace
would get the latest version, and tools such as MagicDraw© can provide an
indication of the differences

MOF versioning does not support reasoning about the differences so that
users can determine the impact of applying the changes, which is frequently
required in the semantic web community –


addition or deletion of axioms can change downstream reasoning results, especially
where there are complex dependencies
Mechanisms that preserve the versioning detail in generated artifacts, such
as RDF/XML serialized OWL, are essential

Consider the use cases/justification for whatever level of versioning detail is
needed on a project by project basis

Balance with usability/performance
Copyright © 2013 Thematix Partners LLC
42 42
+
Part 2: OWL Basics

A quick intro to RDF and RDF Schema

Basic constructs: descriptions & classes

Class expressions

Properties & restricting property values

Individuals & data ranges

Miscellaneous class axioms

New features of OWL 2

A few rules of thumb



Depth vs. breadth, naming, synonymous terms
Class vs. property value, class vs. individual
Limiting scope

Motivation for using UML

Ontology development using the Ontology Definition Metamodel (ODM)

Relationships to other OMG & ISO Standards
Copyright © 2013 Thematix Partners LLC
43 43
+ Semantic Web
"The Semantic Web is an extension of the current web in which
information is given well-defined meaning, better enabling computers
and people to work in cooperation."
-- Tim Berners-Lee
Copyright © 2013 Thematix Partners LLC
44 44
+ Guiding Principles
45 45

Historically, knowledge representation and reasoning systems have
operated under closed world assumptions

Uncertainty is magnified under open-world, “wild, wild web” conditions,
making reasoning much more difficult

Semantic web languages are designed to support less certainty, to
provide “better” search results, informed answers to questions, not
absolutes

Because they are based on XML, such languages can assist businesses in
leveraging existing investment in mark-up, content, and data




To augment business intelligence/analysis and knowledge mining
To support knowledge sharing and collaboration, augment enterprise
information integration
Enrich web services and other applications
Support policy-based applications and ensure compliance with policy
at a lower cost with higher potential ROI than traditional computing
methods
Copyright © 2013 Thematix Partners LLC
+ Resource Description Framework (RDF)

Describes relationships

Uses URIs used for naming

Language has




Specification, W3C presentations, tools are available at




graph based model
RDF/XML serialization (exchange syntax)
other presentation syntaxes (N3, Turtle, …)
Semantic Web: http://www.w3.org/standards/semanticweb/
Linked Data: http://www.w3.org/standards/semanticweb/data
RDF: http://www.w3.org/standards/techs/rdf#w3c_all
RDF “next steps” revision working group is making specific enhancements,
“fixes” to the standard

RDF Working Group: http://www.w3.org/2011/rdf-wg/wiki/Main_Page

Candidate Recommendation for a Turtle (Terse RDF Triple Language)
specification – published 19 Feb 2013, at http://www.w3.org/TR/2013/CRturtle-20130219/

Current Working Draft for RDF 1.1 Concepts and Abstract Syntax –
published 15 Jan 2013, at http://www.w3.org/TR/2013/WD-rdf11-concepts20130115/
Copyright © 2013 Thematix Partners LLC
46 46
+ RDF Notation Options
47 47
Subject
http://www.omg.org/news/meetings/tc/dc-13/TechnicalMeeting#Conference
Graph:
Object
XML/RDF:
N3:
Predicate
http://www.omg.org/news/meetings/tc/dc-13/TechnicalMeeting#heldAt
http://www.omg.org/news/meetings/tc/dc-13/TechnicalMeeting#HyattRegencyReston
<rdf:Description rdf:ID=“Conference"/>
<omg-tm:heldAt rdf:resource="#HyattRegencyReston"/>
</rdf:Description>
omg-tm:Conference omg-tm:heldAt omg-tm:HyattRegencyReston .
Copyright © 2013 Thematix Partners LLC
+ RDF Schema (RDFS)

An RDF vocabulary that provides for identifying:

classes,

subsumption (inheritance) relations for classes,

subsumption (inheritance) relations for properties,

domain and range for properties
Copyright © 2013 Thematix Partners LLC
48 48
+ RDF Schema (RDFS)
49 49
Graph:
omg-tm:Hotel
rdf:type
rdfs:Class
rdf:type
rdfs:subClassOf
omg-tm:ConferenceHotel
rdf:type
omg-tm:HyattRegencyReston
XML/RDF:
<rdf:Description rdf:ID="Hotel">
<rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
</rdf:Description>
<rdfs:Class rdf:ID=“ConferenceHotel">
<rdfs:subClassOf rdf:resource="#Hotel"/>
</rdfs:Class>
<omg-tm:ConferenceHotel rdf:ID=“HyattRegencyReston“/>
Copyright © 2013 Thematix Partners LLC
+ The Web Ontology Language (OWL)

Two languages emerged in parallel to address semantic web
requirements


DAML-ONT, supported by the DARPA/DAML program
OIL (Ontology Inference Layer) developed by EU & US researchers

Merged DAML+OIL was submitted to the W3C 2002, formed the basis
for the WebOnt Working Group

OWL extends RDF Schema


Has an RDFS based syntax and reuses some RDF vocabulary (e.g., subClassOf,
domain, range)
Adds rich primitives and redefines others (transitivity, inverse, cardinality
constraints, complex class definitions)

Describes the structure of a domain in terms of classes and properties

Uses RDFS for class/property membership assertions (ground facts),
XML Schema Datatypes

OWL specifications became W3C recommendations in February 2004

OWL 2 specifications was adopted in October 2009, with minor
revisions in December 2012
Copyright © 2013 Thematix Partners LLC
50 50
+ Describing classes

A class is a concept in the domain



Vintage – a wine made from grapes grown in a specified year
A set of characteristics (flavor, body, color, sugar…)
A class is a collection of elements with similar properties

White wine – wines made from white grapes

White table wine – wines made from white grapes that are not
appellations or regional (not “quality wine” in the EU)

A class expression defines necessary conditions for
membership (specific varietal, field, micro-climate, date picked,
date bottled)

Instances of classes



Daniel Gehrs 2009 Merlot
Robert M. Parker, Jr,’s Wine Advocate review, dated February 28, 2002
Los Olivos, Santa Barbara County, California, USA
Copyright © 2013 Thematix Partners LLC
51 51
+ Class inheritance


Classes are organized into subclass-superclass (or
generalization-specialization) hierarchies
52 52
True subclass relationships are the basis of a formal is-a
hierarchy



Classes are “is-a” related if an instance of the subclass is an instance of
the superclass
is-a hierarchies are essential for classification
Violation of true is-a hierarchical relationships can have unintended
consequences & and cause errors in reasoning

Class expressions should be viewed as sets, and subclasses as a
subset of the superclass, in contrast with a collection of attributes
as classes are sometimes specified in object oriented
programming

Examples



FloweringPlantType is a subclass of PlantType
Every flowering plant is a plant; every instance of a flowering plant (e.g.,
Azalea indica 'Alaska' (Rutherfordiana hybrid) is an instance of a
flowering plant
Azalea is a subclass of FloweringPlantType, Rhododendron
Monrovia is a company that breeds, grows, and sells flowering plants
Copyright © 2013 Thematix Partners LLC
* courtesy Monrovia web site
+ Class expressions
53 53

6 primary kinds of class expressions in OWL

5 anonymous

Definitions can be nested using set-theoretic constructs
Copyright © 2013 Thematix Partners LLC
+ Example class hierarchy with other expressions
54 54


Example from ISO 1087 combining subclass-superclass and union
expressions
Provides a more accurate representation of the relationships
defined in the ISO standard than a strict is-a hierarchy would
Copyright © 2013 Thematix Partners LLC
+ Example class hierarchy with other expressions
55 55
Copyright © 2013 Thematix Partners LLC
+ Modeling considerations for class hierarchy
development

Practitioners tend to start




Top-down - define the most general concepts first and then
specialize them
Bottom-up - define the most specific concepts and then organize
them into more general classes
Combination (typical – breadth at the top level and depth along a
few branches to test design)
Class inheritance is transitive




A is a subclass of B (white wine, dessert wine are subclasses of
wine)
B is a subclass of C (sauvignon blanc is a subclass of white wine,
late harvest wine is a subclass of dessert wine)
therefore A is a subclass of C (late harvest sauvignon blanc is a
subclass of white wine, dessert wine, & wine)
Start with a more formal is-a approach, tease out whether other
expressions are more appropriate based on use cases, formal
definitions when available, etc.
Copyright © 2013 Thematix Partners LLC
56 56
+ Class axioms

57 57
Subsumption (necessary)

A ⊆ B where B
is a class description
B
partial or primitive class
A

Definition (necessary and sufficient)

C ≡ D where D
is a class description
D
complete or defined class
C
Courtesy Evan Wallace, NIST
Copyright © 2013 Thematix Partners LLC
+ Disjoint classes
A
58 58
B

Classes are disjoint if they cannot have common instances

Disjoint classes cannot have any common subclasses

If winery and wine are disjoint, then there is no instance that is
both a winery and a wine; there is no class that is both a
subclass of winery and a subclass of wine

Disjointness is often used to aid consistency checking

Disjointness is also helpful in teasing out subtle distinctions
among classes across multiple ontologies

Equivalence is also often used to identify the same concepts
across ontologies that may be named differently, or to name
classes defined through class axioms
Copyright © 2013 Thematix Partners LLC
+ Properties

Properties describe characteristics, features, or attributes of the
members of a class
Every flowering plant has a bloom color, color pattern, flower form, petal
form, etc.

Classes of properties





intrinsic: properties of the plant itself such as the optimal sunlight, soil
conditions, expected height, Sunset climate zone, and so forth
extrinsic: properties imposed externally such as the grower and price
whole-part relations
geospatial, mereonomic relations: what microclimates are this plant well
suited to, where in a particular historic garden will you find it
Data and object properties



simple attributes typically have primitive data values (e.g., strings,
numbers)
complex properties refer to other entities (e.g., an individual grower,
such as Monrovia, or organization such as the American Orchid Society)
OWL reasoning requires strict separation of data and object properties
Copyright © 2013 Thematix Partners LLC
59 59
+ Properties in OWL 2
60 60
Copyright © 2013 Thematix Partners LLC
+ Domain & range properties

In OWL and many other KR languages, relations (properties) are
strictly binary

The domain & range represent the source & target arguments,
respectively, for the property

Domain – the class (or classes) that may have the property − Wine is
the domain of the property hasWineColor

Range – the class (or classes) defining valid property values −
everything that fulfills the hasWineColor property is an instance of the
enumerated class {red, white, rose}

Some KR languages that inherently support n-ary relations, such as
CL, do not make this distinction

More flexible, intuitively more like mathematics, where functions have
ranges (or return types) but not all relations are functions

Requires additional relations to specify argument order, which can be
critical for ontology alignment
Copyright © 2013 Thematix Partners LLC
61 61
+
Meta-properties – global cardinality restrictions
62 62
Range

Functional
Domain
Range

Inverse Functional
Courtesy Evan Wallace, NIST
Domain
Copyright © 2013 Thematix Partners LLC
+ Class & property inheritance

A subclass inherits all the properties from its super class
If a wine has a name and flavor, a red wine also has a name and flavor

If a class has multiple super classes, it inherits properties and
restrictions from all of them
Latin is both an individual language and an ancient language. It inherits “has
attested literature: true” from ancient language and “has indigenous
name: lingua latina” from individual language
Copyright © 2013 Thematix Partners LLC
63 63
+ Class expressions that restrict property values


Number restrictions describe or limit the number of possible
values a particular property can have

A language must be associated with at least one English name and at
least one French name

A language may be associated with zero or more Indigenous names
Cardinality – similar meaning to classical set theory, measures
the number of elements in the set (restriction class)

Cardinality – cardinality N means the class defined by the property
restriction must have exactly N values (individual or literal values)

Minimum cardinality - 1 means that there must be at least one value
(required), 0 means that the value is optional

Maximum cardinality - 1 means that there can be at most one value
(single-valued), N means that there can be up to N values (N > 1,
multi-valued)
Copyright © 2013 Thematix Partners LLC
64 64
+ Simple number restriction examples
<owl:ObjectProperty rdf:about="&re;hasColor">
<rdfs:range rdf:resource="&re;Color"/>
<rdfs:domain rdf:resource="&re;ColoredThing"/>
</owl:ObjectProperty>
65 65
<owl:Class rdf:about="&re;Color"/>
<owl:Class rdf:about="&re;ColoredThing"/>
<owl:Class rdf:about="&re;MultiColoredThing">
<owl:equivalentClass>
<owl:Restriction>
<owl:onProperty rdf:resource="&re;hasColor"/>
<owl:minCardinality
rdf:datatype="&xsd;nonNegativeInteger">2</owl:minCardinality>
</owl:Restriction>
</owl:equivalentClass>
</owl:Class>
<owl:Class rdf:about="&re;SingleColoredThing">
<owl:equivalentClass>
<owl:Restriction>
<owl:onProperty rdf:resource="&re;hasColor"/>
<owl:cardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:cardinality>
</owl:Restriction>
</owl:equivalentClass>
</owl:Class>

Creating a class of individuals that have exactly one color & a class of
individuals that have more than one color

Note that the relationship between SingleColoredThing and the
restriction class is modeled as an equivalence relationship, meaning that
membership in the restriction class is a necessary and sufficient
condition for being a SingleColoredThing
Copyright © 2013 Thematix Partners LLC
+ Qualified cardinality number restriction example
66 66

OWL 2 provides the capability to further qualify cardinality by specific
classes and data ranges

Patterns that reuse a smaller number of significant properties, building
up more complex class expressions that limit the set of valid values are
common, useful in complex classification systems
Copyright © 2013 Thematix Partners LLC
+ Qualified cardinality number restriction example
<owl:ObjectProperty rdf:about="&re;hasCharacteristic">
<rdfs:range rdf:resource="&re;Characteristic"/>
</owl:ObjectProperty>
<owl:Class rdf:about="&re;BloomColor">
<rdfs:subClassOf rdf:resource="&re;Characteristic"/>
<rdfs:subClassOf rdf:resource="&re;Color"/>
</owl:Class>
<owl:Class rdf:about="&re;Characteristic"/>
<owl:Class rdf:about="&re;Color"/>
<owl:Class rdf:about="&re;FloweringPlantType">
<rdfs:subClassOf>
<owl:Restriction>
<owl:onProperty rdf:resource="&re;hasCharacteristic"/>
<owl:onClass rdf:resource="&re;BloomColor"/>
<owl:minQualifiedCardinality
rdf:datatype="&xsd;nonNegativeInteger">1</owl:minQualifiedCardinality>
</owl:Restriction>
</rdfs:subClassOf>
</owl:Class>
Copyright © 2013 Thematix Partners LLC
67 67
+ Class expressions that restrict possible values
by type

Universal (allValuesFrom) and existential (someValuesFrom)
quantification
 allValuesFrom is the OWL equivalent for ∀(x), and restricts the possible


values for a property to members of a particular class or data range
someValuesFrom is the OWL equivalent for ∃(x), and restricts the possible
values for a property to at least one member of a particular class or data
range
Specific value (hasValue) restrictions


a single data value (e.g., the color property for a RedWine must be filled
with the value “red”)
an individual member of a class (e.g., Winery is the value restriction on
the hasMaker property on the class Wine)

Enumerations – lists of allowable individuals or data elements

Combinations of the above that include both restrictions and boolean
class expressions (union – logical or, intersection – logical and,
complement – logical not)
Copyright © 2013 Thematix Partners LLC
68 68
+ Class expression restricting all values for a given
property
69 69
* courtesy Azalea Society of America

Azaleas are members of the set of flowering plants whose bloom color
must be either an RHSColor (Royal Horticultural Society) or ASAColor
(Azalea Society of America)
Copyright © 2013 Thematix Partners LLC
+ And the OWL for that …
<owl:Class rdf:about="&re;Azalea">
<rdfs:subClassOf rdf:resource="&re;FloweringPlantType"/>
<rdfs:subClassOf>
<owl:Restriction>
<owl:onProperty rdf:resource="&re;hasBloomColor"/>
<owl:allValuesFrom>
<owl:Class>
<owl:unionOf rdf:parseType="Collection">
<rdf:Description rdf:about="&re;ASAColor"/>
<rdf:Description rdf:about="&re;RHSColor"/>
</owl:unionOf>
</owl:Class>
</owl:allValuesFrom>
</owl:Restriction>
</rdfs:subClassOf>
</owl:Class>
* courtesy Azalea Society of America
Copyright © 2013 Thematix Partners LLC
70 70
+ Another typical pattern combining restrictions &
boolean connectives

Use of equivalence expressions like this, describing
RHSFloweringPlantType as a flowering plant and something whose
bloom colors must be from the set of colors defined by the Royal
Horticultural Society is a common pattern
Copyright © 2013 Thematix Partners LLC
71 71
+ And the corresponding OWL for that …
72 72
<owl:Class rdf:about="&re;RHSFloweringPlantType">
<owl:equivalentClass>
<owl:Class>
<owl:intersectionOf rdf:parseType="Collection">
<rdf:Description
rdf:about="&re;FloweringPlantType"/>
<owl:Restriction>
<owl:onProperty
rdf:resource="&re;hasBloomColor"/>
<owl:allValuesFrom
rdf:resource="&re;RHSColor"/>
</owl:Restriction>
</owl:intersectionOf>
</owl:Class>
</owl:equivalentClass>
</owl:Class>
Copyright © 2013 Thematix Partners LLC
+ Class expression restricting some & exact values
for a given property
Copyright © 2013 Thematix Partners LLC
73 73
+ And the OWL for that …
<owl:Class rdf:about="&re;Azalea">
<rdfs:subClassOf rdf:resource="&re;FloweringPlantType"/>
<rdfs:subClassOf>
<owl:Restriction>
<owl:onProperty rdf:resource="&re;hasBloomColor"/>
<owl:allValuesFrom>
<owl:Class>
<owl:unionOf rdf:parseType="Collection">
<rdf:Description rdf:about="&re;ASAColor"/>
<rdf:Description rdf:about="&re;RHSColor"/>
</owl:unionOf>
</owl:Class>
</owl:allValuesFrom>
</owl:Restriction>
</rdfs:subClassOf>
<rdfs:subClassOf>
<owl:Restriction>
<owl:onProperty rdf:resource="&re;hasBloomColorPattern"/>
<owl:someValuesFrom rdf:resource="&re;ASAColorPattern"/>
</owl:Restriction>
</rdfs:subClassOf>
</owl:Class>
Copyright © 2013 Thematix Partners LLC
74 74
+ ...
<owl:Class rdf:about="&re;SingleColoredAzalea">
<owl:equivalentClass>
<owl:Class>
<owl:intersectionOf rdf:parseType="Collection">
<owl:Restriction>
<owl:onProperty rdf:resource="&re;hasBloomColorPattern"/>
<owl:hasValue rdf:resource="&re;Solid"/>
</owl:Restriction>
<owl:Restriction>
<owl:onProperty rdf:resource="&re;hasBloomColor"/>
<owl:cardinality
rdf:datatype="&xsd;nonNegativeInteger">1</owl:cardinality>
</owl:Restriction>
</owl:intersectionOf>
</owl:Class>
</owl:equivalentClass>
<rdfs:subClassOf rdf:resource="&re;Azalea"/>
</owl:Class>
<owl:NamedIndividual rdf:about="&re;Solid">
<rdf:type rdf:resource="&re;ColorPattern"/>
</owl:NamedIndividual>
Copyright © 2013 Thematix Partners LLC
75 75
+ Individuals and data ranges


An individual (instance, object in other paradigms)

Any class that an individual is a member of, or is an individual of, is a type
of the individual

Any superclass of a class is an ancestor of (or type of) the individual
Specify property values for the individual


Property values should conform to the constraints such as range, value
type, cardinality restrictions, etc.
Enumerated classes & data ranges are commonly used to specify
the complete set of valid values for a particular property
Copyright © 2013 Thematix Partners LLC
76 76
+ Example enumerated class of individuals
77 77
Copyright © 2013 Thematix Partners LLC
+ Complex data ranges & restrictions
78 78
Aggregating datatype restrictions via an intersection to specify the average size of an
Azalea in the landscape.
Copyright © 2013 Thematix Partners LLC
+ New features in OWL 2
79 79


New property constructs

Reflexive, irreflexive, & asymmetric properties

Property chains

Disjoint properties

Keys

Self-reflexive properties

Qualified cardinality restrictions

Negative property assertions
New class axioms

Disjoint unions

Disjoint classes
Copyright © 2013 Thematix Partners LLC & McGuinness Associates
+ More new features


Extended datatype handling

Better support for XML Schema Datatypes

New datatypes for OWL specifically (owl:real, owl:rational, rdf:PlainLiteral)

Custom datatype definition & use of xsd facets / datatype restrictions
New data range restrictions


Intersections, unions, complements
Support for punning

Limited support for a particular element to be defined as both a class &
individual, for example

Improved support for annotations (e.g., metadata about ontologies,
provenance, transformation detail, etc.)

See http://www.w3.org/TR/owl2-new-features/ &
http://www.w3.org/TR/owl2-quick-reference/ for detail

Current work specifically focused on provenance:
http://www.w3.org/2011/prov/wiki/Main_Page
Copyright © 2013 Thematix Partners LLC & McGuinness Associates
80 80
+ Rules of Thumb
81 81

Cycles are common in many KR
systems, though rarely “a good thing”

Cycles are disallowed by some tools
because they prohibit “code
generation”, export -- including
RDF/OWL

Classes A, B, and C have equivalent
sets of instances


By many definitions, A, B, and C are
equivalent
Use owl:equivalentClass instead of
creating cycles
Copyright © 2013 Thematix Partners LLC & McGuinness Associates
+ Class hierarchy analysis
82 82

Siblings should be specified at roughly the same level of generality -compare to section and subsections in a book
Copyright © 2013 Thematix Partners LLC & McGuinness Associates
+ More on class definition analysis

Classes that have single subclasses may be incompletely
defined, unless additional children are specified in
subordinate modules

Subclasses of a class typically have more detailed
definitions


Additional properties

Constraints/restrictions on inherited properties

Participate in further qualified relations

Compare to bullets in a bulleted list
If a class has a large number of subclasses, it may be useful
to define intermediate levels

For wine, for example there are natural groupings around wine
color

If no natural classification exists, the long list may be appropriate
Copyright © 2013 Thematix Partners LLC & McGuinness Associates
83 83
+ Inheritance, naming, synonyms
Class

A “wine” is not a subclass of “wines”

A particular vintage should be classified as an
instance of the class Wines

Class names should be either all singular or all
plural

Synonymous names for the same concept should
be modeled as alternate designations, not as
unique classes (in the same ontology)

Many systems & standards facilitate synonymous
term representation (e.g., ISO 1087)

OWL allows defining necessary and sufficiency
condition definitions thereby allowing synonym
definitions to be “first class” terms (as appropriate,
through equivalence / same as relations)
instance-of
Instance
MariettaOldVinesRed
Copyright © 2013 Thematix Partners LLC & McGuinness Associates
84 84
+ Class vs. property value
85 85

Are concepts with distinct property values used in restrictions on
other properties?

How important is the distinction for the domain?

Class definitions for most domains should be fairly stable – i.e.,
they should not change frequently once the definitions are
established and individuals created
Copyright © 2013 Thematix Partners LLC & McGuinness Associates
+ Class vs. individual
86 86

Individuals are the most specific entities in an ontology

If concepts form a natural hierarchy, represent them as classes

If they might have individuals, represent them as classes
Copyright © 2013 Thematix Partners LLC & McGuinness Associates
+ Limiting scope

An ontology should not contain all the possible information about
the domain


No need to specialize or generalize more than the application
requires
No need to include all possible properties of a class
•
•

Only the most salient properties
Only the properties that the application requires
Ontologies of wine, food, and their pairings probably will not
include details such as:





Bottle size (half bottle, full bottle, magnum, …)
Label color
Wine bottle color (green, amber, …)
Individual plants and their location in a historic garden
Azalea indica 'Alaska' (Rutherfordiana hybrid) might be a class or
might be an individual, depending on the use case and application
requirements
Copyright © 2013 Thematix Partners LLC & McGuinness Associates
87 87
+ Why UML for Ontology Modeling?
88 88

A little review:
Every semi-trailer truck has at least 3 axles.
(∀ x)(((SemiTrailerTruck(x) ∧ (∃ y)(SemiTrailer(y) ∧ (hasPart (x,y))) ∧
(SemiTrailerTruck(x) ∧ (∃ z)(TractorUnit(z) ∧ (hasPart (x,z))))
⊃ (∃ s)(set(s) ∧ (count(s,(≥3))
∧ (∀ w)(member(w,s) ⊃ (Axle(w) ∧ hasPart(x,w))) )).
Copyright © 2013 Thematix Partners LLC
+ XML angle brackets can be equally difficult to read
89 89
Copyright © 2013 Thematix Partners LLC
+ Editors such as Protégé are better …
90 90
Copyright © 2013 Thematix Partners LLC
+ As complexity increases, it can be difficult to follow
relationships …
Copyright © 2013 Thematix Partners LLC
91 91
+ And it’s worse with individuals
92 92
Copyright © 2013 Thematix Partners LLC
+ UML’s well-known graphical notation is more
accessible to many
Copyright © 2013 Thematix Partners LLC
93 93
+ A little more background …
94 94

UML provides a graphical notation, but OWL & UML are very
different languages

Language mapping from OWL to UML only covers a fraction of
the UML language – to logical (class) diagrams

Key distinctions:
•
•
•
•
Properties in OWL are first-class citizens, second class in UML
(meaning, it’s difficult to map OWL properties directly to UML
properties or associations)
UML supports n-ary relations whereas in OWL, properties are
typically binary
OWL uses true set theoretic concepts (intersection, union,
complement, etc.), where UML has historically been less formal
Although UML 2.5 semantics are closer to OWL, many practitioners
don’t use the set theoretic features
This is overly simplified – the mapping is not straightforward, but the benefits of
having a graphical notation are acknowledged in the W3C OWL 2 community.
Copyright © 2013 Thematix Partners LLC
+ UML/MOF and KR Together

MOF technology streamlines the mechanics of managing models as XML
documents, Java objects, CORBA objects

Knowledge Representation supports reasoning about resources




Supports semantic alignment among differing vocabularies and nomenclatures
Enables consistency checking and model validation, business rule analysis
Allows us to ask questions over multiple resources that we could not answer
previously
Enables policy-driven applications to leverage existing knowledge and
policies to solve business problems

Detect inconsistent financial transactions

Support business policy enforcement

Facilitate next generation network management and security applications
while integrating with existing RDBMS and OLAP data stores

MOF provides no help with reasoning

KR is not focused on the mechanics of managing models or metadata

Complementary technologies – despite some overlap
Copyright © 2013 Thematix Partners LLC
95 95
+ Bridging KR and MDA
96 96
Copyright © 2013 Thematix Partners LLC
+ Metadata Management Scenarios
97 97
Copyright © 2013 Thematix Partners LLC
+ Ontology Definition Metamodel (ODM)

ODM is Object Management Group’s standard for model driven ontology
development (adopted in October 2006, finalized in May 2009) – available at
http://www.omg.org/spec/ODM/1.0/

A family of metamodels & profiles that enable model interchange, ontology
development in UML 2

Grounded in formal logic enabling reasoning engines to understand, validate, and
apply ontologies developed using the ODM

Mappings to other OMG standards are either in work or under consideration,
including

Information Management Metamodel (IMM) for exchange of ER, logical & physical database
models, use of database schema for ontology development

SysML for exchange of systems engineering models, use of SysML models as a basis for
ontology development, use of standard vocabularies for systems design

BPMN/BMM for use of standardized business vocabularies in business process modeling

Production Rule Representation (PRR), Semantics for Business Vocabularies and Rules (SBVR)
for rule interchange – draft mapping from SBVR to OWL via an ontology represented in
UML/ODM and OWL will be available as an IBM Technical Report shortly, with the mapping
ontology made open source on code.google. com
Copyright © 2013 Thematix Partners LLC
98 98
+ Next Steps at OMG

W3C is moving the ball forward on a number of relevant fronts: RDF 1.1
(simplification, bug fixing, provenance), OWL 2, etc.

Next edition of ODM (ODM 1.1) will support both the changes proposed
for RDF 1.1 (RDF Source, RDF Dataset, literal simplification, etc.,) and
OWL 2

RFP to support APIs for knowledge base access (API4KBs) – submissions
will reuse ODM RDF and OWL metamodels


Initial submission effort is underway

Focus is on an interoperable framework for integration of Semantic Web based
knowledge bases and reasoners with other applications including rule engines,
text and data mining, machine learning, analytics
Development of critical vocabularies as ontologies, including

Date Time Vocabulary – currently supports SBVR, OCL, CL/CLIF, and mapping
to OWL is in work

Information Exchange Framework (IEF) Packaging Policy Vocabulary (IEPPV)

Financial Industry Business Ontology (FIBO)
Copyright © 2013 Thematix Partners LLC
99 99
+ Relationship to Other Standards
100
100

Update to support OWL 2 is underway -- ODM 1.1 revision
planned for June 2013, including support for OWL 2 and RDF 1.1

CL Metamodel is identical to the UML diagrams in ISO 24707;
revision to support revisions to the ISO standard are in work,
mainly to module-related concepts

High degree of synergy between ODM and Topic Maps ISO
13250 working group

Current work in ISO JTC1 SC32 to update ISO 11179 (Metadata
Registration) references ODM; also addressing alignment with
SKOS (Simple Knowledge Organization System) and Dublin Core

All ODM metamodels are referenced and used in ISO CD 19763
(MMF – Metamodel Framework, Model Registry specification)
Copyright © 2013 Thematix Partners LLC
+ Additional Metadata Standards for Definitions &
Registries
101
101

A number of commercial and government taxonomies for
document-related asset management extend


Dublin Core Metadata Terms (DCMI, http://www.dublincore.org/ )
Simple Knowledge Organization System (W3C,
http://www.w3.org/TR/skos-reference/

Common approach to metadata is to reuse & extend these
vocabularies where applicable

Some (typically large) organizations also extend ISO 11179-3
Metadata Registry, ISO 19763 Model Registry standards as
appropriate

JPL’s planetary science data store (PDS) uses ISO 11179 Edition
2, may be updated to support emerging Edition 3, for example

OMG Common Terminology Services 2 (CTS2) uses a
combination of Dublin Core, SKOS, and ISO 11179 as a basis for
term registration (adopted June 2011)
Copyright © 2013 Thematix Partners LLC
+ Part 3: Putting It All Together

Syntax Checking

Consistency Checking

Instance Data Analysis & Evaluation Services

Example Applications

Demo – Visual Ontology Modeler (VOM)

Questions
Copyright © 2013 Thematix Partners LLC
102
102
+

Syntax checking
For RDF & OWL Ontologies


RDF syntax checking, graph visualization
•
W3C RDF Validator (http://www.w3.org/RDF/Validator/)
•
Jena API & Toolkit (http://jena.sourceforge.net/)
OWL syntax checking, OWL dialect
determination
•
OWL Consistency Checker
(http://clarkparsia.com/pellet/, also provides full OWL 2
DL reasoning capabilities)
•
OWL 2 Validator (Univ. of Manchester,
http://owl.cs.manchester.ac.uk/validator/ )
•
Protégé OWL (http://protege.stanford.edu/)
•
Jena API & Toolkit (http://jena.sourceforge.net/)

Every tool provides unique capabilities;
sophisticated projects may require multiple
approaches

Tools listed are open source, commercial
options are also available
Copyright © 2013 Thematix Partners LLC & McGuinness Associates
103
103
+ Consistency checking and analysis


Requirements are typically application specific
RDF vocabularies should be checked by an RDF validator or rule engine



OWL Ontologies should be run through a consistency checking reasoner








Pellet (open source, originally from Mindswap, supported by Clark & Parsia, LLC) –
http://clarkparsia.com/pellet/,
RacerPro – http://www.racer-systems.com/
FaCT++ – http://owl.man.ac.uk/factplusplus/
HermiT OWL Reasoner – http://www.hermit-reasoner.com/
KAON2 infrastructure for managing OWL-DL, SWRL, and F-Logic ontologies –
http://kaon2.semanticweb.org/
VIStology's ConsVISor OWL Consistency checker – http://68.162.250.6:8080/consvisor/
OWL instance data may also be run through checking tools


Jena Semantic Web Framework - http://jena.sourceforge.net/
Pychinko, MindSwap’s Rete-based RDF-friendly rule engine – (CWM clone)
http://www.mindswap.org/~katz/pychinko/
TW Instance data evaluation - http://onto.rpi.edu/demo/oie/
OOPS! (OntOlogy Pitfall Scanner!) ontology engineering tool available from the
School of Computer Science at Universidad Politécnica de Madrid, UPM -http://oeg-lia3.dia.fi.upm.es/oops/index-content.jsp can also provide helpful hints
OntoClean methodology for ontology analysis – see
http://en.wikipedia.org/wiki/OntoClean for links to the seminal papers on this
Copyright © 2013 Thematix Partners LLC & McGuinness Associates
104
104
+ Example process
105
105
Pellet OWL DL Reasoner
Valid OWL
(DL consistency checked, concept
satisfiability checked)
TW Instance Data
Evaluation
Valid OWL
(FOL consistency checked,
deductive closure)
Ontology Registry
& Library
Valid RDF/XML Documents
(Valid OWL Syntax Representation)
Ontologies should be checked to ensure consistency,
limit potential for invalid conclusions
Copyright © 2013 Thematix Partners LLC & McGuinness Associates
Application Environment
+ Inconsistencies caused by distributed OWL data
106
106
Wine ontology
wine:Wine
rdfs:subClassOf
rdfs:subClassOf
wine:EarlyHarvest
owl:disjointWith
A new ontology
rdfs:subClassOf
wine:LateHarvest
New instance data
rdfs:subClassOf
wine:BadWineClass
Case1: semantic inconsistency
caused by a new class.
Copyright © 2013 Thematix Partners LLC & McGuinness Associates
rdf:type
rdf:type
wi:BadWineInstance
Case2: semantic inconsistency
caused by a new instance.
+ TW OWL instance-data evaluation

Motivation



Challenges



Criteria for “ready for use” vary across applications
Existing syntax and logical-consistency checking is not enough for
applications using some standard practical assumptions (e.g.
closed world, unique names, …)
Solution



Objective metrics are needed to evaluate if semantic web instance
data is ready for practical use
Next generation Chimaera (Stanford Knowledge Systems
Laboratory tool, late 1990’s – early 2000’s) for more diverse and
more instance-oriented applications
Extensible & customizable service-oriented architecture
Integrated environment for evaluating issues beyond standard
syntactic correctness and logical consistency
See http://onto.rpi.edu/demo/oie/
Copyright © 2013 Thematix Partners LLC & McGuinness Associates
107
107
+ Semantically-Enabled Business Applications
Use Cases
• Call Center Operations to identify conflicting Service Advisories
• Intelligence Analysis – supporting research operations
• Semantically enhanced Fraud Detection
• IP Content Publication & Management for Media
108
108
<<ontologyClass>>
procedure
Semantically-Enhanced Search / Retrieval
Publish / Subscribe, Agent-Driven User Access
Preference / Role-based Custom Delivery
<<ontologyClass>>
Algorithm
<<ontologyClass>>
Congestion Control Algorithm
<<ontologyClass>>
Forwarding Algorithm
<<ontologyClass>> <<ontologyClass>> <<ontologyClass>><<ontologyClass>><<ontologyClass>>
<<ontologyClass>>
<<ontologyClass>> <<ontologyClass>>
TCP New Reno TCP Westwood Flow Control AlgorithmEqual Cost Multipath Random
Weighted Random
Binary Increase Congestion (BIC)
High Speed TCP (HSTCP)
<<ontologyClass>>
Load Balancing Algorithm
<<ontologyClass>>
<<ontologyClass>>
Least Connection Scheduling
Shortest Expected Delay Scheduling
<<ontologyClass>>
Quality of Service (QoS) Algorithm
<<ontologyClass>>
<<ontologyClass>>
<<ontologyClass>>
Class Based Queueing
Hierarchical Token Bucket Filtering
Random Early Drop
<<ontologyClass>>
<<ontologyClass>>
<<ontologyClass>>
<<ontologyClass>> <<ontologyClass>>
<<ontologyClass>> <<ontologyClass>>
<<ontologyClass>><<ontologyClass>>
<<ontologyClass>>
<<ontologyClass>>
Differentiated Services
Explicit Congestion Notification TCP Hybla
TCP Vegas
Standard TCP Transactional TCP Interface Round RobinRound Robin Weighted Round Robin
Locality Based Least Connection Scheduling
Weighted Least Connection Scheduling
<<ontologyClass>>
Token Bucket Filtering
<<ontologyClass>>
<<ontologyClass>>
Priority Based Queueing
Stochastic Fair Queueing
<<ontologyClass>>
TCP Westwood Plus
Document Mining & Extraction Service
Reference Vocabulary Drives Extraction
- UIMA-based services (Open Source)
- TIBCO Spotfire, Inxight, others
Valid Reference Knowledge Base
Domain Vocabulary (Ontology Components)
+ Reference Data
Document Repository
Schema & Index
Document Content Mention /
Cross-Reference
Policy KB
Archive
Copyright © 2013 Thematix Partners LLC & McGuinness Associates
Document
Repository
Organization and Management of Data
to Support Automation
109
109
Copyright © 2013 Thematix Partners LLC & McGuinness Associates
+
Model Driven Service Provisioning
GE Water and Process – InSight Remote Monitoring & Diagnostics application

Cross-industry – Beverage, Chemical Processing, Commercial, Food, General Industrial,
Hydrocarbon Processing, Life Sciences, Medical Dialysis, Microelectronics, Mining &
Mineral Processing, Municipal, Pharmaceutical, Power, Primary Metals, Residential,
Transportation

Cross-domain – Boiler Water Treatment, Cooling Water Treatment, Feedstock Flexibility
Solutions, Fuel Flexibility Solutions, Mobile Water, Petrochemical Process Solutions,
Process Separations, Water Recovery, Water Scarcity Relief

Common hardware, software, process foundational components

Common goals for service offerings, provisioning, reporting across industries & solutions –
what was done and why, how the services were accomplished, how did the customer
benefit, metrics

Solution includes conceptual information models representing concepts, services,
processes, relationships across them, large domain-specific knowledge bases, webbased service deployment & data management

Assists GE customers in reducing water, energy & labor costs, improves water and
process treatment programs

Within 6 months of deployment – over 200+ plants were licensed users
Copyright © 2013 Thematix Partners LLC & McGuinness Associates
110
110
+ Visual Ontology Modeler (VOM) Demo
111
111
Copyright © 2013 Thematix Partners LLC
112
112
Visual Ontology Modeling is a
new and valuable method for the
development of commercial and
scientific ontologies.
Multiple diagrammatic “points of
view” simplify understanding and
maintenance of large, complex
ontologies.
It bridges the gap between the
Semantic Web and the world of
UML modeling.
Copyright © 2013 Thematix Partners LLC
113
113
+ Acronym Soup

CL – ISO 24707 Common Logic: a family of first order logic languages,
including Conceptual Graphs & Common Logic Interchange Format – a
successor to the Knowledge Interchange Format (KIF), http://cl.tamu.edu/

DAML – DARPA Agent Mark-up Language, one of the primary languages
leading to the development of OWL, http://www.daml.org/

DARPA – Defense Advanced Research Projects Agency,
http://www.darpa.mil/

DL – Description Logics: a subset of first order logic, for which tractable &
complete reasoning systems are available

Linked Data and RDFa, http://www.w3.org/standards/techs/rdfa#w3c_all

MDA – Model-Driven Architecture, http://www.omg.org/mda/

ODM – Ontology Definition Metamodel, http://www.omg.org/spec/ODM/1.0/

OWL – W3C Web Ontology Language, OWL 2 specifications,
http://www.w3.org/standards/techs/owl#w3c_all

OWL-S – a set of OWL ontology components that extend the W3C OWL
specifications to support Semantic Web Services,
http://www.daml.org/services/owl-s/1.1/
Copyright © 2013 Thematix Partners LLC
114
114
+More Acronym Soup

115
115
PML – Proof Markup Language – Provenance Interlingua, www.inference-web.org

PRR – Production Rules Representation, http://www.omg.org/spec/PRR/1.0/

RIF – Rule Interchange Format, http://www.w3.org/standards/techs/rif#w3c_all

RDF – Resource Description Framework, http://www.w3.org/TR/rdf-concepts/

Semantic Annotations for WSDL and XML (SAWSDL),
http://www.w3.org/standards/techs/sawsdl#w3c_all

Simple Knowledge Organization System (SKOS),
http://www.w3.org/standards/techs/skos#w3c_all

SPARQL – Semantic Web Query Language,
http://www.w3.org/standards/techs/sparql#w3c_all

Semantic Web Rule Language (SWRL), http://www.w3.org/Submission/SWRL/
Copyright © 2013 Thematix Partners LLC
Download