

7/26/2016

Scalable Knowledge Composition (SKC)
November 1997
Jan Jannink, Danladi Verheijen, Gio Wiederhold
Stanford University

An abstract concept is like a valise with a false bottom: you may put in what you please, and take it out again, without being observed.

Alexis de Tocqueville, Democracy in America, 1838.

Gio Wiederhold SKC 1

SKC Progress Report

Goal: Reliable answers using heterogeneous data sources
  General sources: factbook ‘96, UN
  Topical sources: EIA, OECD, OPEC

Approach: Bottom up from data
  Python scripts implement rule-based operations on source data to answer challenge problems

Theory: Rule-based algebra
  Mapping primitives & Intersection operation

Web as Source of Ontology

We extract portions of ontology implicit in sites
  factbook ‘96: www.odci.gov/cia
  UN: www.un.org & www.globalpolicy.org
  EIA: www.eia.doe.gov
  OECD: www.oecd.org
  OPEC: www.opec.org

Example Query

What is the most recent year an OPEC member nation was on the UN security council?

Related to CP # 72

Sources
» factbook ‘96 (nation)
» OPEC (members, dates)
» UN (SC members, years)

Correct Answer
» 1996 (Indonesia)

Problems
* different country names
    Gambia => The Gambia
* historical country names
    Yugoslavia
* factbook has out of date OPEC & UN SC lists
    Gabon (left OPEC 1994)
» UN lists future security council members
    Gabon 1998
» intent of original question
    Temporal variants
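The country-name problems listed above can be handled by small rewrite rules applied before two sources are compared. The following is a minimal sketch, not the SKC scripts themselves: the names `NAME_RULES`, `canonical_name` and `match_names` are hypothetical, and only the Gambia => The Gambia mapping comes from the slide.

```python
# Hypothetical articulation rules for country names. Only the
# "Gambia" => "The Gambia" mapping is taken from the slide.
NAME_RULES = {
    "Gambia": "The Gambia",
}


def canonical_name(name, rules=NAME_RULES):
    """Map a source-specific country name to the shared articulation term."""
    # Assumption: sources may join words with underscores (e.g. Saudi_Arabia).
    cleaned = name.replace("_", " ").strip()
    return rules.get(cleaned, cleaned)


def match_names(left_names, right_names, rules=NAME_RULES):
    """Return the canonical names present in both sources after rewriting."""
    left = {canonical_name(n, rules) for n in left_names}
    right = {canonical_name(n, rules) for n in right_names}
    return left & right
```

Keeping the rules in a plain table means a domain expert can extend them without touching the matching code.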

Partial Query Data

Source: OPEC Pages

Member                                      Joined  Left
Iran                                        1960
Iraq                                        1960
Kuwait                                      1960
Saudi_Arabia                                1960
Venezuela                                   1960
Qatar                                       1961
Indonesia                                   1962
Socialist_Peoples_Libyan_Arab_Jamahiriya *  1962
United_Arab_Emirates                        1967
Algeria                                     1969
Nigeria                                     1971
Ecuador                                     1973    1992
Gabon *                                     1975    1994

Source: UN Pages

Member          Years
Bahrain         1999, 1998
Brazil          1999, 1998
Gabon *         1999, 1998
Gambia *        1999, 1998
Slovenia        1999, 1998
Costa_Rica      1998, 1997
…
Indonesia       1996, 1995
…
Yugoslavia *    1989, 1988

* Problems handled using SKC articulation rules
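The example query can be answered from a few of these entries with a temporal join. The sketch below is illustrative only: the function name, the `as_of` cutoff and the membership-interval convention are assumptions, and the data is a small subset (Indonesia 1996 and Gabon 1998 are the pairings the Example Query slide itself confirms).

```python
# (joined, left) years per OPEC member; left=None means no departure listed.
OPEC = {
    "Indonesia": (1962, None),
    "Ecuador": (1973, 1992),
    "Gabon": (1975, 1994),
}

# (country, year) pairs as listed on the UN pages, including future terms.
UN_SC = [
    ("Gabon", 1998),
    ("Indonesia", 1996),
    ("Indonesia", 1995),
]


def most_recent_opec_year_on_sc(opec, un_sc, as_of):
    """Return the latest (year, country) where an OPEC member sat on the council."""
    hits = []
    for country, year in un_sc:
        if year > as_of:
            continue  # the UN pages also list future council members
        if country not in opec:
            continue
        joined, left = opec[country]
        # Assumption: membership runs from the joining year up to,
        # but not including, the year the country left.
        if joined <= year and (left is None or year < left):
            hits.append((year, country))
    return max(hits) if hits else None


print(most_recent_opec_year_on_sc(OPEC, UN_SC, as_of=1997))
```

Gabon is rejected twice over: its 1998 term lies after the cutoff, and it had already left OPEC in 1994.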

Current Directions

Experience w/ real data confirming validity of our approach

Expert sources are better maintained than general sources

We generate successive approximations with increasing levels of confidence

Manual processing of sources is our first step in providing an algebra that truly accounts for the complexity of real data sources

Ontology?

Ontologies list the terms and their relationships that allow communication among partners in enterprises (in machine-readable form)

Relationships determine meaning - parent, school, company

Databases use ontologies during design in their E-R diagrams (implicitly) and represent the leaf nodes in their schemas

Knowledge-bases use ontologies (often implicitly), add class definitions (to hold instances), constraints, and operations among the terms

Functions of Ontologies

Define Terms used in System Construction to enable Correctness in Understanding
  system = designers, implementors, users, maintainers
  designers = implementors = users = maintainers

Define Higher-level Abstractions needed to communicate in larger contexts
  managers, decision-makers, systems in own, other domains

Share the Cost of Knowledge Acquisition & Maintenance
  reuse encoded knowledge, remain up-to-date as domains change

Ancestors of Ontologies

Lexicons: collect terms used in inform. systems

Taxonomies: categorize, abstract, classify terms

Schemas of databases: attributes, ranges filed

Data dictionaries: integration of files, attributes

Object libraries: grouped attributes, methods

Symbol tables: collect terms used in a program

Domain object models: re-engineering terms

. . .

More Knowledge


Establishing Ontologies

Top-down:
  Commonly acceptable UPPER layers
  Domain-specific
  Sharing tools
  Object based

Bottom-up:
  Pragmatic, TASK-specific collections
  Database schemas and models

Ontologies in Use

Implicit ontologies are a prerequisite for communication among humans and organizations.

Knowledge is explicitly represented in AI-systems; sometimes the ontology is explicit as well.

Database schemas are partial explicit ontologies
  Relational schemas contain only terms & 1:1 dependencies.
  E-R designs contain 1:n, m:n cardinalities
  Structural schemas contain semantic dependency types

Conceptual graphs define terms of discourse and a modest number of relationship types

Variables in software represent ontologies poorly.

Ontology Sharing

Three Alternatives

Create a committee to define everybody’s terms
  Takes many years, until people are worn out
  Ignored when changes make deviation necessary

Get all terms and put them into large model [ Cyc, UMLS, Federated Schemas, . . . ]
  Can be rapid
  Provides broad integration
  Ignores conflicts
  Hard to maintain (requires committee)

Keep all Terms distinct, except where sharing
  Requires initial effort
  Complex system view
  Empowers participants
  Scalable with many participants

SKC Objective

Provide for Maintainable Ontologies

  devolve maintenance onto many domain-specific experts / authorities
  provide an algebra to compute composed ontologies that are limited to their articulation terms
  enable interpretation within the source contexts

SKC Working Definitions

Ontology: a set of terms and their relationships

Term: a reference to real-world and abstract objects

Relationship: a named and typed set of links between objects

Reference: a label that names objects

Real-world object: an entity instance with a physical manifestation

Abstract object: a concept which refers to other objects
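These working definitions map fairly directly onto small data structures. The sketch below is one possible encoding, not the SKC implementation; the class layout, the `abstract` flag and the `related` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Term:
    """A reference (label) to a real-world or abstract object."""
    label: str
    abstract: bool = False  # True for a concept that refers to other objects


@dataclass(frozen=True)
class Relationship:
    """A named and typed set of links between objects."""
    name: str
    kind: str          # relationship type, e.g. "IS-A" or "Part-of"
    links: frozenset   # set of (source_label, target_label) pairs


@dataclass
class Ontology:
    """A set of terms and their relationships."""
    terms: set = field(default_factory=set)
    relationships: set = field(default_factory=set)

    def related(self, label):
        """Return the labels linked from `label` by any relationship."""
        return {dst for rel in self.relationships
                for src, dst in rel.links if src == label}


shoe = Term("Shoe")
footwear = Term("Footwear", abstract=True)
is_a = Relationship("kind-of", "IS-A", frozenset({("Shoe", "Footwear")}))
onto = Ontology({shoe, footwear}, {is_a})
print(onto.related("Shoe"))
```

Terms and relationships are frozen so that they can be stored in sets, which keeps later set-based operations on ontologies simple.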

We Consider as Ontologies:

Object oriented class hierarchies (snapshots of executing programs capture object instances),

Database schemas (via their E-R or structural models),

Semi-structured databases (OEM),

Definitional thesauri (UMLS: see http://www.lexical.com),

Knowledge bases (CYC, Ontolingua).

SKC specifically does not restrict its applicability to a purely extensional (object) or intensional (schema) definition of ontology, since its purpose is to support useful processing of extensions using intensional knowledge for all parties. To that end it is important that the intensional specifications include predicates or methods that permit the collection of extensional access to real-world objects. We do not require ontologies to be complete specifications of a domain, but rather that usage of an ontology provide results complete with respect to the ontology.

Aspects that Focus SKC

The mapping of terms to objects differs between autonomous domains.

The collections of real-world objects provide a grounding for the definitions, and an opportunity for validation of the meaning of the terms being employed.

Relationships have semantic, and derived from that, structural significance. Multiple relationship types may share structural characteristics, such as IS-A, Ownership, Part-of, Reference.

We will keep the number of primitive relationships limited.

The mapping of relationship types differs between autonomous domains.

Domains and Consistency

a domain will contain many objects

the object configuration is consistent

within a domain all
  • terms are consistent &
  • relationships among objects are consistent

context is implicit

No committee is needed to forge compromises * within a domain

* Compromises hide valuable details

[Figure label: Domain Ontology]

Domain Heterogeneity

If interoperation involves distinct domains, mismatch ensues
  Autonomy conflicts with consistency
  Local Needs have Priority
  Outside uses are a Byproduct

Heterogeneity must be addressed
  • Platform and Operating Systems ✔✔
  • Representation and Access Conventions ✔
  • Naming and Ontology

An Ontology Algebra

A knowledge-based algebra for ontologies
  Intersection: create a subset ontology, keep sharable entries
  Union: create a joint ontology, merge entries
  Difference: create a distinct ontology, remove shared entries

The Articulation Ontology (AO) consists of matching rules that link domain ontologies
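The three operations can be sketched over sets of domain-qualified terms, with the articulation ontology reduced to a set of matched pairs. This is a simplified illustration under stated assumptions: the function names, the `(domain, label)` encoding and the single example rule are hypothetical, and real matching rules would be richer than pair lookups.

```python
def qualify(domain, labels):
    """Tag each label with its source domain so equal labels stay distinct."""
    return {(domain, label) for label in labels}


def intersection(a, b, rules):
    """Subset ontology: keep only the entries that a matching rule links."""
    return {(x, y) for x, y in rules if x in a and y in b}


def difference(a, b, rules):
    """Distinct ontology: entries of `a` with the shared entries removed."""
    shared = {x for x, _ in intersection(a, b, rules)}
    return a - shared


def union(a, b, rules):
    """Joint ontology: all entries, with each matched pair merged into one."""
    merged_from_b = {y for _, y in intersection(a, b, rules)}
    # Keep a's entry for merged pairs and add b's unmatched entries.
    return a | (b - merged_from_b)


store = qualify("store", ["Shoes", "Customers", "Employees"])
factory = qualify("factory", ["Shoes", "Employees", "Machinery"])
# Hypothetical articulation rule: only the Shoes terms are declared shared.
rules = {(("store", "Shoes"), ("factory", "Shoes"))}

print(intersection(store, factory, rules))
print(len(union(store, factory, rules)))
```

Because terms are qualified by domain, the two Employees entries remain distinct unless a rule says otherwise, which matches the idea of keeping terms separate except where sharing is declared.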

Features of an Algebra

Operations can be composed

Operations can be rearranged

Alternate arrangements can be evaluated

Optimization is enabled

The record of past operations can be kept and reused
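A small sketch of what composition, rearrangement and reuse can look like in practice. It assumes plain set semantics, under which intersection is associative; SKC's rule-based operations need not behave this way. The `SOURCES` contents, the tuple encoding of expressions and the `evaluate` function are all hypothetical.

```python
from functools import lru_cache

# Hypothetical source ontologies, reduced to sets of term labels.
SOURCES = {
    "A": frozenset({"size", "color", "style"}),
    "B": frozenset({"size", "style", "material"}),
    "C": frozenset({"style", "price"}),
}


@lru_cache(maxsize=None)  # keeps a record of past operations for reuse
def evaluate(expr):
    """Evaluate a source name or a nested (op, left, right) expression."""
    if isinstance(expr, str):
        return SOURCES[expr]
    op, left, right = expr
    lhs, rhs = evaluate(left), evaluate(right)
    if op == "intersect":
        return lhs & rhs
    if op == "union":
        return lhs | rhs
    if op == "difference":
        return lhs - rhs
    raise ValueError(f"unknown operation: {op}")


# Two arrangements of the same composed intersection.
left_first = ("intersect", ("intersect", "A", "B"), "C")
right_first = ("intersect", "A", ("intersect", "B", "C"))
print(evaluate(left_first) == evaluate(right_first))
```

Since expressions are hashable tuples, sub-expressions that recur across arrangements are computed once and then served from the cache.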

INTERSECTION Operation

Result contains shared terms

Terms useful for purchasing

Source Domain 1: Owned and maintained by Store


Source Domain 2: Owned and maintained by Factory


INTERSECTION Support

Articulation ontology: matching rules that use terms from the 2 source domains

Terms useful for purchasing


Store Ontology Factory Ontology


Sample Intersections

Articulation ontology matching rules:
  size = size
  color = table(colcode)
  style = style

Shoe Store
  Shoes { . . . }
  Customers { . . . }
  Employees { . . . }

Shoe Factory
  Material inventory { . . . }
  Employees { . . . }
  Machinery { . . . }
  Processes { . . . }
  Shoes { . . . }

[Other figure labels: Anatomy { . . . }, Hardware, foot = foot, Employees / Employees]
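The three matching rules can be read as small functions that derive a store-side value from a factory record. The sketch below is hypothetical throughout: the color codes in `COLOR_TABLE`, the record layouts and the `matches` helper are invented for illustration; only the rule shapes (size = size, color = table(colcode), style = style) come from the slide.

```python
# Hypothetical lookup table from factory color codes to store color names.
COLOR_TABLE = {"BK": "black", "BR": "brown"}

# One rule per shared store attribute: each derives the comparable
# value from a factory record.
RULES = {
    "size": lambda f: f["size"],                        # size = size
    "color": lambda f: COLOR_TABLE.get(f["colcode"]),   # color = table(colcode)
    "style": lambda f: f["style"],                      # style = style
}


def matches(store_shoe, factory_shoe, rules=RULES):
    """True when every articulation rule maps the factory record onto the store record."""
    return all(store_shoe[attr] == derive(factory_shoe)
               for attr, derive in rules.items())


store_shoe = {"size": 9, "color": "black", "style": "loafer"}
factory_shoe = {"size": 9, "colcode": "BK", "style": "loafer", "last": "L7"}
print(matches(store_shoe, factory_shoe))
```

Attributes that no rule mentions, such as the factory-only `last` field here, are simply ignored, so each side keeps its local detail.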

Other Basic Operations

UNION: merging entire ontologies

DIFFERENCE: material fully under local control

Articulation ontology: typically prior intersections

Knowledge Composition

Legend: ∪ : union, ∩ : intersection

[Figure: knowledge resources A, B, C, D and E at the base; articulation knowledge for pairs of resources, including (A, B), (B, C), (C, D) and (C, E), above them; composed knowledge for applications using A, B, C, E at the top.]

Exploiting the Result

Result has links to source


Processing and evaluation is best performed within Source Domains


Innovation in SKC

No need to harmonize full ontologies

Focus on what is critical for interoperation

Rules specific for articulation

Potentially many sets of articulation rules

Maintenance is distributed
  to n sources
  to m articulation agents
  is m < n², depending on architecture density: a research question
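One way to bound the question above, under the assumption (not stated on the slide) that each articulation agent links exactly two sources and that no pair of sources needs more than one agent:

```latex
m \;\le\; \binom{n}{2} \;=\; \frac{n(n-1)}{2} \;<\; n^{2} \qquad (n \ge 1)
```

For example, n = 5 sources would need at most 10 pairwise articulations, while a sparse arrangement in which the sources form a chain needs only n - 1 = 4. Where m falls between these extremes depends on how densely the sources are connected.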

Domain Specialization

Knowledge Acquisition (20% effort) &
Knowledge Maintenance (80% effort *)

Performed by:
  – Domain specialists
  – Professional organizations
  – Modest sized field teams

autonomously maintainable

Empowerment

* based on software maintenance experience

Summary

Algebra enables Interoperation by
  dealing explicitly with differences by knowledge
  identifying maintenance domains
  keeping sources autonomous

Assumes domain has a common ontology
  composing domain ontologies requires the algebra to manage the linkages where articulation occurs
  processes are best executed within the domains

Articulation knowledge is distributed
  allows specialists to work independently
  supports multiple intersections and views

Maintenance is structured and partitioned
