Sharing our understanding: ontology is useful, but we need more…
Mark Gahegan
GeoVISTA Center, Penn State University, USA
Credits
GeoVISTA Center, Penn State:
(GEON, HERO, Dialog-Plus)
Junyan Luo
Bill Pike, (Pacific Northwest National Labs)
Boyan Brodaric (Geological Survey Canada)
Tawan Banchuen
A-J Jaiswal
Kean-Huat Soon
Steve Weaver
The Geosciences Network (GEON):
www.geongrid.org
Hypothesis: We are relying too much on
ontology as our ‘carrier of meaning’…
• “You can know the name of a bird in all the
languages of the world, but when you're
finished, you'll know absolutely nothing
whatever about the bird... So let's look at the
bird and see what it's doing -- that's what
counts.”
-- Richard Feynman
• Ontology tells us what is known, but epistemology
considers how it is known, how it came to be, why it
came to be the way it is (and not some other way),
how it is used, why it is used…
What kind of science record are we
currently leaving behind?
• Can we understand our products into the future?
The “knowledge as fish” problem
– 2 days
– 2 months?
– 20 years?
• Can other researchers understand them?
– Often, no!
– What are the problems they face?
• Changing taxonomies, non-commensurate semantics
• Changing methods, workflows
• Changing practices, individuals
Where does meaning come
from?
• The concepts & relations we use to describe the world
do not exist in the world—we create them.
• How we do this—create meaning out of data—is often
unrecorded…
– for lack of a model of the scientific process that can
capture knowledge as it is created and used.
• We need an approach to representing scientific
concepts that reflects:
– the situated processes of science work,
– the social construction of knowledge, and
– the evolution of knowledge over time.
• In this model, knowledge is the result of investigation,
negotiation, and collaboration by teams of
researchers.
Where does meaning come
from?
• We ‘know’ things in many ways:
– Theoretical, Experiential, Procedural
• Domain understanding / theory (ontology)
• The way science is done (epistemology)
– How are resources created and used (work practices
/ situations)?
• Social negotiation among the community of
users (social network, group cognition)
• i.e. the interplay of top-down and bottom-up
knowledge played out in private and social
situations
“Knowledge soup” – Sowa, 2002
The world is complex, our
understanding is complex also
• “Human knowledge is a process of approximation. In the focus of
experience, there is comparative clarity. But the discrimination of this
clarity leads into the penumbral background. There are always
questions left over. The problem is to discriminate exactly what we
know vaguely”.
Alfred North Whitehead, Essays in Science and Philosophy
• “Get rid, thoughtful Reader, of the Ockhamistic prejudice of political
partisanship that in thought, in being, and in development the
indefinite is due to a degeneration from a primal state of perfect
definiteness. The truth is rather on the side of the Scholastic realists
that the unsettled is the primal state, and that definiteness and
determinateness, the two poles of settledness, are, in the large,
approximations, developmentally, epistemologically, and
metaphysically.”
C.S. Peirce
P. J. Braspenning: Symposium on Intelligent Agents in Software Engineering for Planning, KaHo St.-Lieven, Gent, 23rd February, 2000
Meaning and language
(from P. J. Braspenning)
• Many people assume that conveying understanding is a merely
linguistic problem that would not occur in a purified language like
description logic (ontology).
– Yet that assumption is wrong. Most of our problems are caused by the
complexity of the world itself, and our uneven experiences of it.
• The knowledge soup has a loose organization characterized by
the “disorder” and the “leftover questions”
– The problem is to “discriminate exactly what we know vaguely”
– The task therefore is to make “little bits of order” that organize, interpret, and
give meaning to the disorder.
• For both Whitehead and Peirce, language is a tool for discriminating
[DISCOVERING] and creating [INVENTING] structure out of the
primordial knowledge soup.
– i.e. Emergence and Imposition
• This structure is essential for precise reasoning, and any
reasoning system – human or artificial – must either find structure
in the soup or create structure that can provide, in Peirce’s terms:
“a solid foundation for great and weighty thought.”
What’s in the soup? A nexus of
knowledge structures (Whitehead, 1923)
Describing our resources: Options
(Do nothing... It’s a hard problem, after all)
1. Build community ontologies—register to these
2. Allow user to describe—build very good
matching tools!
3. Infer from usage patterns—needs data mining
technology
4. Infer from workflows used to create the
resource—needs workflow capture &
representation
Why ontologies?
(Noy and McGuinness)
• To share common understanding of the
structure of information among people or
software agents
• To enable reuse of domain knowledge
• To make domain assumptions explicit
• To automatically integrate disparate
databases…
Rock Taxonomy (ontologically based)
– Geological taxonomy converted to an ontology
– Gathered from experts during a specially convened workshop
– Formalizes relationships between concepts
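The taxonomy-to-ontology idea can be sketched as a simple is-a hierarchy with a subsumption query. This is a minimal sketch: the rock names and parent links below are illustrative, not the workshop taxonomy.

```python
# Hypothetical fragment of a rock taxonomy as parent (is-a) links.
IS_A = {
    "granite": "igneous_rock",
    "basalt": "igneous_rock",
    "sandstone": "sedimentary_rock",
    "igneous_rock": "rock",
    "sedimentary_rock": "rock",
}

def ancestors(concept):
    """Walk the is-a chain, collecting all superclasses of a concept."""
    chain = []
    while concept in IS_A:
        concept = IS_A[concept]
        chain.append(concept)
    return chain

def subsumes(general, specific):
    """True if `general` is the concept itself or one of its ancestors."""
    return general == specific or general in ancestors(specific)
```

Even this toy formalization supports the reuse that Noy and McGuinness describe: any tool that understands the is-a relation can answer subsumption queries without domain-specific code.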
Ontology: the fine print
1. Lack of working processes by which to capture ontological knowledge effectively
2. Lack of tools to adequately convey ontological meaning to others (and to ourselves)
3. Lack of useful matching measures to show how close ontologically one concept is to another, and hence problems of interpretation for the user
4. Lack of mechanisms to achieve community consensus & manage ontology releases
5. The world is in constant flux, as is our own understanding of the world, but our ontologies are static:
– Can we support versioning, revision, refinement?
– How do we do so WITHIN A KNOWLEDGE COMMUNITY?
6. The world is very complex:
– By taking what is essentially a reductionist approach, we remove potentially important variance
7. There are no natural or given categories in nature—we invent & impose them all:
– No rose, no red, no love, no like—no colours, no tastes, no forests, no oceans, no granite outcrops
– but we invent them differently from one another, and we differ in how we understand or differentiate them,
– we may even be inconsistent with ourselves over time
8. What, then, carries meaning? Is it these ontological labels? Or the processes by which we arrived at them (epistemology)? Or the ways they are used in practice (pragmatics)?
– Cf. map accuracy: features on a map—is my map fit for use?
– I would strongly argue that it is mostly the latter two: things are what they are because of the processes we have developed to isolate and label them—either with computers or in our own heads—and the protocols for matching / comparing. The problem is that we are not very good at describing these processes.
9. How do we retrospectively tag all the millions of data resources we already have?
Remembering situations
Situations: Creation | Application | Represented by
Who did it? | Who should use it? | Collections of people
Where was it made? | Where does it apply? | Collections of sites / scales
When was it made? | When does it apply? | Collections of temporal intervals
How was it made? | How should it be used? | Collections of methods and data
Why was it made? | Why should it be used? | Collections of research questions, motivations, theories
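The situation table maps naturally onto a small record structure. The class and field names below are hypothetical, a sketch of how creation and application situations might be stored alongside a resource:

```python
from dataclasses import dataclass, field

# Hypothetical record: one Situation per column question set, pairing
# the creation and application situations of a single resource.
@dataclass
class Situation:
    who: list = field(default_factory=list)    # collections of people
    where: list = field(default_factory=list)  # sites / scales
    when: list = field(default_factory=list)   # temporal intervals
    how: list = field(default_factory=list)    # methods and data
    why: list = field(default_factory=list)    # questions, motivations, theories

@dataclass
class ResourceProvenance:
    creation: Situation
    application: Situation
```

A dataset could then record, say, who made it and why it should be used, without forcing every field to be filled in.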
What is in an e-Science Knowledge Soup?
People
– PIs and CoPIs
– Contributors
– Developers
– Users
– Knowledge engineers
– Reviewers
Organizations
– Sponsors
– Participants
– Hosts
Resources
– Datasets
– Methods
– Workflows
– Ontologies / Vocabularies
– Articles / Reports
– ‘Signifiers’
Tools
– Concept browsing / mapping
– Concept matching
– Knowledge engineering
Codex: Situating resources in
Whitehead’s Nexus
Perspectives as filters
Perspectives filter an information space according to
particular situations.
Perspectives A and B
preferentially select different types of resources and
relations; the ability to view perspectives can show
how someone else made sense of a given set of
resources.
Four perspectives on a “seismic velocity” concept (red node). a) Intensional concept structure. b)
A task that describes how seismic velocity can be measured. c) A social network built around
users of the concept. d) Data resources that have been used to describe seismic velocity.
Concept use and evolution
Evolution of “Depositional environment” concept through use by different researchers over time,
progressing from upper left to lower right.
ConceptVista (CV4): Navigating
through GEON’s conceptual universe
Some GEON themes
GEON data formats and ESRI
ShapeFile instances
GEON: Institutions, Personnel, PIs,
Co-PIs, grad students
Combining perspectives: e.g. GEON
institutions, publications and personnel
Articles, authors, readers,
keywords, themes
Intersecting research interests
What did A create that B used?
User’s interests vs. declared themes?
Can we do more?
• Can we still have ontology, but with
perspectives, to map to specific ways of thinking
/ specific tasks?
• Perspectives allow us to ‘discriminate clarity
from the penumbral background’ (after
Whitehead)
• Knowledge horizons: an idea from Hermeneutics
– Useful for working out if mappings between
ontologies / perspectives are possible
– Creating flexible horizons
– Relations become properties (internalized), properties
become relations (externalized)
– Perspectives can be applied locally or globally
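The internalize/externalize idea—relations become properties, properties become relations—can be sketched over a simple triple store. All names here are illustrative, not drawn from the CV4 implementation:

```python
# Graph as (subject, relation, object) triples; 'folding in' collapses
# the edges of a chosen relation into a property dict on each subject,
# and 'folding out' re-externalizes a property dict as explicit edges.
def fold_in(triples, relation):
    """Internalize: move all `relation` edges into a property mapping."""
    props, rest = {}, []
    for s, r, o in triples:
        if r == relation:
            props.setdefault(s, []).append(o)
        else:
            rest.append((s, r, o))
    return rest, props

def fold_out(triples, props, relation):
    """Externalize: turn a property mapping back into explicit edges."""
    edges = [(s, relation, o) for s, objs in props.items() for o in objs]
    return triples + edges
```

Applying the fold locally (to one node) or globally (to the whole graph) then gives exactly the flexible horizon the slide describes: the same information viewed either as internal structure or as external relations.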
Perspectives: folding in, folding out
[Diagram: an “image” resource linked to users A, B, and C, a country node, and content nodes m, n, o, p. “Fold in” turns these external relations into internal properties of the image (Date: ddmmyyyy; Scale: 1:xxxxxxxx; User: (A, B, C); Country: “………..”; Content: (m, n, o, p)); “Fold out” turns properties back into external relations. Labelled “Horizons—hermeneutics”.]
Perspectives in CV4: Person as
Contributor…
Perspectives in CV4: Person as User
Hermeneutics
• Any knowledge fragment that we represent may only be
the tip of the iceberg, in terms of meaning.
• “How much is that Doggie in the window?”
– This phrase can be perfectly represented in an ontology or set of
relations
Some possible meanings?
– I want to buy the doggie in the window (face value)
– I want to buy a dog (generalization)
– I am researching the cost of pet ownership (context missing—but still face value)
– I no longer want to buy a hamster (emphasis)
– I want you to think I want a dog (deception)
– I want to let you know the dog is not in its cage (cryptic)
– I want to convey that I understand that I must pay for the dog (meta-level communication)
Horizons
• The connection between intended meaning and the
phrases used may not be straightforward
– Two communicators share a background context that is unknown
to an eavesdropper (the first and second share a horizon that is
not shared by a third)
– Therefore, much knowledge in a computer is semantics without
a role, without a horizon…
• Intended and possible meanings may need to be
carefully represented along with knowledge fragments
– Knowledge roles drive at this problem to some extent
– But knowledge roles may be unknown (unless you define them)
Knowledge Perspectives & Horizons in
a systems context
[Diagram: paired notions of knowledge perspective and knowledge horizon, ranging from concentric knowledge with a common knowledge horizon, through overlapping knowledge with overlapping horizons (the other partly within, or partly beyond, one’s own horizon), to disjoint knowledge with disjoint horizons. Source: Surveyor, © 2002–6 Infomaniacs/Neological (VonSchweber).]
CV4 can now browse large
concept universes (e.g. OpenCyc)
WordNet RMI Service
• Remote service (RMI)
• Implements a caching mechanism
• Hides the underlying architecture from clients
• Supported operations:
– Synonyms, Antonyms
– Hypernyms, Hyponyms
– Meronyms, Holonyms
– Definitions
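The caching behaviour of such a service can be sketched as a memoized lookup. This is a sketch only: the backend dictionary stands in for the real WordNet store, and its contents are invented for illustration.

```python
import functools

# Stand-in for the real WordNet backend (contents are invented).
_BACKEND = {
    ("cold", "antonyms"): ["hot"],
    ("rock", "hyponyms"): ["granite", "basalt"],
}
CALLS = {"count": 0}  # counts actual backend hits, to show caching works

@functools.lru_cache(maxsize=1024)
def lookup(word, relation):
    """Client-facing call: the cache means repeated queries for the
    same (word, relation) pair never touch the backend twice."""
    CALLS["count"] += 1
    return tuple(_BACKEND.get((word, relation), ()))
```

From the client's point of view there is only `lookup`; whether the answer came from the cache, a local index, or a remote call is hidden, which is the point of the service design above.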
Architecture
[Diagram: a client application sends a word over the network/Internet to the WordNet server and receives a word model back. On the server side, an Index Searcher checks the word against a Lucene index (built by an Indexer from the WordNet database via the WordNet API); if the word is not present, a Word Model Creator retrieves it, writes the word model, and returns it to the client.]
Web Browser Integration
• JDesktop Integration Components (JDIC)
• Enables embedding a native web browser in Java applications
• Supported browsers:
– Internet Explorer
– Mozilla
• Supports highlighting words in HTML pages
Web Page Highlight Mechanism
[Flowchart: the user enters a word in the JPanel application and clicks the Highlight Translation button. Relative URLs in the web page are rebuilt and the rebuilt page is cached to disk; the cached page is highlighted, written to disk, and reloaded in the embedded browser (native Internet Explorer). Clicking a highlighted word starts a thread scheduler that gets word translations from the WordNet RMI service via an HTTP server, decodes the data, and fires an event for the application.]
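The core of the highlight step—wrapping matched words in a styled span—can be sketched as follows. This is a naive version (it does not guard against matches inside HTML tags), and the class name is an assumption:

```python
import re

def highlight(html, words, cls="hl"):
    """Wrap whole-word, case-insensitive matches of `words` in a
    <span class="..."> tag so a stylesheet can colour them."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, words)) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(
        lambda m: f'<span class="{cls}">{m.group(0)}</span>', html
    )
```

In the actual mechanism the highlighted page is written back to disk and reloaded; here the function simply returns the marked-up string.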
Concept Mapping: Formal Ontology ↔ Informal Ontology
[Diagram: a formal ontology (representing mutual agreement) is matched against an informal ontology (representing individual beliefs). Matching uses string comparison, a ‘nym’ service, and types & values; the matching result is a merged ontology.]
Concept Mapping Wizard
[Screenshots: Step 1—informal ontology selection. Step 2 (option 1)—formal ontology selection, extracted from formatted text (ontology name, URL, and description). Step 2 (option 2)—choice of mapping approaches.]
Matching imprecise / vague /
informal concepts…
Using WordNet / formal ontologies to match ontologies (i.e.
informal and formal), in addition to the exact & partial string
matching
Based on linguistic/semantic relationships, such as:
antonyms, synonyms, hyponyms, meronyms, holonyms, …
[Diagram: four ontologies containing partially corresponding concepts (e.g. “Dew Point Temperature”, “Temperature”, “Physical Property”), with candidate matching links drawn between concepts across the ontologies.]
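A minimal sketch of combining exact, synonym (‘nym’ service), and partial string matching, as in the matching wizard. The synonym table and threshold are stand-ins, not the real WordNet service or the wizard’s actual tuning:

```python
import difflib

# Stand-in for the 'nym' service (contents invented for illustration).
SYNONYMS = {"temperature": {"warmth"}, "dew point": {"dewpoint"}}

def match(a, b, threshold=0.8):
    """Classify how two concept labels match: exact, synonym,
    partial (fuzzy string similarity), or none."""
    a, b = a.lower().strip(), b.lower().strip()
    if a == b:
        return "exact"
    if b in SYNONYMS.get(a, set()) or a in SYNONYMS.get(b, set()):
        return "synonym"
    if difflib.SequenceMatcher(None, a, b).ratio() >= threshold:
        return "partial"
    return "none"
```

Ordering matters: cheap exact matches are tried first, the semantic service next, and fuzzy string similarity last, since it is the least reliable signal.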
Matching wizard
Matching based on properties
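Matching on properties can be sketched as set overlap between the two concepts’ property names. Jaccard similarity is one plausible measure, not necessarily the one the wizard uses:

```python
def property_similarity(props_a, props_b):
    """Jaccard overlap of two concepts' property-name sets:
    |A ∩ B| / |A ∪ B|, in [0, 1]."""
    a, b = set(props_a), set(props_b)
    if not a and not b:
        return 1.0  # two property-less concepts are vacuously alike
    return len(a & b) / len(a | b)
```

Two map-layer concepts sharing `date` and `scale` but differing in `user` vs. `country` would score 0.5, flagging them as candidates for a human to confirm.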
Summary
• Rich, Living Knowledge
– “Knowledge keeps no better than fish”
-- Alfred North Whitehead
– “You cannot put your foot in the same stream twice”
-- Heraclitus
– “…So let's look at the bird and see what it's doing -- that's
what counts.”
-- Richard Feynman
• Perspectives allow scientists to ‘describe what they
know’ onto shared ontological resources.
• The irony of ontology is that ontologically-based
languages can be used to represent its obverse—
epistemology.
End
Questions?
www.geovista.psu.edu/conceptvista
William Pike’s PhD Dissertation: online
dissertation library at Penn State
e-Learning objects
[Diagram: a learning activity connects a learning approach, a subject (GPS), interactions, tasks, and outcomes.]
Current work: integrating data analysis
and concepts in a single system
Workshop Questions
• What are effective carriers of meaning
(between researchers) in the geosciences?
• A field trip? A photograph? Text? Article? The same
for us all?
– Can these carriers be represented / emulated in
systems?
– How are they best represented / signified?
– Are they top down, or bottom up, or both?