The Open World Assumption
or
Sometimes it’s nice to know what we don’t know
Nick Drummond, Rob Shearer
© 2006, The University of Manchester
About us
‣ Rob Shearer
‣ Automated reasoning for description logics
‣ Scalability of reasoning to very large ABox fact sets
‣ Integration of relational database technology with
DL automated reasoning techniques
‣ UoM Information Management Group
‣ Nick Drummond
‣ OWL and DL knowledge modeling
‣ Ontology authoring tools (Protégé)
‣ UoM Bio-Health Informatics Group
Lots of relevant issues...
‣ Data versus knowledge
‣ Formal metadata semantics versus natural
language or ad-hoc annotations
‣ Automated reasoning versus custom-engineered solutions
‣ Model checking versus consistency checking
‣ Open-world versus closed-world interpretation
Data versus knowledge
‣ Data needs to be interpreted to derive meaning
(knowledge)
‣ The same data can be interpreted many
different ways with different semantics
Name   Age
Rob    29
Nick   85
Possible interpretations of the same table:
‣ Manchester researchers
‣ All Manchester researchers
‣ All Manchester researchers at this workshop
‣ All Manchester researchers at this workshop who are giving presentations
‣ Some Manchester researchers
‣ Interpretation is often performed through
querying and result-set processing
‣ Data is (hopefully!) encoded with respect to a
particular interpretation
‣ Even in databases, the encoding interpretation
is often not “closed-world”
‣ Lots of KR formalisms for recording metadata
Interpretation implementation
‣ Clear definitions of interpretation semantics are
Good Things
‣ Formal knowledge representations have well-understood semantics and provide
unambiguous interpretations
‣ Data can be interpreted with respect to KR
metadata by existing general-purpose tools
‣ Custom code is required to interpret natural-language or ad-hoc metadata
Model checking versus
consistency checking
‣ Different tools do different jobs!
‣ Integrity constraint and RDB query answering
systems check whether a single model defined
by the data satisfies some criteria
‣ “Reasoning systems” consider a space of
“possible models”
‣ Facts or instance data are constraints which must
be true in all models
‣ Additional semantic restrictions (axioms) constrain
the set of models
Note that the terms “constraints” and “restrictions” are
used ambiguously
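The difference between the two kinds of tool can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows, the "every row has an age" rule and the tiny age domain are hypothetical, and a real reasoner does not enumerate completions like this.

```python
from itertools import product

# Hypothetical rows; None marks an age that has not been recorded.
ROWS = [("Rob", 29), ("Nick", None)]

def every_row_has_age(rows):
    """The rule being checked: each row carries an age value."""
    return all(age is not None for _, age in rows)

def model_check(rows):
    """Integrity-constraint style: the rows are the single model."""
    return every_row_has_age(rows)

def consistency_check(rows, domain=(29, 85)):
    """Reasoner style: succeed if any possible model satisfies the axiom."""
    unknown = [i for i, (_, age) in enumerate(rows) if age is None]
    # Each way of filling in the unknown ages is one possible model.
    for values in product(domain, repeat=len(unknown)):
        candidate = list(rows)
        for i, value in zip(unknown, values):
            candidate[i] = (rows[i][0], value)
        if every_row_has_age(candidate):
            return True
    return False

print(model_check(ROWS))        # False
print(consistency_check(ROWS))  # True
```

The same rule rejects the data when the data is taken as the one and only model, but is satisfiable when the data is read as a partial description of many possible models.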
Open world versus closed world
‣ Open-world interpretation
‣ If fact C is true in every model of KB, then C is a
consequence of KB: KB ⊨ C
‣ If C is true in no model of KB, then its negation is
true in every model of KB: KB ⊨ ¬C
‣ If C is true in some models but false in others,
neither C nor its negation is a consequence of KB:
KB ⊭ C; KB ⊭ ¬C
‣ Closed-world interpretation
‣ If a fact C is not a consequence of KB, assume that
its negation is a consequence of KB
‣ KB ⊭ C implies KB ⊨ ¬C
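The two interpretations above can be sketched for a tiny propositional KB by enumerating its models. This is a minimal sketch, not how DL reasoners work in practice; the atoms, the penguin KB and the function names are all hypothetical.

```python
from itertools import product

def models(atoms, constraints):
    """Yield every truth assignment over `atoms` that satisfies all constraints."""
    for values in product([True, False], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        if all(constraint(world) for constraint in constraints):
            yield world

def open_world(atoms, constraints, query):
    """Return 'true', 'false' or 'unknown' by inspecting every model of the KB."""
    results = {query(world) for world in models(atoms, constraints)}
    if not results:
        raise ValueError("KB has no models (it is inconsistent)")
    if results == {True}:
        return "true"      # KB ⊨ C
    if results == {False}:
        return "false"     # KB ⊨ ¬C
    return "unknown"       # KB ⊭ C and KB ⊭ ¬C

def closed_world(atoms, constraints, query):
    """Anything that is not a consequence of the KB is assumed to be false."""
    return "true" if open_world(atoms, constraints, query) == "true" else "false"

# Hypothetical KB: there is a penguin, penguins are birds, nothing is said about flying.
ATOMS = ["penguin", "bird", "flies"]
KB = [
    lambda w: w["penguin"],                     # fact
    lambda w: (not w["penguin"]) or w["bird"],  # axiom: penguin implies bird
]

print(open_world(ATOMS, KB, lambda w: w["bird"]))     # true
print(open_world(ATOMS, KB, lambda w: w["flies"]))    # unknown
print(closed_world(ATOMS, KB, lambda w: w["flies"]))  # false
```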
In the beginning...
‣ Closed World Systems require a place to put
everything
‣ You can’t say anything until there’s somewhere
to say it
‣ Slot on a frame, field on an OO class, column in a
DB
‣ We state what is possible
In the beginning...
‣ When we have an empty
OWL ontology, everything is
possible
‣ We then constrain an
ontology iteratively, making
it more restrictive as we go
‣ We state what is not
possible
Pig → Animal and (hasLimbs only Leg)
Negation as Failure (NaF)
Animal       Can Fly?
Penguin      No
Shark        No
Hummingbird  Yes
‣ Can pigs fly?
‣ In CWA, because the data doesn’t contain this fact, we assume
false
‣ In the OWA, unless we have a statement (or we can infer) “pigs
can/cannot fly” we return “don’t know”
‣ NaF - only false if “not(pigs can fly)”
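As a rough sketch of the pig question, the table can be held in a dictionary and queried in two ways. The function names are hypothetical, and `None` is used here to stand for "don't know".

```python
# The "Can Fly?" table from the slide.
CAN_FLY = {"Penguin": False, "Shark": False, "Hummingbird": True}

def can_fly_cwa(animal):
    """Closed world with negation as failure: a missing row is treated as 'No'."""
    return CAN_FLY.get(animal, False)

def can_fly_owa(animal):
    """Open world: a missing row yields None, standing for 'don't know'."""
    return CAN_FLY.get(animal)

print(can_fly_cwa("Pig"))  # False
print(can_fly_owa("Pig"))  # None
```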
What is the Semantic Web?
‣ A vision of a computer-understandable web
‣ Distributed knowledge and data in reusable
form
Semantic Web Languages
‣ On the Semantic Web, we expect people to
extend our models
‣ But we don’t want to worry in advance how
Incomplete Information
‣ The OWA assumes incomplete information by
default
‣ We can intentionally underspecify and allow
others to reuse and extend
‣ eg All sharks liveInHabitat some WaterHabitat
‣ Are there fresh/seawater sharks?
‣ Do we care? Someone might
‣ It can be useful to reuse
Reuse is good
‣ Be more specific when the application
demands it
‣ In OWL, we extend an ontology by adding
statements, ie we cannot take any away
‣ By only committing to an answer if there is a
statement to back it up, OWL remains
monotonic
‣ if we extend an ontology, all existing true
statements remain true
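Monotonicity can be illustrated with a toy statement store. This is a sketch under assumptions: statements are hypothetical ("yes" or "no", fact) pairs, and the extension is assumed not to contradict the original statements.

```python
def owa_answer(kb, fact):
    """Commit to an answer only when a statement backs it up."""
    if ("yes", fact) in kb:
        return True
    if ("no", fact) in kb:
        return False
    return None  # don't know

def cwa_answer(kb, fact):
    """Closed world: whatever is not stated is taken to be false."""
    return ("yes", fact) in kb

kb = {("yes", "hummingbird flies"), ("no", "penguin flies")}
extended = kb | {("yes", "pig flies")}  # extend by adding a statement
facts = ["hummingbird flies", "penguin flies", "pig flies"]

# Open world: every answer that was already definite is unchanged.
owa_stable = all(
    owa_answer(extended, f) == owa_answer(kb, f)
    for f in facts
    if owa_answer(kb, f) is not None
)
# Closed world: the assumed "no" for pigs is overturned by the extension.
cwa_stable = all(cwa_answer(extended, f) == cwa_answer(kb, f) for f in facts)

print(owa_stable)  # True
print(cwa_stable)  # False
```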
Interpreting Knowledge
‣ Is there a speaker at tea/coffee?
‣ Are there going to be biscuits at this meeting?
Time         Activity                                           Speaker
09:00        Welcome                                            Jessie Kennedy
09:10        Data webs: new visions for research                David Shotton
09:40        Closed World Assumption                            Chris Date
10:25        Open World Assumption                              Nick Drummond
10:40-11:00  Tea/Coffee
11:00        The Semantic Gap between Databases and Ontologies  Catherine Dolbear
11:30        Nullogy                                            Chris Date
‣ CWA says “No”
‣ OWA says “Don’t know” unless a blank is interpreted as “Activity and not(hasSpeaker)”
Interpreting Knowledge
‣ I want to treat my patient with a painkiller that is not an anticoagulant
Drug         Effect
Aspirin      Painkiller
Warfarin     Anticoagulant
Paracetamol  Painkiller
‣ CWA says “Aspirin”, “Paracetamol”
‣ OWA can’t say this unless we make explicit “Paracetamol is not an anticoagulant”
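The drug query can be sketched as follows. The table contents come from the slide; the function names and the `lacks_effect` set of explicit negative statements are hypothetical.

```python
# (drug, effect) pairs from the slide's table.
HAS_EFFECT = {
    ("Aspirin", "Painkiller"),
    ("Warfarin", "Anticoagulant"),
    ("Paracetamol", "Painkiller"),
}
DRUGS = sorted({drug for drug, _ in HAS_EFFECT})

def safe_painkillers_cwa():
    """Closed world: no 'Anticoagulant' row means 'not an anticoagulant'."""
    return [
        d for d in DRUGS
        if (d, "Painkiller") in HAS_EFFECT
        and (d, "Anticoagulant") not in HAS_EFFECT
    ]

def safe_painkillers_owa(lacks_effect):
    """Open world: only drugs explicitly stated not to be anticoagulants qualify."""
    return [
        d for d in DRUGS
        if (d, "Painkiller") in HAS_EFFECT
        and (d, "Anticoagulant") in lacks_effect
    ]

print(safe_painkillers_cwa())                                    # ['Aspirin', 'Paracetamol']
print(safe_painkillers_owa(set()))                               # []
print(safe_painkillers_owa({("Paracetamol", "Anticoagulant")}))  # ['Paracetamol']
```

Under the open world reading, the result grows only as explicit "is not an anticoagulant" statements are added.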
How do we choose?
‣ Not always clear cut
‣ Many problem domains have aspects of both
Open World Problem
‣ Does Nick Drummond know Chris Date? Rob Shearer?
‣ Do we bomb/trust this battlefield unit?
‣ Is drug X suitable for treating disease X? (would need closure for each)
Closed World Problem
‣ Is there a train from Manchester to Edinburgh today? (only x trains and y destinations)
‣ Find me drugs that are not licensed for Y?
‣ Has my package been delivered yet?
Why the Open World?
‣ Underspecification
‣ abstract, nested and unnamed entities
‣ Easily reusable (and extendable)
‣ Good at knowledge level (Ontology)
‣ Good at “schema”-“schema” mapping
‣ eg asserting/inferring equivalents
‣ They naturally deal with incomplete information
‣ eg Domain knowledge (eg science) - where we
don’t know all of the answers yet
Why not(Open World)?
‣ Paradigm shift
‣ Involves technology/experience catch up
‣ Some problems are inherently closed world (often those that
we ask “which are not...” or have a finite number of elements)
‣ but it is possible to close the open world
‣ Dealing with defaults/exceptions
‣ CWA good at dealing with schema-data mapping
‣ integrity constraints, validation (parsing, form
generation)
‣ Data structures are typically closed
‣ Meta-query
‣ What do we know???
Conclusion
‣ OWA is good for describing knowledge in a
way that is extensible
‣ CWA is good for constraining and validating
data
‣ OWA and CWA are different ways of
interpreting data and can be used alongside
each other
‣ To overcome the difficulties of queries in each,
perhaps we need decent explanation support
for entailments?
Thank you
thanks to Alan Rector, Robert Stevens, Boris
Motik, Héctor Pérez-Urbina, Bijan Parsia and
BHIG at Manchester
Questions?
Other Issues
‣ Over/under constraining
‣ SPARQL, RDQL
‣ Single vs Multi model?
Terminology note - “Constraints”
‣ Much confusion
‣ Can mean...
‣ Integrity constraints
‣ prevent “incorrect” values from being asserted in a model
‣ used for validation/parsing/data input
‣ single model (usually) that contains only the facts asserted
‣ logical axioms
‣ eg restrictions, property domain/range
‣ everything can be true unless proven otherwise
‣ multiple possible models can satisfy the axioms
‣ this may cause some unintuitive inferences
Unique Name Assumption (UNA)
‣ If 2 things have different names (IDs) they are,
by default, different
‣ But...
Gnashers
Teeth
Dientes
Dents
Pearly Whites
Zähne
Denti
Unique Name Assumption
‣ CWA typically makes the UNA
‣ Useful for counting
‣ OWA doesn’t (always) make the UNA
‣ To allow later assertion that two things are the
same or different (or this may be inferred)
‣ note: negation is required for distinctness
‣ RDF cannot make assertions about things being different
‣ OWL and many other logics can
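The effect of the UNA on counting can be sketched like this. It is an illustration only: the `same_as` pairs are hypothetical assertions, and the second function computes just an upper bound, since without statements of difference the names could all denote one thing.

```python
def count_with_una(names):
    """With the UNA, every distinct name is a distinct individual."""
    return len(set(names))

def max_count_without_una(names, same_as):
    """Without the UNA we only get an upper bound: merge names asserted to be the same."""
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the representative, shortening the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in same_as:
        parent[find(a)] = find(b)
    return len({find(n) for n in names})

NAMES = ["Teeth", "Dientes", "Dents", "Zähne", "Denti"]
SAME_AS = [("Teeth", "Dientes"), ("Dents", "Dientes")]  # hypothetical assertions

print(count_with_una(NAMES))                  # 5
print(max_count_without_una(NAMES, SAME_AS))  # 3
```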
Closure of Open World
‣ Common or garden closure
‣ disjoints, universals, covering & closure axioms
‣ Domain, Concept, Role closure
‣ K operator, query set subtraction etc
‣ where to handle non-monotonicity
‣ (in query/app level, once only transform?)
This talk is not...
‣ OWL vs Databases
‣ Databases deal with HOW data is stored - you
can store OWL in databases
‣ OWL is about representing knowledge with
machine understandable semantics