uos-prep - Ontolog

advertisement
Upper Ontology Symposium
For Your Eyes Only
Dr. Douglas B. Lenat
, 3721 Executive Center Drive, Suite 100, Austin, TX 78731
Email: Lenat@cyc.com
Phone: (512) 342-4001
March 14, 2006
2 July 2005
1
Upper Ontology Symposium
For Your Eyes Only
•
•
•
•
•
•
Tues. a.m.: issues talk, to UO community 15 min
Tues. p.m.: review my 1st public Wed. talk
Wed. a.m.: semi-public: UO Applications 15 min
Wed. p.m.: public: value of formal ontol 25min
Wed. p.m.: public: about our communique 10min
Thu.: related talk at Mitre for S&T analysts
March 14, 2006
2
Upper Ontology Symposium
• To first order: I agree with the communiqué (apple pie)
• How/why I was forced into this field
• Upper ontology mostly just impacts efficiency
– Of the vocabulary (lower ontology): fewer terms, simpler terms
– Of the axioms: fewer, terser, less ambiguous
– Of the various types of cross-ontology mapping axioms
• What needs to be shared
• No “correct” UO; and yet no need for separate indep. UO’s
– Have contexts (“microtheories”) and an ist relation
• Ontologies at that point seem to be normal 1st-class objects
• As with any important region of the ontology, facet that
– 12 useful (categories of) facets or “dimensions” of ontology-space
• Just a few remarks about OpenCyc and ResearchCyc
March 14, 2006
3
Upper Ontology Symposium
• To first order: I agree with the communiqué (apple pie)
• How/why I was forced into this field
• Upper ontology mostly just impacts efficiency
– Of the vocabulary (lower ontology): fewer terms, simpler terms
– Of the axioms: fewer, terser, less ambiguous
– Of the various types of cross-ontology mapping axioms
• What needs to be shared
• No “correct” UO; and yet no need for separate indep. UO’s
– Have contexts (“microtheories”) and an ist relation
• Ontologies at that point seem to be normal 1st-class objects
• As with any important region of the ontology, facet that
– 12 useful (categories of) facets or “dimensions” of ontology-space
• Just a few remarks about OpenCyc and ResearchCyc
March 14, 2006
4
How/why I was forced into this
Goal: Amplify human beings via ubiquitous real AI
Reality: Throughout the 1960’s and 1970’s, every
subfield of AI kept hitting the same brick wall:
BRITTLENESS BOTTLENECK
NL understanding, speech understanding,
robotics, learning, expert systems, search,
semantic database integration,…
(Programs need massive amounts/coverage of
common sense and general world knowledge)
March 14, 2006
5
UO mostly just impacts efficiency
• Of the vocabulary (lower ontology):
fewer and simpler terms
Ex: non-souled trees and souled trees.
Ex: FranceIn1985
Ex: grue and bleen
• Of the axioms: fewer, terser, less ambiguous
– Ex: things grue by day are usually bleen at night
– Ex: when smurfing a car, first smurf the key
• Of the cross-ontology mapping axioms
March 14, 2006
6
Thing
Intangible
Individual
SpatialThing
Situation
SetOr
Collection
Intangible
Individual
Temporal
Thing
Mathematical
Object
Event
Something
Existing
Static
Situation
Partially
Tangible
Role
Time
Interval
Physical
Event
Relationship
Attribute
Value
TruthFunctional
FunctionDenotational
Logical
Connective
Predicate
Partially
Intangible
SetMathematical
Quantifier
ActorSlot
Configuration
Composite
TangibleAnd
IntangibleObject
Intangible
Existing
Thing
March 14, 2006
7
Collection
So if the UO mostly impacts
efficiency, where is the power?
Water is wet
upper
ontology
Vehicles slow down in bad weather
task-specific knowledge
HUMMV’s lose 18% traction in 4-inch-deep mud
March 14, 2006
8
So if the UO mostly impacts
efficiency, where is the power?
• The Upper Level need only be adequate
• The Intermediate Level is locus of power
• The Lower Levels supply the minutiae
So Upper + Intermed. is what we need to share
March 14, 2006
9
Answering even an innocuous-sounding question:
“Can vehicle X get from Y to Z by time t ?”
may require intermediate-level knowledge about
localized spatial things, pathways,
earth sciences, weather, topography,
oceanography (depth, temperature, biota),
terrain, transportation, industry, vehicles,
geopolitics (“international waters”),
communications, the driver, holidays, ...
So Upper + Intermed. is what we need to share
March 14, 2006
10
What Needs to be Shared?
•
•
•
•
•
•
•
bits/bytes/streams/network…
alphabet, special characters,…
words, morphological variants,…
syntactic meta-level markups (HTML)
semantic meta-level markups (SGML, XML)
content (logical representation of doc/page/...)
context (models of the user’s prior/tacit knowledge
(incl. common sense, recent history), wants/needs,
budget,…and n dimensions of metadata: time, space,
level of granularity, the source’s purpose/ideology...)
March 14, 2006
11
To do the logical/arithmetic combination
across information sources, we need
What
Needs
to
be
Shared?
tens of thousands of relations, not tens
• DAML+OIL,
bits/bytes/streams/network…
OWL add a few more distinctions:
• alphabet, special characters,…
inverses,
unambiguous
properties,
unique
• words, morphological variants,…
properties, lists, restrictions, cardinalities,
• syntactic meta-level markups (HTML)
pairwise disjoint lists, datatypes, …
• semantic meta-level markups (SGML, XML)
• content (logical representation of doc/page/...)
Tiny vocabulary (# distinctions) of standard relations:
• context
(modelslabel,
of the domain,
user’s prior/tacit
knowledge
rdf:type,
subclass,
range, comment,…
(incl. common sense, recent history), wants/needs,
Beyond n
which
diversity
is tolerated
budget,…and
dimensions
of metadata:
time, space,
Which
means divergence
inevitable
level
of granularity,
the source’sispurpose/ideology...)
“What do you mean we have no standard, we have lots of standards!”
March 14, 2006
12
To do the logical/arithmetic combination
across information sources, we need
What
Needs
to
be
Shared?
tens of thousands of relations, not tens
•
•
•
•
•
•
•
bits/bytes/streams/network…
alphabet, special characters,…
words, morphological variants,…
syntactic meta-level markups (HTML)
semantic meta-level markups (SGML, XML)
content (logical representation of doc/page/...)
context (common sense, recent utterances, and n
dimensions of metadata: time, space, level of
granularity, the source’s purpose, etc.)
March 14, 2006
13
There is no “correct” UO
•
•
•
•
Are apes monkeys?
Are poinsettias red flowers?
Do we need to distinguish instance & subtype?
Are these two terms one and the same thing?
– Black US Presidents in the 20th Century
– Female US Presidents in the 20th Century
• Davidsonian reification of events or not?
March 14, 2006
14
(marriedIn <groom> <bride> <wedding> <date>)
Events are rich (no limit to the number of arguments)
(groom Wedding0947 JoeSmith)
(bride Wedding0947 JaneDoe)
(dateOfEvent Wedding0947
(DayFn 13 (MonthFn May (YearFn 1999))))
March 14, 2006
15
March 14, 2006
16
No need for separate ontologies
• Are apes monkeys?
• Are poinsettias red flowers?
(ist
<context>
<assertion>)
• Do
we need
to distinguish instance & subtype?
Each
of these
is true
some contexts
and
false
in others
• Are
these
twointerms
one and
the
same
thing?
Contexts
(microtheories)
are themselves
in the ontology.
– Black
US Presidents
in the 20thterms
Century
(genlMt
HockeyMt
SportsMt)in the 20th Century
– Female
US Presidents
12• facets
or dimensions
that (largely)
a Mt.
Davidsonian
reification
of characterize
events or not?
March 14, 2006
17
“If it’s raining, carry an umbrella”
 the performer is a human being,
 the performer is sane,
 the performer can carry an umbrella; thus:







the performer is not a baby, not unconscious, not dead,
the performer is going to go outdoors now/soon,
their actions permit them a free hand (e.g., not wheelbarrowing)
their actions wouldn’t be unduly hampered by it (e.g., marathon-running)
the wind outside is not too fierce (e.g., hurricane strength)
the time period of the action is after the invention of the umbrella
the culture is one that uses umbrellas as a rain- (not just sun-)protection device,
the performer has easy access to an umbrella; thus:
not too destitute, not someone who lives where it practically never rains,
not at the office/theater/… caught without an umbrella
 the performer is going to be unsheltered for some period of time
the more waterproof their clothing, the gentler the rain, and
the warmer the air, the longer that time period
 the performer will not be wet anyway (e.g., swimming)
 the rain is annoying -- but merely annoying. Thus:
not ammonia rain on Venus, radioactive post-apocalyptic rain,
biblical (Noah’s-ark-sized, or frogs/blood as rained on Pharaoh)
the performer is not a hydrophobic person, gingerbread man, etc.,
and not a hydrophilic person, someone dying of thirst, etc.
March 14, 2006
18
12 Dimensions of Ontol. Contexts
•
•
•
•
•
•
•
•
•
•
•
•
Anthropacity / Let’s
Time
GeoLocation
TypeOfPlace
TypeOfTime
Culture
Sophistication/Security
Topic
Granularity
Modality/Disposition/Epistemology
Argument-Preference
Justification
March 14, 2006
19
How we evaluate proposed dimensions
Criteria:
 Do they separate out mutually-irrelevant (and esp.
mutually-incompatible) portions of the KB?
 Is it easy for Cyc to mechanically compute the overlap
or disjointness of regions of n-dim. context-space?
 Cognitive assonance: Do they (esp. their extrema)
correspond to familiar real-world notions?
 Using them, is it empirically faster to enter assertions?
 Using them, is it empirically faster to do inference?
March 14, 2006
20
Context feature: Time
• The piece of time (the 1920s, the first five years after WWII,
the Pleistocene Era) in which a context’s assertions hold.
• Useful because:
• Facts about very distant time periods are often
mutually irrelevant; if stated tersely, they are often
inconsistent.
• Inefficient to temporally qualify each assertion
individually.
• In many reasoning contexts, causes precede effects by
a small amount of time.
March 14, 2006
21
Context feature: Spatial Location
• The piece of space (Lebanon, my bloodstream, the Southern
Hemisphere, Mike Ditka’s backyard) in which a context’s
assertions hold.
• Useful because:
• Facts about very distant locations are often mutually
irrelevant; if stated tersely, they are often inconsistent.
• Inefficient to spatially qualify each assertion individually.
• In many reasoning contexts, interacting objects and events
are usually spatially proximate.
March 14, 2006
22
Context feature: Culture
• The cultural point of view assumed by the assertions in a
context.
• This dimension has many subdimensions, e.g.:
• political culture, sexual culture, sexual orientation culture,
age culture, generation culture, religious culture, ancestral
culture, geo-political culture, regional culture, region-type
culture, legal culture, and more
March 14, 2006
23
Culture, ctn’d
• Useful because:
• In many reasoning contexts, some cultural perspective (or
set of perspectives) is assumed, and other perspectives are
not relevant.
• Accuracy of applications which involve reasoning about
agents’ intent and expectations requires sensitivity to
variations in cultural context.
• Many good ways to infer the cultural POV of author.
• This dimension is quite possibly the most difficult of all: very
complex, very hard to separate fact from preconception. Very
hard to maintain objectivity and not antagonize everyone.
March 14, 2006
24
Context feature: Sophistication
• The level of information, education, intelligence or other capacity
for knowledge assumed by the assertions in this context.
• What capacity for knowledge would a person need in order to:
– Understand the assertions, to learn them in this form,
– Recognize them as true, once they are hinted at or stated,
– Already be familiar with the assertions, at least theoretically,
– Have already deeply assimilated the content of the assertions?
• Useful for dialogue, and collaborative planning applications.
March 14, 2006
25
Context feature: Granularity
• The level of coarseness assumed by assertions in a context.
• This dimension has many potential subdimensions:
– size of objects, duration of events, parts, subevents,
suborganizations, specificity or abstraction of classes
(collections), relationships, measurements
– Ex: Newtonian vs Relativistic vs Quantum physics
• Useful because: answers often vary drastically depending on the
granularity desired.
March 14, 2006
26
Evaluating proposed dimensions
Criteria:
 Do they separate out mutually-irrelevant (and esp.
mutually-incompatible) portions of the KB?
 Is it easy for Cyc to mechanically compute the overlap
or disjointness of regions of n-dim. context-space?
 Cognitive assonance: Do they (esp. their extrema)
correspond to familiar real-world notions?
 Using them, is it empirically faster to enter assertions?
 Using them, is it empirically faster to do inference?
March 14, 2006
27
Mathematical Factoring of
MetaData Dimensions
UnitedStatesIn1985Context:
Ronald
Reagan900,000
is president.
There are
at least
doctors.
PennsylvaniaIn1985Context:
Dick Thornburgh is governor.
LehighCountyInFebruary1985Context:
Dick Thornburgh
Thornburgh is
is governor
governor and
and there
Ronald
Dick
Reagan
president.
are atisleast
900,000 doctors.
March 14, 2006
28
Time Indices and Granularities
Doug is talking,
talking. at 1:45 to 2:45, on 5/4/04.
Therefore:
Doug is talking, at 2:05 to 2:40, on 5/4/04.
But not:
Doug is talking, at 2:11:15, on 5/4/04.
March 14, 2006
29
Calculi for deciding (dimension by dimension) in
Backward
Inference
what context we can assert a logical conclusion
Pa
(implies Px Qx)
t1
t2
tt34
Qa ?
Qat4can
be inferred
t3, withofgranularity
, if tof
some
If
subsumes
some at
instance
the granularity
Pa, 1, and
some
3 subsumes
instance
Q), 2, then
Qa is
instance of
of the
the granularity
granularity of
of (implies
Pa, and someP instance
of the
inferred
at t4of
, with
each granularity
theset
of least
minimal
upper
granularity
(implies
Px Qx),inand
is at
as big
as both
bounds
(1 2).
of theseof
granularities.
March 14, 2006
30
Inferring Context Subsumption
(genlMt
(MtSpace
LebanonDataContext
June1985LebanonDataContext
(TimeIndex: June, 1985)
(TemporalGranularity: Month)
(SpatialGranularity: Governorate))
(MtSpace
SouthWestAsiaDataContext
1985SouthWestAsiaDataContext
(TimeIndex: 1985)
(TemporalGranularity: Day)
(SpatialGranularity: SquareMile)))
March 14, 2006
31
Getting back to:
No need for separate ontologies
• Declarative assertions that map them to Cyc
• And thereby map between them
(using Cyc as an interlingua)
• Create a context or Mt for each external
ontology O; eventually, there is enough in and
about each such Mt that it almost subsumes O.
• “Almost” because O might be optimized in
some way repr./algorithmically (e.g., a DB)
March 14, 2006
32
WordNet 2.0
Penman Upper Model
Federal information Processing Standard 10-4
Veridian Contract Killing DB
SNOMED 3.3
the LSCOM object and situation ontology
NGA Names-Places DB
the 2002 nuSketch KB
the Veridian Ontology
Feature and Attribute Coding Catalogue
MeSH 1997
CNLP Ontology
SENSUS-Information1997
the Open GIS Simple Features Specification for SQL
Open GIS Simple Features Specification For SQL
the SRA Template Extraction Ontology
Unicode
CC2 Ontology Notes
CPR Ontology
the Inxight Thing Finder classification category database
Military Standard 2408
Horus Person Organization Ontology
Unified Soil Classification System
Horus Installation Ontology
ConceptNet 2.0
the Syracuse Ontology
NASIS
PLANET Indexed Information
Source
March
14, 2006
33
"(synonymousExternalConcept TERM SOURCE STRING)
means that the CycL expression TERM is synonymous
with at least one of the interpretations of STRING in the
external data source SOURCE."
(synonymousExternalConcept InnerEar
MeSH-Information1997 "Labyrinth | A9.246.631")
(synonymousExternalConcept Temperature CNLPOntology
"temp")
(synonymousExternalConcept Concerto
WordNet-Version2_0
"N06611782")
(synonymousExternalConcept
PowerGenerationComplex-Nuclear
LSCOMObjectAndSituationOntology
“power plant (nuclear)” )
March 14, 2006
34
"(overlappingExternalConcept TERM SOURCE STRING)
means that the CycL expression TERM overlaps
semantically with at least one of the interpretations of
STRING in the external data source SOURCE."
(overlappingExternalConcept TextualPCW CNLPOntology
"document")
(overlappingExternalConcept defectors
HorusPersonOrganizationOntology "splinterFromOrg")
(overlappingExternalConcept SpleniusCapitis
MeSH-Information1997 "Neck Muscles | A2.633.567.650")
March 14, 2006
35
"(codeMapping MAP CODE DENOTATION)
specifies one mapping for the reified mapping MAP. When
a table uses MAP to interpret some field, the value CODE
in that field will be interpreted as DENOTATION."
(codeMapping FACC-FeatureType-CMLS "BH100"
Moat)
(codeMapping NGA-FeatureType-CMLS
(GroupFn Plain-Topographical))
"PLN"
(codeMapping FACC-FeatureType-CMLS "GB020"
AircraftArrestingGear)
(ForAll ?x
(fieldDecoding USGS-GNIS-LS ?x
(TheFieldCalled “population”)
(numberOfInhabitants
(TheReferentOfTheRow USGS-GNIS) ?x)))
March 14, 2006
36
Upper Ontology Symposium
• To first order: I agree with the communiqué (apple pie)
• How/why I was forced into this field
• Upper ontology mostly just impacts efficiency
– Of the vocabulary (lower ontology): fewer terms, simpler terms
– Of the axioms: fewer, terser, less ambiguous
– Of the various types of cross-ontology mapping axioms
• What needs to be shared
• No “correct” UO; and yet no need for separate indep. UO’s
– Have contexts (“microtheories”) and an ist relation
• Ontologies at that point seem to be normal 1st-class objects
• As with any important region of the ontology, facet that
– 12 useful (categories of) facets or “dimensions” of ontology-space
• Just a few remarks about OpenCyc and ResearchCyc
March 14, 2006
37
Cyc Knowledge Base
Cyc contains:
15,000 Predicates
68,000 Collections
300,000 Concepts
3,200,000 Assertions
Intangible Individual
Thing
Sets
Relations
Space
Physical
Objects
Living
Things
Ecology
Natural
Geography
Political
Geography
Weather
Earth &
Solar System
Paths
Actors
Actions
Movement
State Change
Dynamics
Plans
Goals
Physical
Agents
Plants
Human
Anatomy &
Physiology
Temporal
Thing
Partially
Tangible
Thing
Logic
Math
Borders
Geometry
Animals
Emotion
Human
Products Conceptual
Perception Behavior &
Devices
Works
Belief
Actions
Vehicles
Buildings
Weapons
Spatial
Thing
Spatial
Paths
Materials
Parts
Statics
Life
Forms
Human
Beings
Human
Artifacts
Represented in:
• First Order Logic
• Higher Order Logic
Time
• Context Logic
Events
Scripts
• Micro-theories
Agents
Artifacts
Thing
Mechanical
Software
Social
& Electrical Literature Language Relations,
Devices
Works of Art
Culture
Organizational
Actions
Organizational
Plans
Agent
Organizations
Social
Behavior
Organization
Social
Activities
Human
Activities
Business &
Commerce
Purchasing
Shopping
Types of
Organizations
Politics
Warfare
Sports
Recreation
Entertainment
Transportation
& Logistics
Human
Organizations
Nations
Governments
Geo-Politics
Professions
Occupations
Travel
Communication
Law
Everyday
Living
Business,
Military
Organizations
General Knowledge about Various Domains
Specific data, facts, and observations
March 14, 2006
38
OpenCyc
Cyc contains:
15,000 Predicates
68,000 Collections
300,000 Concepts
3,200,000 Assertions
+ 1M taxon./mereol. Axioms
+ inference engines, interfaces
ResearchCyc
All of that + whole Cyc KB
March 14, 2006
39
Temporal Relations
37 Relations Between Temporal Things
#$temporalBoundsIntersect
#$temporallyIntersects
#$temporalBoundsContain
#$temporalBoundsIdentical
#$startsAfterStartingOf
#$startsDuring
#$endsAfterEndingOf
#$overlapsStart
#$startingDate
#$startingPoint
#$temporallyContains
#$simultaneousWith
#$temporallyCooriginating
#$after
March 14, 2006
40
Senses of ‘Part’
#$parts
#$intangibleParts
#$subInformation
#$subEvents
#$physicalDecompositions
#$physicalPortions
March 14, 2006
#$physicalParts
#$externalParts
#$internalParts
#$anatomicalParts
#$constituents
#$functionalPart
41
Senses of ‘In’
• Can the inner object leave by passing between
members of the outer group?
– Yes -- Try #$in-Among
March 14, 2006
42
Senses of ‘In’
• Does part of the inner
object stick out of the
container?
– If the container were
turned around could
the contained object
fall out?
– None of it. -- Try
#$in-ContCompletely
Yes -- Try
#$in-ContOpen
– Yes -- Try
#$in-ContPartially
No -- Try
#$in-ContClosed
March 14, 2006
43
Senses of ‘In’
Is it attached to the inside
of the outer object?
– Yes -- Try
#$connectedToInside
Can it be removed, if
enough force is used,
without damaging either
object?
– Yes -- Try #$in-Snugly or
#$screwedIn
Does the inner object
stick into the outer
object?
Yes -- Try #$sticksInto
March 14, 2006
44
Event Types
#$PhysicalStateChangeEvent
#$TemperatureChangingProcess
#$BiologicalDevelopmentEvent
#$ShapeChangeEvent
#$MovementEvent
#$ChangingDeviceState
#$GivingSomething
#$DiscoveryEvent
#$Cracking
#$Carving
#$Buying
#$Thinking
#$Mixing
#$Singing
#$CuttingNails
#$PumpingFluid
11,000 more
March 14, 2006
45
Relations Between
an Event and its Participants
#$performedBy
#$causes-EventEvent
#$objectPlaced
#$objectOfStateChange
#$outputsCreated
#$inputsDestroyed
#$assistingAgent
#$beneficiary
#$fromLocation
#$toLocation
#$deviceUsed
#$driverActor
#$damages
#$vehicle
#$providerOfMotiveForce
#$transportees
Over 400 more.
March 14, 2006
46
Propositional Attitudes
Relations Between Agents and Propositions
•
•
•
•
•
•
#$goals
#$intends
#$desires
#$hopes
#$expects
#$beliefs
•
•
•
•
•
•
March 14, 2006
#$opinions
#$knows
#$rememberedProp
#$perceivesThat
#$seesThat
#$tastesThat
47
Devices
• Over 4000 Specializations
of #$PhysicalDevice
Device Specific
Predicates
•
– #$ClothesWasher
– #$NuclearAircraftCarrier
•
• Vocabulary for Describing
Device Functions
– #$primaryFunction-DeviceType
March 14, 2006
#$gunCaliber
#$speedOf
Device States (40+)
#$DeviceOn
#$CockedState
48
Lexical Entry Example: Coke
Constant : Coke-TheWord
isa : EnglishWord
Mt : EnglishMt
singular : “coke”
massNumber : “coke”
pnSingular : “Coke”
pnMassNumber : “Coke”
(denotation Coke-TheWord ProperCountNoun 0 (ServingFn CocaCola))
(denotation Coke-TheWord ProperMassNoun 0 CocaCola)
(denotation Coke-TheWord MassNoun 0 Cocaine-Powder)
(denotation Coke-TheWord MassNoun 2 ColaSoftDrink)
(denotation Coke-TheWord SimpleNoun 0 (ServingFn ColaSoftDrink)
<various other denotations of the English word “coke”>
March 14, 2006
49
Lexical Entry Example: Eat
Constant: Eat-TheWord
isa: EnglishWord
Mt: EnglishMt
infinitive: “eat”
perfect: “eaten”
pastTense: “ate”
agentive-Sg: “eater”
(subcatFrame Eat-TheWord Verb 0 TransitiveNPCompFrame)
(verbSemTrans Eat-TheWord 0 TransitiveNPCompFrame
(and (isa :ACTION EatingEvent)
(performedBy :ACTION :SUBJECT)
(inputsDestroyed :ACTION :OBJECT)))
March 14, 2006
50
Cyc NL Lexicon
English Words
Syntactic Frame Links
Single-word Denotation Mappings
Multi-word Phrase Denotation Mappings
Verbal Semantic Frame Links
Noun Semantic Frame Links
Pragmatic Assertions
Names (Includes chemical symbols, person/place/organizatioin
18,737
13,922
25,999
43,508
3,517
2,396
1,232
171,093
names, acronyms, etc.)
Predicate-based Phrasal Links (genTemplates for
paraphrase)
10,327
Geospatial Classes
1100 Atomic types, 338
 functionally specified ones
•
•
•
•
•
•
•
•
•
•
•
•
CrabFishery
LakeBed
MonsoonForest
MudFlat
USCS-Code-CL
Glacier
Ridge
Butte
Cave
MinedArea
PostalCodeRegion
Prefecture
TownSquare
Quarry
Atoll
Continent
TrueContinent
(FieldFn OliveTree)
(CityInCountryFn Cuba)
Protectorate
IndependentCountry
Colony
SchoolDistrict
Monarchy
March 14, 2006
52
Predicates of Geospatial Entities
Over 500
•
•
•
•
•
•
•
terrainType
maximumDepth
cloudCeiling
importsFrom
regionalPastimes
populationDensity
trafficableForVehicle
freightRailTrafficRate
internetCountryCode
hasClimateType
languagesSpokenHere
highestPointInRegion
waterAreaOfRegion
canopyClosureOfRegion
March 14, 2006
53
OpenCyc
Open Source release of: 300k-term Cyc
Ontology + 1M Simple Relns. + 720 Inference
Engines, NL Parser/Generator, Ontol Mappers
ResearchCyc
All of Cyc (free, for R&D purposes)
FACTory
Free online “match game” to check/add to the
ontology and, more generally, to the Cyc KB
March 14, 2006
54
90,000 OpenCyc Users/Contributors,
73 Active ResearchCyc User Groups:
Government-related
Government
Language Computer
Corporation
Air Force
Rome Labs
21st
Century
Technologies
Houston
VA Medical Center
Xerox PARC
Stone’s Throw
Technologies
SRI
ISI
Daxtron Labs
Austin Info Systems
Lockheed Martin ATLD
U of Illinois Urbana-Champaign
University
U of Maryland
MIT Media Lab
Northwestern U
Commercial
ANSER, Inc.
NTT
Communications Science
Laboratories (Japan)
Fraunhofer Institute
Sapio Systems (Denmark)
Terra Incognita
Trimtab Consulting
Stanford NLP Dept.
TNO-DMV (Netherlands)
U of Pennsylvania
Rensselaer AI and Reasoning Lab
Microfabrica, Inc.
LBJ School of
Public Affairs
U of Toronto
Radboud U
(Netherlands)
Knowledge Media
Institute, Open
University
U of Stuttgart
U of Minnesota
Witan International
New Mexico
Highlands Univ.
Harvard U
Linkoping U
(Sweden)
U of Hawaii
Institute for the Study
Of Accelerating Change
NPOs
Tokyo Inst.
of Technology
March 14, 2006
55
Upper Ontology Symposium
• To first order: I agree with the communiqué (apple pie)
• How/why I was forced into this field
• Upper ontology mostly just impacts efficiency
– Of the vocabulary (lower ontology): fewer terms, simpler terms
– Of the axioms: fewer, terser, less ambiguous
– Of the various types of cross-ontology mapping axioms
• What needs to be shared
• No “correct” UO; and yet no need for separate indep. UO’s
– Have contexts (“microtheories”) and an ist relation
• Ontologies at that point seem to be normal 1st-class objects
• As with any important region of the ontology, facet that
– 12 useful (categories of) facets or “dimensions” of ontology-space
• Just a few remarks about OpenCyc and ResearchCyc
March 14, 2006
56
If there’s a few more minutes…
Parting sermon:
resist temptation, brothers and sisters!
March 14, 2006
57
Eschew the 5 pitfalls (ways to cut
ontological corners and end up with
something that only appears to work)
• Ignorance-based: Have a small theory size (#terms, #instances, #rules)
• Static KB (can be massively tuned, optimized, cached, etc. ahead of time)
• Simple assertions (e.g., SAT constraints; propositional calculus; Horn;…)
• One global context (no contradictions, limited domain, simplified world)
• Don’t do all the bookkeeping and forward inference required for justification
maintenance (or, equivalently, don’t ever have truth maintenance “turned on”)
March 14, 2006
58
Eschew the 5 pitfalls (ways to cut
ontological corners and end up with
something that only appears to work)
• Ignorance-based: Have a small theory size (#terms, #instances, #rules)
• Static KB (can be massively tuned, optimized, cached, etc. ahead of time)
• Simple assertions (e.g., SAT constraints; propositional calculus; Horn;…)
• One global context (no contradictions, limited domain, simplified world)
• Don’t do all the bookkeeping and forward inference required for justification
maintenance (or, equivalently, don’t ever have truth maintenance “turned on”)
As with pharmaceuticals, what is toxic in one dosage is beneficial in a lesser dosage.
• contexts lead to locally-consistent locally-small theories (faster inference/KE)
• often some (sub)problems can be represented/solved in a simpler repr.
• as a temporary placeholder (cf. Woods’ incremental simulation LUNAR)
March 14, 2006
59
Besides Those 5 Major Pitfalls
there are many minor ones, related to
“taste” and efficiency, not “correctness”
• Constant names should be unambiguous (Coral-Color Coral-Reef Coral-Polyp)
• 99.9…% of the meaning is in the assertions about the terms, not in the names
E.g., if Rthagide-disjaks and Gracinimumples are only known to be Kitchen-Appliances
March 14, 2006
60
Besides Those 5 Major Pitfalls
there are many minor ones, related to
“taste” and efficiency, not “correctness”
• Constant names should be unambiguous (Coral-Color Coral-Reef Coral-Polyp)
• 99.9…% of the meaning is in the assertions about the terms, not in the names
E.g., if Garbage-disposals and Microwave-ovens are only known to be Kitchen-Appliances
March 14, 2006
61
Besides Those 5 Major Pitfalls
there are many minor ones, related to
“taste” and efficiency, not “correctness”
• Constant names should be unambiguous (Coral-Color Coral-Reef Coral-Polyp)
• 99.9…% of the meaning is in the assertions about the terms, not in the names
E.g., if Garbage-disposals and Microwave-ovens are only known to be Kitchen-Appliances
• This applies to variable names, not just constant names
(implies
(in-ContOpen ?BOAT ?RIVER)
(in-Floating ?BOAT ?RIVER))
March 14, 2006
62
Besides Those 5 Major Pitfalls
there are many minor ones, related to
“taste” and efficiency, not “correctness”
• Constant names should be unambiguous (Coral-Color Coral-Reef Coral-Polyp)
• 99.9…% of the meaning is in the assertions about the terms, not in the names
E.g., if Garbage-disposals and Microwave-ovens are only known to be Kitchen-Appliances
• This applies to variable names, not just constant names
(implies
(and (in-ContOpen ?BOAT ?RIVER) (isa ?BOAT Boat) (isa ?RIVER River))
(in-Floating ?BOAT ?RIVER))
March 14, 2006
63
Besides Those 5 Major Pitfalls
there are many minor ones, related to
“taste” and efficiency, not “correctness”
• Constant names should be unambiguous (Coral-Color Coral-Reef Coral-Polyp)
• 99.9…% of the meaning is in the assertions about the terms, not in the names
E.g., if Garbage-disposals and Microwave-ovens are only known to be Kitchen-Appliances
• This applies to variable names, not just constant names
(implies
(and (in-ContOpen ?BT ?RVR) (isa ?BT Boat) (isa ?RVR River))
(in-Floating ?BT ?RVR))
March 14, 2006
64
Minor Pitfall: Over-generalization
• “Every organism has a head”
Replace it by a few rules for vertebrates, insects,…
• “Every person has some building or part of
a building as their home”
Assert it only in some context(s) [culture, time period,…]
• “All people speak some language”
State exceptions (infants, coma victims,…)
March 14, 2006
65
Minor Pitfall: Over-specialization
• “Every person was born later than his mother”
animal & ancestor; created thing & creator; cause/effect
• (implies
(and
(isa ?MST Mast-Device)
(physicalParts ?BOT ?MST)
(isa ?BOT Sailboat))
(rigidityOfObject ?MST Rigid))
(relationAllInstance rigidityOfObject Mast-Device Rigid)
March 14, 2006
66
Minor Pitfall: Independent
Assertions Glommed Together
Sailboats have masts and hulls.
(implies
(isa ?BOT Sailboat)
(thereExists ?MST
(thereExists ?HUL
(and (isa ?MST Mast-Device)
(isa ?HUL Hull-BoatPart)
(physicalParts ?BOT ?MST)
(physicalParts ?BOT ?HUL)))))
March 14, 2006
67
Minor Pitfall: Independent
Assertions Glommed Together
Sailboats have masts and hulls.
(implies (isa ?BOT Sailboat)
(implies
(thereExists ?MST
(isa ?BOT Sailboat)
(and (isa ?MST Mast-Device)
(physicalParts ?BOT ?MST))))
(thereExists ?MST
(relationAllExists physicalParts Sailboat
(thereExists ?HUL
Mast-Device)
Boat
(and (isa ?MST Mast-Device)
(implies (isa ?BOT Sailboat)
(isa ?HUL Hull-BoatPart)
(thereExists ?H
(physicalParts ?BOT ?MST)
(and (isa ?MST Hull-Boat)
(physicalParts ?BOT ?HUL)))))
(physicalParts ?BOT ?H))))
(relationAllExists
Hull-BoatPart)
March 14, 2006
physicalParts
68
Boat
Minor Pitfall: Independent
Assertions Glommed Together
Sailboats have masts. Boats have hulls.
(relationAllExists
Mast-Device)
(relationAllExists
Hull-BoatPart)
March 14, 2006
physicalParts Sailboat
physicalParts
69
Boat
Minor Pitfall: Predicates that lump
independent properties together
(marriedIn <groom> <bride> <wedding> <date>)
Events are rich (no limit to the number of args)
(groom Wedding0947 JoeSmith)
(bride Wedding0947 JaneDoe)
(dateOfEvent Wedding0947
(DayFn 13 (MonthFn May (YearFn 1999))))
• (non)Davidsonian choice: impact on link extraction/recognition
• Not all situations are rich! E.g., (nextInteger 87 88)
March 14, 2006
70
Minor Pitfall: Predicates that hide
concepts in argument order
Problematic:
(teamLineup DallasCowboys-1998 TroyAikman EmmittSmith MichaelErvin . . .)
Instead, reify the positions:
(positionOfPersonInOrganization TroyAikman
DallasCowboys-1998 Quarterback)
(positionOfPersonInOrganization EmmittSmith
DallasCowboys-1998 RunningBack)
. . .
Every football team has >=1…
The quarterback’s role in a play is …
March 14, 2006
71
Minor Pitfall: Over-reification
TheGovernmentOfFrance, TheGovernmentOfFranceIn1997,
TheGovernmentOfSpain, TheGovernmentOfSpainIn1997,…
Functions, such as: (GovernmentFn France)
Contexts, such as: (DuringMt 1997 (GovernmentFn France))
Kilometer, Kilogram, Kilocalorie…
(unitMultiplicationFactor (Kilo ?UNIT) ?UNIT 1000)
(unitMultiplicationFactor (Kilo Meter) Meter 1000)
(resultIsa (Kilo Meter) Distance)
((Kilo Meter) 8.3)
March 14, 2006
72
Minor Pitfall: Manual NL Generation
• Formal axioms about, e.g., employees, UN
• Informal English comment describing the
intended meaning, scope, etc., of the term
• Danger: bugs creep in: mismatches between
what the English and the axioms say about
the employees relation, or about the UN.
• Solution (also faster): auto. generate the NL
March 14, 2006
73
End of sermon
End of Talk#1
March 14, 2006
74
Download