Cyc Patrick McCauley CMSC 691S - Semantic Web Spring 2009

Why do we need Cyc?
Evolution of rule-based expert systems
  MYCIN - diagnosis of blood infections
  DENDRAL - chemical analysis
Rule-based systems are brittle
  Cannot detect typos
  Only work within a specific domain
Brittleness due to lack of “common sense”
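To make the brittleness concrete, here is a minimal Python sketch of a toy rule-based diagnoser. The rules, the finding names, and the `diagnose` function are invented for illustration; they are not taken from MYCIN or DENDRAL.

```python
# Toy rule base: each rule maps a required set of findings to a conclusion.
# All names here are hypothetical and only illustrate the idea.
RULES = [
    ({"fever", "low blood pressure"}, "possible blood infection"),
    ({"rash", "itching"}, "possible allergy"),
]


def diagnose(findings):
    """Return the conclusion of every rule whose conditions are all present."""
    observed = set(findings)
    return [conclusion for conditions, conclusion in RULES
            if conditions <= observed]


print(diagnose(["fever", "low blood pressure"]))  # the first rule fires
print(diagnose(["fevr", "low blood pressure"]))   # typo: no rule fires
print(diagnose(["flat tire"]))                    # outside the domain: no rule fires
```

The system has no background knowledge that would let it guess that "fevr" means "fever", or that a flat tire is not a medical finding at all; it simply returns nothing.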
What is Cyc?
All apps can benefit from common sense
Begun in 1984 by Doug Lenat
Initial goals:
  KB and Ontology Building (pump priming)
  NL Understanding / Interactive Dialog
Vocab 101
Knowledge - underlying heuristics that allow us to reason
Data - facts or statements about specific items in the world
Where to start?
How much does a system need to know in order to be useful?
What kinds of knowledge are necessary?
How should this knowledge be represented?
Priming the Pump
Need to encode basic, common-sense knowledge “representing human consensus reality”: facts so obvious that it would be insulting to state them to another person
E.g. “You have ten fingers.”
  Assumes ten is a number and that a person has a specific number of fingers.
E.g. “Cardinals are red.”
  Assumes that cardinals are a type of bird and that birds have feathers which, in this case, are red. Also assumes “red” is a color.
As data grows, so do inconsistencies
Too much data gives rise to inconsistencies
Microtheory
Internally consistent data module
Explicitly represented logical context
Cyc knows or is told which microtheories (MTs) should be used to solve a problem
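The idea of a microtheory can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `KnowledgeBase` class, the microtheory names, and the tuple encoding of facts are all hypothetical and are not Cyc's actual API or storage format.

```python
class KnowledgeBase:
    """Toy KB that files every assertion under a named microtheory (MT)."""

    def __init__(self):
        self._by_mt = {}  # MT name -> set of facts asserted in that MT

    def assert_in(self, mt, fact):
        self._by_mt.setdefault(mt, set()).add(fact)

    def ask(self, fact, mts):
        """True if `fact` is asserted in any of the MTs chosen for the query."""
        return any(fact in self._by_mt.get(mt, set()) for mt in mts)


kb = KnowledgeBase()
# These two assertions would contradict each other in one flat KB,
# but each microtheory on its own stays internally consistent.
kb.assert_in("RealWorldMt", ("not", ("exists", "Vampire")))
kb.assert_in("FictionMt", ("exists", "Vampire"))

print(kb.ask(("exists", "Vampire"), ["FictionMt"]))    # True
print(kb.ask(("exists", "Vampire"), ["RealWorldMt"]))  # False
```

The caller decides which MTs are in scope for a query, which mirrors the slide's point that Cyc knows or is told which MTs to use.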
Cyc KB Topic Map
How to represent knowledge?
CycL - augmented FOPC (first-order predicate calculus)
Each assertion in the KB carries a “truth value”
  Monotonically false
  Default false
  Unknown
  Default true
  Monotonically true
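The five truth values can be modeled as a small enumeration. The sketch below is an illustration only: the `TruthValue` class, the numeric encoding, and especially the `combine` policy (stronger values win, opposite defaults cancel) are assumptions made for this example, not a description of Cyc's inference engine.

```python
from enum import IntEnum


class TruthValue(IntEnum):
    """The five truth values from the slide; the numbers are an assumed encoding."""
    MONOTONICALLY_FALSE = -2
    DEFAULT_FALSE = -1
    UNKNOWN = 0
    DEFAULT_TRUE = 1
    MONOTONICALLY_TRUE = 2

    @property
    def is_monotonic(self):
        return abs(self.value) == 2


def combine(current, incoming):
    """Resolve two truth values for the same assertion (assumed policy)."""
    if current == incoming:
        return current
    if abs(incoming) != abs(current):
        # The stronger value wins: monotonic beats default, default beats unknown.
        return incoming if abs(incoming) > abs(current) else current
    # Same strength but opposite sign.
    if current.is_monotonic:
        raise ValueError("contradictory monotonic assertions")
    return TruthValue.UNKNOWN


# "Birds fly" holds by default, but a monotonic "this bird cannot fly" overrides it.
print(combine(TruthValue.DEFAULT_TRUE, TruthValue.MONOTONICALLY_FALSE).name)
```

The distinction matters because default assertions can be overridden by exceptions, while monotonic ones cannot.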
What about external data?
SKSI - Semantic Knowledge Source Integration
CycL used to describe external DB columns
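A rough sketch of the idea behind describing database columns semantically, using Python's built-in `sqlite3`. The `employee` table, the `residesInCity` predicate, and the `COLUMN_MAP` structure are hypothetical; real SKSI mappings are written in CycL and are considerably richer.

```python
import sqlite3

# Hypothetical mapping: predicate -> (table, subject column, object column).
COLUMN_MAP = {
    "residesInCity": ("employee", "name", "city"),
}


def facts_for(conn, predicate):
    """Translate rows of the mapped table into (predicate, arg1, arg2) facts."""
    table, subject_col, object_col = COLUMN_MAP[predicate]
    # Table and column names come from our own trusted mapping, not user input.
    query = (f"SELECT {subject_col}, {object_col} FROM {table} "
             f"ORDER BY {subject_col}")
    return [(predicate, s, o) for s, o in conn.execute(query)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, city TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [("Fred", "Zurich"), ("Alice", "Austin")])

for fact in facts_for(conn, "residesInCity"):
    print(fact)
```

The data stays in the external database; only the description of what each column means lives in the knowledge base.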
Cyc I/O
Natural Language Processing
Extremely difficult since human speech/language is ridiculously complex
Written text often violates proper grammar, but its meaning is understood by humans
  Fred saw the plane flying over Zurich.
  Fred saw the mountains flying over Zurich.
CycNL to the rescue!
Lexicon - “contains syntactic and semantic information about English words”
Relationships between English words and Cyc constants are stored
CycNL - Syntactic Parser
Uses a phrase-structure grammar with context-free rules
Builds multiple tree structures for each phrase/sentence
However, some trees do not make semantic sense
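To see why a context-free grammar yields several trees for one sentence, the sketch below counts parse trees with a small CYK-style chart. The grammar and lexicon are a toy written for this example and are not CycNL's actual grammar.

```python
from collections import defaultdict

# Toy binary context-free rules: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PartP"),   # "flying over Zurich" describes the seeing event
    ("NP", "Det", "N"),
    ("NP", "NP", "PartP"),   # "flying over Zurich" describes the thing seen
    ("PartP", "Ving", "PP"),
    ("PP", "P", "NP"),
]
LEXICON = {
    "Fred": "NP", "Zurich": "NP", "saw": "V", "the": "Det",
    "plane": "N", "mountains": "N", "flying": "Ving", "over": "P",
}


def count_parses(words):
    """Count distinct parse trees that cover the whole sentence as an S."""
    n = len(words)
    chart = defaultdict(int)  # (start, end, symbol) -> number of trees
    for i, word in enumerate(words):
        chart[(i, i + 1, LEXICON[word])] += 1
    for span in range(2, n + 1):
        for start in range(n - span + 1):
            end = start + span
            for mid in range(start + 1, end):
                for parent, left, right in BINARY_RULES:
                    chart[(start, end, parent)] += (
                        chart[(start, mid, left)] * chart[(mid, end, right)]
                    )
    return chart[(0, n, "S")]


print(count_parses("Fred saw the plane flying over Zurich".split()))
print(count_parses("Fred saw the mountains flying over Zurich".split()))
```

Both sentences get the same number of trees under this grammar: syntax alone cannot tell that mountains do not fly, so a later step has to rule out the implausible reading.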
CycNL - Semantic Interpreter
Transforms the parser's trees into CycL formulas
Result is “pure” CycL
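A minimal sketch of what a semantic step adds, using the Zurich sentences. The `CAN_FLY` table stands in for common-sense knowledge, and the predicate names `sees` and `fliesOver` in the output are invented for this example; they are not real Cyc constants, and the output is only CycL-like.

```python
# Hypothetical common-sense facts: which kinds of things can be flying.
CAN_FLY = {"Fred": True, "plane": True, "mountains": False}


def candidate_readings(subject, obj):
    """Both attachments of 'flying over Zurich' that syntax allows."""
    return [
        {"seer": subject, "seen": obj, "flyer": subject},  # the seer was flying
        {"seer": subject, "seen": obj, "flyer": obj},      # the thing seen was flying
    ]


def interpret(subject, obj):
    """Keep only readings whose flyer can fly, rendered as CycL-like strings."""
    return [
        f"(and (sees {r['seer']} {r['seen']}) (fliesOver {r['flyer']} Zurich))"
        for r in candidate_readings(subject, obj)
        if CAN_FLY.get(r["flyer"], False)
    ]


print(interpret("Fred", "plane"))      # both readings survive
print(interpret("Fred", "mountains"))  # only "Fred was flying" survives
```

With "plane" the sentence stays genuinely ambiguous; with "mountains" background knowledge removes one reading, which is the kind of disambiguation a human reader performs without noticing.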
How is this useful to humans?
Ambient Research Assistant
  Flexibility and ease of communication are key
  Must be capable of “learning”
    Deciding what facts to learn
    Learning those facts
    Learning of rules
    Generalizing rules
    Testing and revision
Benefits of Assistant
Capable of searching much faster than humans
Availability - not limited to a 9-to-5 workday
“Truly Intelligent” Assistant
Plan Recognition
Learning
Natural Language (NL)
Acknowledgements
CYC Website
  http://www.cyc.com/
CYC: A Large-Scale Investment in Knowledge Infrastructure
  http://www.csee.umbc.edu/691s/papers/cyc95.pdf
Mapping Ontologies into Cyc
  http://www.csee.umbc.edu/691s/papers/mapping-ontologiesinto-cyc_v31.pdf
Common Sense Reasoning – From Cyc to Intelligent Assistant
  http://www.csee.umbc.edu/691s/papers/FromCycToIntelligentAssistant-IJHCS-LNAI3864.pdf
CycL is Cyc’s language
"Bill Clinton belongs to the collection of U.S. presidents."
  (#$isa #$BillClinton #$UnitedStatesPresident)
"All trees are plants."
  (#$genls #$Tree-ThePlant #$Plant)
"Paris is the capital of France."
  (#$capitalCity #$France #$Paris)
A fact about sets: "If OBJ is an instance of SUBSET, and SUBSET is a subcollection of SUPERSET, then OBJ is an instance of SUPERSET."
  (#$implies
    (#$and
      (#$isa ?OBJ ?SUBSET)
      (#$genls ?SUBSET ?SUPERSET))
    (#$isa ?OBJ ?SUPERSET))
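The `#$implies` rule above can be read as a simple forward-chaining step. The Python sketch below applies it until nothing new can be derived. The tuple encoding and the `infer_isa` function are illustrative assumptions, and the `genls` link from `UnitedStatesPresident` to `Person` is added here for the example; it does not appear on the slides.

```python
# Facts from the slides, encoded as (instance, collection) pairs.
ISA = {("BillClinton", "UnitedStatesPresident")}

# (subcollection, supercollection) pairs.
GENLS = {
    ("Tree-ThePlant", "Plant"),           # from the slides
    ("UnitedStatesPresident", "Person"),  # assumed for this example
}


def infer_isa(isa, genls):
    """Apply: isa(OBJ, SUB) and genls(SUB, SUPER) => isa(OBJ, SUPER), to a fixpoint."""
    derived = set(isa)
    changed = True
    while changed:
        changed = False
        for obj, subset in list(derived):
            for sub, superset in genls:
                if sub == subset and (obj, superset) not in derived:
                    derived.add((obj, superset))
                    changed = True
    return derived


for fact in sorted(infer_isa(ISA, GENLS)):
    print(fact)
```

Looping to a fixpoint means chains of `genls` links of any length are followed, so a single stated `isa` fact yields membership in every enclosing collection.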