Cyc
Patrick McCauley
CMSC 691S - Semantic Web
Spring 2009

Why do we need Cyc?
- Evolution of rule-based expert systems
- Rule-based systems are brittle
  - MYCIN - diagnosis of blood infections
  - DENDRAL - chemical analysis
  - Cannot detect typos
  - Only work within a specific domain
- Brittleness is due to a lack of “common sense”

What is Cyc?
- All apps can benefit from common sense
- Begun in 1984 by Doug Lenat
- Initial goals:
  - KB and Ontology Building (pump priming)
  - NL Understanding / Interactive Dialog

Vocab 101
- Knowledge - underlying heuristics that allow us to reason
- Data - facts or statements about specific items in the world

Where to start?
- How much does a system need to know in order to be useful?
- What kinds of knowledge are necessary?
- How should this knowledge be represented?

Priming the Pump
- Need to encode basic, common sense knowledge
- “representing human consensus reality” - facts so obvious that it would be insulting to state them to another person
- E.g. “You have ten fingers.”
  - Assumes ten is a number and that a person has a specific number of fingers.
- E.g. “Cardinals are red.”
  - Assumes that cardinals are a type of bird and that birds have feathers which, in this case, are red. Also assumes “red” is a color.

As data grows, so do inconsistencies
- Too much data gives rise to inconsistencies

Microtheory
- Internally consistent data module
- Explicitly represented logical context
- Cyc knows or is told which microtheories (MTs) should be used to solve a problem

Cyc KB Topic Map

How to represent knowledge?
- CycL - an augmented first-order predicate calculus (FOPC)
- Each assertion in the KB carries a “truth value”:
  - Monotonically false
  - Default false
  - Unknown
  - Default true
  - Monotonically true

What about external data?
- SKSI - Semantic Knowledge Source Integration
- CycL used to describe external DB columns

Cyc I/O

Natural Language Processing
- Extremely difficult since human speech/language is ridiculously complex
- Written text often violates proper grammar, but its meaning is understood by humans
- Fred saw the plane flying over Zurich.
- Fred saw the mountains flying over Zurich.
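The two Zurich sentences have the same grammatical shape, yet readers attach "flying" differently because they know mountains cannot fly. A minimal Python sketch of that idea follows. It is a toy illustration only, not how CycNL works: the `CAN_FLY` table and the `plausible_flyers` helper are hypothetical names invented here, and the table assumes that a person counts as able to fly (for example as a passenger).

```python
# Hypothetical common-sense facts (an assumption for this sketch, not Cyc data):
# which noun phrases could sensibly be the thing that is "flying".
CAN_FLY = {
    "Fred": True,        # a person can fly, e.g. as a passenger
    "plane": True,
    "mountains": False,
}


def plausible_flyers(subject, obj):
    """Return the noun phrases that could be the one 'flying over Zurich'."""
    return [phrase for phrase in (subject, obj) if CAN_FLY.get(phrase, False)]


# "Fred saw the plane flying over Zurich."
plane_reading = plausible_flyers("Fred", "plane")

# "Fred saw the mountains flying over Zurich."
mountain_reading = plausible_flyers("Fred", "mountains")
```

With these assumed facts, the plane sentence keeps both readings, while the mountain sentence keeps only the reading in which Fred is the one flying. Grammar alone cannot make that distinction; the background fact about mountains does.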
CycNL to the rescue!
- Lexicon - “contains syntactic and semantic information about English words”
- Relationships between English words and Cyc constants are stored

CycNL - Syntactic Parser
- Uses a phrase-structure grammar with context-free rules
- Builds multiple tree structures for each phrase/sentence
- However, some trees do not make “syntactic” sense

CycNL - Semantic Interpreter
- Transforms the parser's results into CycL formulas
- Result is “pure” CycL

How is this useful to Humans?
- Ambient Research Assistant
- Flexibility and ease of communication are key
- Must be capable of “learning”:
  - Deciding what facts to learn
  - Learning those facts
  - Learning of rules
  - Generalizing rules
  - Testing and revision

Benefits of Assistant
- Capable of searching much faster than humans
- Availability - supersedes the 9-to-5

“Truly Intelligent” Assistant
- Plan Recognition
- Learning
- NL

Acknowledgements
- CYC Website
  http://www.cyc.com/
- CYC: A Large-Scale Investment in Knowledge Infrastructure
  http://www.csee.umbc.edu/691s/papers/cyc95.pdf
- Mapping Ontologies into Cyc
  http://www.csee.umbc.edu/691s/papers/mapping-ontologiesinto-cyc_v31.pdf
- Common Sense Reasoning – From Cyc to Intelligent Assistant
  http://www.csee.umbc.edu/691s/papers/FromCycToIntelligentAssistant-IJHCS-LNAI3864.pdf

CycL is Cyc’s language
- "Bill Clinton belongs to the collection of U.S. presidents."
    (#$isa #$BillClinton #$UnitedStatesPresident)
- "All trees are plants."
    (#$genls #$Tree-ThePlant #$Plant)
- "Paris is the capital of France."
    (#$capitalCity #$France #$Paris)
- "a fact about sets" (every instance of a subset is also an instance of its superset)
    (#$implies
      (#$and (#$isa ?OBJ ?SUBSET)
             (#$genls ?SUBSET ?SUPERSET))
      (#$isa ?OBJ ?SUPERSET))
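The final rule above, that #$isa is inherited along #$genls, can be sketched in Python as simple forward chaining. This is a toy under stated assumptions, not Cyc's inference engine. The set-of-pairs representation and the `apply_isa_rule` function are hypothetical. The facts that presidents are persons and persons are animals are added here purely for illustration and do not come from the slides.

```python
# Toy fact store: each pair is (first argument, second argument) of the predicate.
# Constant names mirror the CycL examples; the storage format is an assumption.
ISA = {("BillClinton", "UnitedStatesPresident")}
GENLS = {
    ("Tree-ThePlant", "Plant"),            # from the slide
    ("UnitedStatesPresident", "Person"),   # illustrative assumption
    ("Person", "Animal"),                  # illustrative assumption
}


def apply_isa_rule(isa, genls):
    """Apply (isa OBJ SUBSET) and (genls SUBSET SUPERSET) => (isa OBJ SUPERSET)
    repeatedly until no new isa facts can be derived."""
    derived = set(isa)
    changed = True
    while changed:
        changed = False
        for obj, subset in list(derived):
            for sub, superset in genls:
                if sub == subset and (obj, superset) not in derived:
                    derived.add((obj, superset))
                    changed = True
    return derived


derived = apply_isa_rule(ISA, GENLS)
```

Running the rule to a fixed point also covers chains of #$genls links: here `derived` contains `("BillClinton", "Person")` and, through a second application, `("BillClinton", "Animal")`, while nothing relates Bill Clinton to `Plant`.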