An Enquiry into Computer Understanding Peter Cheeseman Computational Intelligence 4 (1) pp58-66

An Enquiry into Computer
Understanding
Peter Cheeseman
Computational Intelligence 4 (1) pp58-66
(1988)
Cheeseman’s Thesis
• McDermott (1987)
– Common-sense reasoning is not logical, but plausible
• E.g. drop a glass: will it break?
– The logicist approach is fatally flawed
– Thus, AI is doomed to “procedural ad hocery”
• Cheeseman (here)
– No! Bayesian probability removes these difficulties…
– and is theoretically sound…
– and will also account for inductive inference!
“Only the myopic belief – that logic (and its underlying semantics
of `truth’) is the possible language for describing the world –
could have lead AI researchers into shoehorning all reasoning into
the logical mold whether it fitted or not.”
The Argument (1)
• Common-sense reasoning is non-monotonic
• Non-monotonic logics don’t work
– “default” conclusions are just as “true” as logically sound
ones, so can’t distinguish
• Which conclusions to revise
• How strongly conclusions should be believed
• Bayesian reasoning
– is monotonic
– But captures non-monotonicity by using conditional probs
• P(milk off | 3 days old) = 0.1
• P(milk off | 3 days old & smelly) = 0.95
– Extra info doesn’t invalidate old statements
– There’s no “real” prob: it’s all subjective
– Unlike logic, where truth is unconditional
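The milk example can be sketched in a few lines of Python. The two conditional probabilities are the ones quoted above; the likelihoods P(smelly | off) = 19/20 and P(smelly | fresh) = 1/180 are assumptions, chosen only so that the numbers come out as on the slide, and the helper name prob_off is hypothetical. Note that conditioning on the extra evidence produces a new statement without contradicting the old one.

```python
from fractions import Fraction as F

# Everything below is implicitly conditioned on "milk is 3 days old".
p_off = F(1, 10)                   # P(off | 3 days old), from the slide
p_smelly_given_off = F(19, 20)     # assumed likelihood (not from the source)
p_smelly_given_fresh = F(1, 180)   # assumed likelihood (not from the source)

# Joint distribution over (off, smelly).
joint = {
    (True, True): p_off * p_smelly_given_off,
    (True, False): p_off * (1 - p_smelly_given_off),
    (False, True): (1 - p_off) * p_smelly_given_fresh,
    (False, False): (1 - p_off) * (1 - p_smelly_given_fresh),
}

def prob_off(joint, smelly=None):
    """P(off), optionally conditioned on the value of 'smelly'."""
    rows = {k: v for k, v in joint.items() if smelly is None or k[1] == smelly}
    total = sum(rows.values())
    return sum(v for (off, _), v in rows.items() if off) / total

print(prob_off(joint))               # P(off | 3 days old)
print(prob_off(joint, smelly=True))  # P(off | 3 days old & smelly)
```

Both statements hold at once: the first is still valid after the smell is noticed, it is simply no longer the one conditioned on all the available evidence.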
The Argument (2)
• In logic, degrees of truth are restricted to 1 or 0 (t or f)
• So how can we say “unlikely the glass will break”?
– In logic it must break or it’s impossible to break
Real-world knowledge is rarely of this categorical form –
why anyone in AI ever thought it is, I will never understand.
Why Bayesian reasoning?
Cox’s requirements:
• Propositions well-defined
• A single number is necessary & sufficient for belief
• All propositions have a unique degree of belief
• Degree of belief can depend on other degrees of belief
– (violated by fuzzy sets & Dempster-Shafer)
• Need to calculate belief(p1…pN) given belief(p1),..,belief(pN)
• As belief(p) ↑, belief(¬p) ↓
• Equal belief in p1, …, pN if p1,…,pN have same truth value
Bayes is the answer!
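For reference, the standard statement of what requirements of this kind lead to is that degrees of belief must obey the usual product and sum rules, from which Bayes' theorem follows. The notation below (A, B, H, E, and a background context C) is generic and not taken from the slides.

```latex
\begin{align}
P(A \wedge B \mid C) &= P(A \mid C)\, P(B \mid A \wedge C) \\
P(A \mid C) + P(\neg A \mid C) &= 1 \\
P(H \mid E \wedge C) &= \frac{P(E \mid H \wedge C)\, P(H \mid C)}{P(E \mid C)}
\end{align}
```

The third line is obtained by applying the product rule to H ∧ E in both orders and dividing by P(E | C).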
Induction
• Bayes: find most likely hypotheses from priors + data
• Can use this for induction & clustering
• Basic premise: gives a principled way of ranking
theories from observations
– “If Bayesian inference is the solution, why isn’t there a pile of
papers on it in AI? Ignorance? Belief that numbers ≠ AI?”
• Popper: Theories can only be proved incorrect 
• Cheeseman: Theories are only more or less plausible
depending on the priors and data 
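A minimal sketch of "ranking theories from observations", assuming a toy problem that is not in the source: three hypothetical theories about a coin's probability of heads, a uniform prior over them, and four observed flips. The function names posterior and coin_likelihood are illustrative only.

```python
from fractions import Fraction as F

def posterior(priors, likelihood, data):
    """Bayes' rule: posterior is proportional to prior times likelihood."""
    unnorm = {h: p * likelihood(h, data) for h, p in priors.items()}
    z = sum(unnorm.values())
    return {h: w / z for h, w in unnorm.items()}

def coin_likelihood(theta, flips):
    """P(flips | heads-probability theta), flips assumed independent."""
    out = F(1)
    for f in flips:
        out *= theta if f == "H" else 1 - theta
    return out

# Hypothetical theories: the coin's heads-probability is 1/4, 1/2 or 3/4.
priors = {F(1, 4): F(1, 3), F(1, 2): F(1, 3), F(3, 4): F(1, 3)}

post = posterior(priors, coin_likelihood, "HHTH")
ranked = sorted(post, key=post.get, reverse=True)
for theta in ranked:
    print(theta, post[theta])
```

No theory is proved or refuted here; each simply ends up more or less plausible, and a different prior or more data would change the ranking.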
More against logic…
• “if A then B” (English) ≠ “A → B” (logic)
• E.g. “if there’s smoke, there’s fire”
– x smoke(x)  fire(x) ?
– x smoke(x)  fire(x) ?
• Raven’s paradox: ∀x raven(x) → black(x)   [1]
– Black ravens should increase belief in this
– But so should non-black non-ravens, as [1] ≡ ∀x ¬black(x) → ¬raven(x)
• Bayes doesn’t have these problems
• How else can we represent “all ravens are black”?
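The equivalence that drives the raven's paradox can be checked mechanically. The sketch below enumerates every truth assignment for a single object and confirms that "raven implies black" and its contrapositive "non-black implies non-raven" always agree; the helper implies is just material implication.

```python
from itertools import product

def implies(a, b):
    """Material implication: false only when a is true and b is false."""
    return (not a) or b

# Compare the statement and its contrapositive on all four assignments.
equivalent = all(
    implies(raven, black) == implies(not black, not raven)
    for raven, black in product([False, True], repeat=2)
)
print(equivalent)
```

Because the two forms are logically identical, anything that confirms one (such as a white shoe) must, in a purely logical account, confirm the other.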
Even more against logic…
In logic, it is sufficient to find a chain of inference from the
premise to the conclusion to be able to establish its truth.
Additional chains of inference to the same conclusion are
redundant. However, in Bayesian reasoning, all evidence that
is relevant should be used in making a probability
assessment….
The ability to combine information from multiple sources
and balance different contributions is lacking in logic.
Discovery of new information leads to a revision of past
beliefs. Probabilities give a measure of how much the new
information alters our beliefs.
The Counter-Arguments…
Priors & The Independence Assumption
• Aleliunas: Where do the priors come from?
– Bayes’ hunger for probabilities makes it
generally infeasible
• Bundy: Bayes is only proof-functional (A and B ⇒
A & B) by making horrendous independence
assumptions
– P(A) = 0.5, P(B) = 0.5, P(A&B) = 0 to 0.5 …yuk!
• Suppose Fred & Sue & Joe & … all say Mike is tall
⇒ Mike is 7’ tall for sure?
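Bundy's range "0 to 0.5" is the standard pair of bounds on a conjunction when only the two marginals are known. A small sketch, with hypothetical function names, shows the bounds and the single value that an independence assumption would pick out of that range:

```python
def conjunction_bounds(p_a, p_b):
    """Tightest bounds on P(A & B) given only P(A) and P(B)."""
    lower = max(0.0, p_a + p_b - 1.0)
    upper = min(p_a, p_b)
    return lower, upper

def conjunction_if_independent(p_a, p_b):
    """P(A & B) under the (strong) assumption that A and B are independent."""
    return p_a * p_b

print(conjunction_bounds(0.5, 0.5))
print(conjunction_if_independent(0.5, 0.5))
```

Without the independence assumption, P(A) and P(B) alone do not determine P(A & B), which is the point of the objection.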
Where do the models come from?
Dempster:
Bayes is essentially a calculus for deducing posterior
probabilities and expectations from specified models.
Neither Fisher (1950) nor Cheeseman tell us where the
formal models come from in the first place.
Israel:
Non-monotonicity isn’t just updating probabilities; it’s the
much broader problem of revision or defeasibility, =
specifying principles of rational belief change under the
pressure of new info. Revision has to do with global
principles governing change of the total cognitive state.
Where do the models come from? (cont)
• Kanal & Perlis: Bayes isn’t so defeasible:
– Bayes assumes a fixed model: the rules p(H|E) are fixed
– We can only vary belief in the inputs to the model
– Doesn’t account for “rule revision”, e.g. I no longer
believe that E indicates H any more
(Not) representing imprecision
Dubois & Prade:
• Cox’s “axioms” rule out imprecision: Cheat!!
• Bayes can’t distinguish uncertainty vs. ignorance
– Roll a die: p(6) = 1/6
– After 1000 tries, we’re more sure it’s unbiased, so
our ignorance is less. But p(6) = 1/6 still.
• Sometimes you need to know the error bounds
before acting
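One way to make the die example concrete, offered as an illustration rather than as anything from the source, is to put a Beta distribution over the unknown chance of a six. The prior Beta(1, 5) and the counts (1200 rolls with 200 sixes) are assumptions chosen so that the mean stays exactly 1/6. The point estimate is the same before and after, while the spread, which is what "error bounds" would be read from, differs greatly.

```python
from fractions import Fraction as F

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return F(a, a + b)

def beta_variance(a, b):
    """Variance of a Beta(a, b) distribution."""
    return F(a * b, (a + b) ** 2 * (a + b + 1))

prior = (1, 5)               # assumed prior over the chance of a six
sixes, others = 200, 1000    # assumed outcome of 1200 rolls
post = (prior[0] + sixes, prior[1] + others)

print(beta_mean(*prior), float(beta_variance(*prior)))
print(beta_mean(*post), float(beta_variance(*post)))
```

The single number p(6) = 1/6 is identical in both cases; the difference in ignorance only shows up in the second-order quantity.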
Probabilities don’t solve everything!
• Kanal & Perlis: “Cat isa animal”
• McDermott: Put this in probabilities, pal:
“My wife teaches music. A student who’d borrowed some music
called to arrange to return it Monday. My wife suggested just
keep it until your next lesson on Thursday. But she returned it
Monday anyway. Then Wednesday she called saying she was sick
and couldn’t come Thursday. My wife suspected she was lying.”
– p(bring music back | planning not to show for lesson) = ??
• Pearl: Does Yale Shooting Problem with Bayes
What do we Condition on?
• Dalkey, Schafer:
– p(black|raven) – How do we characterize the
population “raven”? Major problem.
• Dempster:
– P(morning star | F) …. What is F?
– F = my whole state of knowledge
– Isn’t this rather big? And continuously changing?
• McDermott:
The real problem is deciding what evidence to look at in the first place,
given the intractability of taking everything into account.
What are Probabilities Anyway?
• Hayes, Schafer: What does “probability” mean
anyway?
– Objective: Statistics over a population?
– Subjective: e.g. betting odds?
– Degree of belief??
Against Numbers…
• Greiner:
– Would prefer ATMS, but have meta-rules to
resolve conflicts and rank arguments, rather
than numbers.
• Kanal & Perlis:
– Logical reasoning is an integral part of
common-sense