Vision and reality - School of Computer Science

G52HPA:
History and Philosophy of Artificial
Intelligence
Lecture 6: Vision & Reality
Tony Pridmore & Natascha Alechina
School of Computer Science
{tpp,nza}@cs.nott.ac.uk
Outline of this lecture
• What is Vision?
• Early days - blocks world semantics
• David Marr - representations and assumptions
• The cycle of perception
• Probabilistic models
• Illusions
• Relation to philosophy
G52HPA Lecture 5: Vision & Reality
What is Vision?
• “to know what is where by
seeing” – Aristotle
• Extraction of symbolic
descriptions of the viewed
environment from an individual
or sequence of images
– not image processing
– image processing takes an
image and produces a
(hopefully) better one,
usually for human
consumption
[Figure: a single image pixel with values Red: 33, Green: 11, Blue: 14]
What is Vision?
• Backprojection: the value of a pixel depends upon
– illumination
– viewpoint
– surface reflectance
– surface geometry
“Vision is putting the toothpaste back in the tube” – John Mayhew
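A minimal sketch of why this inversion is ill-posed, assuming a simple Lambertian shading model (an assumption made for illustration, not a model taken from the lecture): pixel brightness is light intensity times surface albedo times the cosine of the angle between the surface normal and the light direction. The function name `lambertian_pixel` and all numeric values are hypothetical. Viewpoint is left out because a Lambertian surface looks equally bright from every direction.

```python
import math

def lambertian_pixel(light_intensity, albedo, normal, light_dir):
    """Pixel brightness under a Lambertian model (illustrative assumption)."""
    def unit(v):
        length = math.sqrt(sum(c * c for c in v))
        return [c / length for c in v]
    n, l = unit(normal), unit(light_dir)
    # Surfaces facing away from the light receive no illumination.
    cos_angle = max(0.0, sum(a * b for a, b in zip(n, l)))
    return light_intensity * albedo * cos_angle

# Scene A: bright light, dark surface facing the light.
scene_a = lambertian_pixel(1.0, 0.25, (0, 0, 1), (0, 0, 1))
# Scene B: dim light, bright surface facing the light.
scene_b = lambertian_pixel(0.25, 1.0, (0, 0, 1), (0, 0, 1))
# Scene C: bright light, mid-grey surface tilted 60 degrees away from the light.
tilt = math.radians(60)
scene_c = lambertian_pixel(1.0, 0.5, (math.sin(tilt), 0, math.cos(tilt)), (0, 0, 1))
```

All three scenes give (up to rounding) the same pixel value, so the pixel alone cannot say which combination of illumination, reflectance and geometry produced it.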
What is Vision?
• At a higher level, the goal is to provide semantic information about the
viewed world
• Requires a transformation (implicit or explicit) to a world, as opposed
to camera, coordinate frame
• Work at this level usually involves time-varying image sequences
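The camera-to-world transformation mentioned above can be sketched as a rigid-body change of coordinates, X_world = R X_camera + t. This is a minimal illustration under assumed values: the function name `camera_to_world`, the rotation `R` and the translation `t` are hypothetical, and a real system would first have to estimate the camera pose.

```python
import math

def camera_to_world(point_cam, rotation, translation):
    """Map a 3-D point from camera to world coordinates: X_w = R X_c + t."""
    return [
        sum(rotation[i][j] * point_cam[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]

# Hypothetical pose: camera origin at (2, 0, 1) in the world,
# camera axes rotated 90 degrees about the world z axis.
angle = math.radians(90)
R = [[math.cos(angle), -math.sin(angle), 0.0],
     [math.sin(angle),  math.cos(angle), 0.0],
     [0.0,              0.0,             1.0]]
t = [2.0, 0.0, 1.0]

# A point one unit along the camera's x axis, expressed in world coordinates.
point_world = camera_to_world([1.0, 0.0, 0.0], R, t)
```

With this pose the point lands at approximately (2, 1, 1) in the world frame.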
Early days – blocks world semantics
• Vision was initially considered easy: Marvin Minsky famously hired
Sussman (a first-year undergraduate) to build a vision system over the summer
• Attempts with real cameras etc. quickly led the AI community to retreat to
the “core” problem: extracting semantics
• Also restricted attention to a toy environment; the blocks world
– Polyhedral (trihedral) solids
– Line drawings not real images
– No illumination effects, texture, shading
– Goal is to interpret in terms of surfaces
and objects
Early days – blocks world semantics
• Labelling approach – list possible interpretations of each line (convex,
concave, etc), aim to produce a consistent labelling of the drawing
• Guzman’s (1968) first attempt was a hack
• Huffman & Clowes identified core labels in the early 1970s
• David Waltz added shadows, and invented relaxation labelling in 1975
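The idea of pruning labels until the drawing is consistent can be sketched as a small constraint-propagation loop in the spirit of Waltz's filtering. This is a toy illustration, not the Huffman-Clowes catalogue: the junctions, edges and candidate labellings below are hypothetical, and only the labels "+" (convex) and "-" (concave) are used so that edge direction can be ignored.

```python
def waltz_filter(junction_edges, candidates):
    """Discard junction labellings that no neighbouring junction can agree with.

    junction_edges: {junction: (edge, ...)}
    candidates:     {junction: set of label tuples, one label per edge}
    """
    candidates = {j: set(c) for j, c in candidates.items()}
    changed = True
    while changed:
        changed = False
        for a, edges_a in junction_edges.items():
            for b, edges_b in junction_edges.items():
                if a == b:
                    continue
                for edge in set(edges_a) & set(edges_b):
                    ia, ib = edges_a.index(edge), edges_b.index(edge)
                    # Labels that junction b still allows on the shared edge.
                    supported = {lab[ib] for lab in candidates[b]}
                    keep = {lab for lab in candidates[a] if lab[ia] in supported}
                    if keep != candidates[a]:
                        candidates[a] = keep
                        changed = True
    return candidates

# Hypothetical drawing: three junctions joined in a triangle of edges.
junction_edges = {"J1": ("e1", "e2"), "J2": ("e2", "e3"), "J3": ("e3", "e1")}
candidates = {
    "J1": {("+", "+"), ("-", "-")},
    "J2": {("+", "-"), ("+", "+")},
    "J3": {("+", "+"), ("-", "-")},
}
consistent = waltz_filter(junction_edges, candidates)
```

In this toy case J2 forces edge e2 to be convex, which removes one option at J1, which in turn constrains J3 and finally J2 itself, leaving a single labelling at every junction. Real drawings can remain ambiguous after filtering.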
David Marr
• Physiologist and mathematician
• Argued (1977) for study of real vision systems
• Understanding vision, or any component of AI,
requires
– Computational theory
what is computed and why?
– Algorithm
how is it computed?
– Mechanism
what is it computed on?
David Marr
• Marr proposed a representational framework that moved steadily from
the image towards higher level interpretations
David Marr
• Marr emphasised computational theory, particularly the clear
statement of the assumptions made, because
– Vision is impossible without prior knowledge
– Many possible, but hugely unlikely worlds could generate any
image
– We only get a clear, instantaneous and usually unambiguous
percept because of the assumptions our visual system makes
• Led to work in the late 1970s/early 1980s
on computational theories for
– Binocular stereo
– Motion recovery (optic flow)
– Shape from texture
– Shape from shading…
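As one hedged illustration of how an explicit assumption makes such a problem solvable, consider binocular stereo. Assuming two identical, rectified pinhole cameras (an assumption introduced here, not stated in the lecture), the depth of a matched point follows from its disparity as Z = f B / d. The function name and the numbers are hypothetical.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth in metres of a matched point seen by two rectified pinhole cameras.

    focal_length_px: focal length in pixels
    baseline_m:      distance between the two camera centres in metres
    disparity_px:    horizontal shift of the point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return focal_length_px * baseline_m / disparity_px

# Hypothetical rig: 700-pixel focal length, cameras 10 cm apart.
near = depth_from_disparity(700.0, 0.1, 35.0)  # large disparity, close point
far = depth_from_disparity(700.0, 0.1, 7.0)    # small disparity, distant point
```

Here the large disparity corresponds to a depth of about 2 m and the small one to about 10 m. The formula only holds under the stated camera assumptions, which is the kind of explicit statement Marr argued for.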
The 1980s: Expert Systems and the Cycle
of Perception
• Marr’s representational framework was essentially linear
• Recognition of the role of prior knowledge and the expert systems (ES) boom led to a
large body of work on knowledge-based vision
– Rule-based systems
– Blackboard architectures
– Greater emphasis on domain
knowledge
• Most implicitly adopt Neisser’s
(1976) cycle of perception architecture
• Results now explicitly depend on prior
knowledge
Modern Computer Vision: Probabilistic
Models
• Computer vision now recognises
– the importance of prior knowledge
– that there is no unique solution to any problem, only a set of
possible solutions with associated probabilities
• Probabilistic models now dominate, from low level image
segmentation to high level event recognition
– Expectation maximisation (EM) algorithm
– Hidden Markov models
– Bayesian filtering to track moving objects
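Bayesian filtering for tracking can be sketched, under strong simplifying assumptions, as a discrete filter over a handful of image columns. The belief is a probability for each column; it is sharpened by a measurement (Bayes' rule) and blurred by a motion model. The five-column world, the detector likelihood and the motion probability are all hypothetical values chosen for illustration.

```python
def update(belief, likelihood):
    """Measurement step: posterior is proportional to likelihood times prior."""
    unnormalised = [l * b for l, b in zip(likelihood, belief)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

def predict(belief, move_prob):
    """Motion step: the object moves one column right with probability move_prob.

    The world is treated as cyclic to keep the sketch short.
    """
    n = len(belief)
    return [move_prob * belief[(i - 1) % n] + (1 - move_prob) * belief[i]
            for i in range(n)]

belief = [0.2] * 5                      # prior: object equally likely in any column
likelihood = [0.1, 0.1, 0.6, 0.1, 0.1]  # hypothetical detector response, peaked at column 2
belief = update(belief, likelihood)     # after seeing the image
belief = predict(belief, move_prob=0.8) # expected position in the next frame
```

After these two steps most of the probability mass sits at column 3, but the other columns keep non-zero probability: the output is a distribution over positions rather than a single answer.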
Modern Computer Vision: Probabilistic
Models
1
2
x
3
P(1  x )
P(2  x )
P(3  x )
e.g. Segmentation using EM
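A minimal sketch of EM-based segmentation, assuming grey-level pixels drawn from a mixture of one-dimensional Gaussians, one per class. It uses two classes rather than the three in the figure, and the pixel values, starting parameters and variance floor are hypothetical choices. The E-step computes P(class | x) for each pixel; the M-step re-estimates each class from the pixels weighted by those probabilities.

```python
import math

def gaussian(x, mean, var):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def posteriors(pixels, means, variances, weights):
    """E-step: P(class j | x) for every pixel x, via Bayes' rule."""
    result = []
    for x in pixels:
        joint = [w * gaussian(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        result.append([p / total for p in joint])
    return result

def em_segment(pixels, means, variances, weights, iterations=20):
    """Fit the mixture with EM, then label each pixel with its most probable class."""
    means, variances, weights = list(means), list(variances), list(weights)
    for _ in range(iterations):
        resp = posteriors(pixels, means, variances, weights)
        for j in range(len(means)):  # M-step: weighted re-estimation per class
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, pixels)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, pixels)) / nj
            variances[j] = max(spread, 1.0)  # floor keeps a class from collapsing
            weights[j] = nj / len(pixels)
    resp = posteriors(pixels, means, variances, weights)
    labels = [max(range(len(means)), key=lambda j: r[j]) for r in resp]
    return means, labels

# Hypothetical grey levels: a dark region followed by a bright region.
pixels = [10, 12, 11, 13, 200, 205, 198, 202]
means, labels = em_segment(pixels, means=[50.0, 150.0],
                           variances=[1000.0, 1000.0], weights=[0.5, 0.5])
```

On this toy data the two class means settle near the averages of the dark and bright pixels, and the labels split the sequence into those two regions. The sketch does not guard against a class losing all its pixels, which a practical implementation would need to handle.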
Illusions as Evidence
The Ames Room
The Ponzo Illusion
Illusions as Tools
• Local vs global
• High-level vs
low-level
Relation to Philosophy
• Perception is not a passive but an active process
– as true of e.g. language understanding as vision
• We are rarely aware of it; it may be an 'art concealed in the depths of the
human soul' (Kant)
• In the Critique of Pure Reason, Kant suggested that
– all our empirical knowledge is made up of both 'what we receive
through impressions' and what 'our own faculty of knowledge
supplies from itself'
• This casts doubt on both the representations we have of the world and
the validity of any reasoning and/or planning we do over them