ppt objects

advertisement
Object and Face Recognition:
Lecture Notes
Introduction
• These notes are in 2 parts, roughly corresponding
to (1) object recognition; and (2) face recognition.
• We'll actually make a smooth transition by
discussing whether face recognition is a special
kind of recognition, or whether it can be described
in terms of more general abilities.
1. Early Models of Recognition
• Template-matching. Early models of recognition
were based on matching the stimulus pattern to a
set of stored, pre-defined 'templates'. These
models can be expanded by incorporating some
pre-processing
• Feature analysis. Instead of looking for a match
to a standard shape, feature analysis emphasizes
that what makes an 'A' an A is the letter's unique
set of defining features
2. Failures of Matching-Models
There are several key weaknesses of these early
approaches to recognition:
• 3 dimensions. The template-matching and feature
analysis models were designed mostly to recognize
alphanumeric characters
– However, most real-world human recognition occurs on
3D objects, not 2D symbols
• Superficial differences. It's hard for these models
to 'extract' the basic similarity of repeat instances of
an item while also representing superficial
differences (consider handwriting as an example)
Where to start?
• These models don't have a clear means for
breaking up complex input into a series of
objects-to-be-identified; instead, they seem
to rely on being 'fed' items (e.g., letters) to
recognize
• That seems utterly unlike human perception
3. Building Blocks of
Recognition?
• In order to simulate human recognition in a
complex 3D world, more recent models of
object recognition have relied on defining
constraints (or "primitives") and basic
strategies that the visual system might use
Marr and Nishihara
• Marr and Nishihara present a model of
recognition, restricted to the set of objects
that can be described as generalized cones-objects with a clear main axis and a
constant-shape cross section
Recognition occurs in the
following levels:
• Single-model axis. The first step in the model is
identification of the main axis of the object
• Component axes. Then, the axes of each of the
smaller sub-portions of the object are
identified.[see steps 1 and 2]
• 3D model match. Finally, a match between the
arrangement of components and a stored 3D
model description is performed to identify the
object. [see step 3-- the catalog]
Steps 1 & 2 of Marr and Nishihara’s theory:
Step 3: The catalog (Marr & Nishihara)
Problem
• Although object comparisons are fastest if
the main axis of an object is the same as the
object it is being judged against, there is not
strong data supporting the 'psychological
reality' of the Marr & Nishihara model
(although it might be a very good way to get
computers to identify objects)
Geons (Biederman)
• According to geon theory, complex objects are
made up of arrangements of basic, component
parts ('geons')
• Some evidence for the psychological reality of
geons comes from experiments that show that
recognition is impaired when we hide the
intersections of geons
• However, occluding these intersections also
changes the picture in many other ways
Object with geon intersections occluded
Object with geon intersections revealed
4. View-Dependent Recognition?
• An alternate theory of recognition is simply
that you store a set of characteristic views
(mental images) of each object
• You recognize each object by finding the
closest match
• Note that this type of model does not
require you to infer/recover 3D shape in
order to identify objects
Experimental Evidence
• Bulthoff & Edelman trained people to learn
the shapes of weird objects that they made
up
• They tested them on identification of the
objects shown from (1) the same views as
during training; and (2) different views
• Performance was much better for the
training (familiar) views
5. Perceptual Organization
• We’ve switched from talking about how the visual
system extracts important features of the
environment-- edges, colors, motion, depth, etc-to how the visual system identifies objects (which
are made up of these key features)
• The goal of understanding "perceptual
organization" is to understand how we put
together the basic features to see a coherent,
organized world of things and surfaces
Gestalt
• The basic motto of the Gestaltists-- the first
real "school" of perceptual organization
theory-- was that the whole is greater than
the sum of its parts
• Examples: a melody made a various notes;
or see an example from the pointillist
painter Paul Signac
Figure & Ground
• The first step in organizing the perceptual
world is to divide figure and ground;
ambiguous figures demonstrate this
• Notice that you can perceive either the
white or dark portion of the picture as figure
or ground, but that it's very difficult to
perceive both as figure or both as ground
simultaneously
Figure & Ground
Gestalt laws of organization:
• Proximity. Elements that are close are often
grouped together
• Similarity. Similar elements tend to be grouped
together
• Common fate. Things that move together tend to
group
• Good continuation. Perceptual mechanisms tend
to preserve smooth continuity in favor of abrupt
edges
• Closure. Of several possible perceptual
organizations, ones yielding "closed" figures are
more likely than those yielding "open" ones.
• Orientation & symmetry. Objects oriented with
horizontal and vetical axes, or ones that are
symmetric, are more often perceived as figures
Demonstration of proximity & similarity
Demonstration of good continuation (your perception of the
objects on the left) and not good continuation (right)
Figure/ground changes with orientation and closure
The law of Pragnanz:
• "Of several geometrically possible
organizations that one will actually occur
that possesses the best, simplest, and most
stable shape" (Koffka, 1935)
• But what's the "best, simplest, most stable"?
• Do you think that the Gestalt laws of
organization provide a sufficient answer?
Illusions
• Sometimes, your visual system must
group/segment an object, not based on
edges in the retinal image, but by inferring
parts of the missing boundaries
• Consider illusory contours; and also note
that such edgeless contours can occur in the
real world, too
Illusory contour. Note that there is no physical contour for
most of the perceived "edge"
Illusory contour in real life
Impossible Figures
• Certain figures can be drawn so that there is
no consistent interpretation, due to the fact
that there is no obvious real-world physical
object that could correspond to the image
• However, "impossible objects" can be built
so that they look like an impossible figure
from a certain vantage point
Impossible figure
Impossible object
Impossible object, revealed
Failures of Grouping
• Some figures do not yield a consistent
grouping, and so the visual system struggles
to arrive at a solution, and ends up
switching back and forth between several
possible groupings
Multiple possible groupings
Download