Ch 5 Perceiving objects

Sensation & Perception
Ch. 5: Perceiving Objects and Scenes
© Takashi Yamauchi (Dept. of Psychology, Texas A&M University)
Main topics
The challenge of object perception
Inverse projection problem
Gestalt psychology
Perceptual organization
Figure-ground segregation
Biederman’s Recognition by Components theory
Minsky’s estimation
• Why is it so difficult for the computer to
recognize objects as we do?
• http://web.media.mit.edu/~minsky/
What AI can do
• 50th anniversary of AI in 2006 (from Rodney
Brooks)
– The most advanced AI can barely beat a 2-year-old
child in object recognition.
Why has AI fallen short of expectations?
• Recognizing objects
– is much more than just detecting lines, colors,
or shapes.
– requires a large amount of background
knowledge.
• both implicit and explicit knowledge
The challenge of object perception
• The stimulus on the receptors is ambiguous
– Inverse projection problem: an image on
the retina can be caused by an infinite
number of objects
• Objects can be hidden or blurred
– Occlusions are common in the
environment
Figure 5.3 The principle behind the inverse projection problem. The small
square stimulus creates a square image on the retina. However, this image
could also have been created by the other two shapes and many other stimuli.
This is why we say that the image on the retina is ambiguous.
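One instance of this ambiguity can be sketched with a simple pinhole model: objects of different sizes at different distances can produce the same image size. The function name `retinal_size`, the 17 mm focal length (a rough approximation of the eye), and the object sizes are assumptions for illustration only, not values from the textbook figure.

```python
def retinal_size(object_size_m: float, distance_m: float,
                 focal_length_m: float = 0.017) -> float:
    """Image size under a pinhole model (similar triangles)."""
    return focal_length_m * object_size_m / distance_m

# Hypothetical objects: each is twice as large and twice as far as the previous one.
objects = [(0.1, 1.0), (0.2, 2.0), (0.4, 4.0)]
image_sizes = [retinal_size(size, dist) for size, dist in objects]

for (size, dist), img in zip(objects, image_sizes):
    print(f"{size:.1f} m object at {dist:.1f} m -> {img * 1000:.2f} mm image")
```

All three objects give the same image size, so the image alone cannot tell us which object caused it.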
Fig. 5-4, p. 96
The challenge of object perception (continued)
• Objects look different from different
viewpoints
– Viewpoint invariance: the ability to recognize
an object regardless of the viewpoint
• The reasons for changes in lightness and
darkness in the environment can be unclear
Fig. 5-6a, p. 96
Fig. 5-8, p. 97
Part + Part = Whole?
• Gestalt psychology
Group discussion
• Given the pictures shown in the slides, write
down everything you see
Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 7a
Fig. 7b
Fig. 7c
Three Musicians. P. Picasso
Woman with a Book: P. Picasso
La Lectrice: P. Picasso
Music. H. Matisse
Dance. H. Matisse
View of Toledo: El Greco
Baptism of Christ: El Greco
The Burial of Count Orgaz: El Greco
Metamorphosis of Narcissus: S. Dali
War: S. Dali
Mont Sainte-Victoire (Cezanne)
Hokusai
Hiroshige
What do these pictures tell us?
• Perceiving things is a lot more than just
detecting lines, colors, motion, and so on.
• We see
– faces, bodies, hands, apples, mountains, legs,
tables, houses, streets, ...
– motion, movement, action, rhythm, expansion,
contraction, upheaval, ...
How come?
• Your eyes receive flat 2-D information.
• All you have is the activation of neurons.
• How come you perceive faces, trees,
apples…?
Why & how do we see those?
• Perception involves
– Organization
• We see a woman
sitting on a sofa
because we organize
visual information in a
certain way.
How do we organize visual
information?
• Are there any principles behind it?
• What principles do we follow?
• Gestalt laws of organization
Tell me what you see.
Tell me what you see.
Perception is not just detection
• Perception is not just about detecting color
or shape.
• Perception is about organizing visual
information.
• How do we organize visual information?
Or when do we fail to organize
visual information?
Gestalt psychology
• “Gestalt” means “whole.”
• Organizational principles:
– Similarity
– Proximity
– Continuity
• And more (see the textbook)
Law of similarity
Similar things are put together
Items with similar
colors are put
together
Do you see a sofa?
Do you see 3 musicians?
Why? Why not?
Do you see a blouse and a
blue skirt?
Do you see a table?
Law of proximity
• Things that are close to each other are put together.
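The law of proximity can be sketched as a toy grouping rule: dots on a line belong to the same group as long as the gap to the next dot stays small. The function name `group_by_proximity`, the `max_gap` threshold, and the dot positions are hypothetical; this is an illustration, not a model of the visual system.

```python
def group_by_proximity(positions, max_gap):
    """Group 1-D positions: a new group starts when the gap exceeds max_gap."""
    ordered = sorted(positions)
    if not ordered:
        return []
    groups = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > max_gap:
            groups.append([cur])    # large gap: start a new group
        else:
            groups[-1].append(cur)  # small gap: same group
    return groups

dots = [0, 1, 2, 10, 11, 12]  # hypothetical dot positions
groups = group_by_proximity(dots, max_gap=3)
print(groups)  # [[0, 1, 2], [10, 11, 12]]
```

The six dots are grouped into two clusters of three, which is how most viewers would describe such a display.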
There are two
separate
worlds --- this
world and that
world?
Law of good continuity
We tend to put things together when they show nice
continuity.
Do you see a spiral motion? This
picture is called “Dance.” Do you
see why?
Do you feel an
upward motion or a
downward motion?
• There are many more ways to
organize visual information (see the
textbook).
Perceptual Segregation
• Figure-ground segregation
– determining which part of the environment is the figure
so that it “stands out” from the background
Figure-Ground segregation
• Distinguishing a figure from the ground.
• We know the difference between the figure and
the ground
How does the perceptual system
distinguish the figure from the
ground?
Which one is the figure and which one is the ground?
Depending on the locus of attention, the figure and the
ground switch rapidly.
So attention plays some role in determining the figure and
the ground.
The figure represents “some thing.”
The contours belong to the figure rather than to the
ground.
Which one is the figure and which one is the ground?
This is easy.
The figure tends to have a solid and continuous surface.
Which one is the figure and which one is the ground?
Symmetric items tend to be seen as a figure.
What did you see? Some arrows and what?
Meaningful items are seen as figure
Do you see something else?
What does this tell us?
• Figure-ground segregation is influenced by
our knowledge about the world in general.
– What things are and what they look like.
– Two things can’t occupy the same space
simultaneously.
What does this tell us?
• Gestalt laws of organization
– Similarity, continuity, proximity,..
• Figure-ground segregation
• Perceptual organization is based on our
world knowledge (what we know about the
world).
Input & output to and from LGN
Any idea?
• Gestalt laws of organization
– Similarity, continuity, proximity,..
• Figure-ground segregation
• What do they tell you?
Some top-down processes are going on.
Modern research on object
perception
• The perceptual system is tuned to capture
the regularities in the environment.
• Biederman’s Recognition by Components
model
How do we store objects in the
brain?
• See an object.
• Put it into your brain.
• Take out the one you stored earlier.
• Compare it to what you saw.
How do you store what you saw before?
You see billions of objects every
day
• Does your brain have enough space to store
them?
• Storing everything is inefficient (very
expensive).
• So you have to do something different.
• Is there a better way to store things?
How do you do that?
Biederman’s Recognition by
Components
• You just need about 36 components to
represent millions of different objects.
Biederman’s Recognition by
Components (RBC)
• Objects are described and stored by simple
geometric components (geons).
• There are about 36 geons.
• To represent objects, we use geons and their
arrangements.
Geons
Combinations of geons
Combinations of geons
Combining 4 geons can yield more than 1 million objects
(36 × 36 × 36 × 36 = 1,679,616).
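The arithmetic behind this count can be checked in a few lines. The constant names are illustrative; the count treats an object as 4 ordered slots, each filled by any of about 36 geons, and ignores differences in how the geons are spatially arranged, so it is only a rough estimate.

```python
NUM_GEONS = 36        # approximate number of geons given in the slides
GEONS_PER_OBJECT = 4  # geons combined per object in this example

# Ordered choices with repetition: 36 * 36 * 36 * 36
arrangements = NUM_GEONS ** GEONS_PER_OBJECT
print(arrangements)  # 1679616
```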
Biederman & Ju (1988)
Cognitive Psychology, 20, 38-64
• Do people need more information than
geons provide?
• Compare subjects’ object-recognition
performance when two types of pictures
(actual pictures and schematic pictures) are
shown one at a time.
Schematic pictures depicted by
geons
Actual pictures
Experiment:
• The subject indicated the name of the object
shown on the screen.
• In one case, an actual picture of the object
was shown.
• In the other case, a schematic illustration of
the object (depicted by geons) was shown.
Questions/design:
• Do subjects name the objects more quickly
and accurately when actual pictures are
shown or when schematic pictures (depicted
by geons) are shown?
• Each picture (either actual or schematic) was
flashed on the computer screen for only
50 ms, 65 ms, or 400 ms.
Why 50 ms? Or 400 ms?
• The task needs to be neither too easy nor
too difficult.
– To test college students’ math ability,
would you test adding and subtracting?
– To test high school students’ math ability,
would you give quantum mechanics questions?
Results:
Error rates (%)
Presentation     50 ms   65 ms   400 ms
Line drawings    20      13      5
Photographs      35      9       3

Response times (ms, correct responses only)
Presentation     50 ms   65 ms   400 ms
Line drawings    990     870     860
Photographs      1010    875     850
The intelligence of human object
perception
• Why are humans much better than
computers at object recognition?
• Theory of unconscious inference
– Human object perception is like problem
solving.
• We make an unconscious inference.
– Likelihood principle
• objects are perceived based on what
is most likely to have caused the
pattern.
– Humans have a vast array of
knowledge (intelligence) that can
disambiguate ambiguous stimuli.
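One common way to make the likelihood principle concrete is Bayesian inference: weigh how well each candidate cause explains the image by how plausible that cause is beforehand, then pick the winner. This framing is an assumption added here, not something the slides state, and the function name, candidate causes, and all numbers below are hypothetical.

```python
def most_likely_cause(priors, likelihoods):
    """Return (best cause, posteriors), with posterior proportional to prior * likelihood."""
    unnormalized = {cause: priors[cause] * likelihoods[cause] for cause in priors}
    total = sum(unnormalized.values())
    posteriors = {cause: value / total for cause, value in unnormalized.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors

# Hypothetical background knowledge: how common each cause is in the world.
priors = {
    "small square nearby": 0.6,
    "large tilted trapezoid far away": 0.1,
    "other shape": 0.3,
}
# Hypothetical fit: how well each cause explains a square retinal image.
likelihoods = {
    "small square nearby": 0.9,
    "large tilted trapezoid far away": 0.9,
    "other shape": 0.2,
}

best, posteriors = most_likely_cause(priors, likelihoods)
print(best)  # small square nearby
```

Two causes explain the image equally well, and it is the prior knowledge that breaks the tie, which is the role the slides give to background knowledge.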