Cognition and Perception
• This is not a rose.
This is not a pipe.
“Just try stuffing tobacco in it!”
– Rene Magritte, 1930
The myth of vision as a faithful record
Concentric circles or continuous spiral?
The pattern of light is of concentric circles
Human vision sees a continuous spiral
Gestalt
• The whole is greater
than the sum of its parts
• Law of Pragnanz (“good
figure”): We perceive
things in the way that is
simplest to organize
them into cohesive and
constant objects.
Gestalt Laws
Laws of Figure-Ground Segregation
1. Convex region becomes figure
2. Smaller region becomes figure
3. Moving region becomes figure
4. Symmetric ("good") region becomes figure
5. Nearer region becomes figure (multiple
depth cues apply)
Gestalt Laws
Laws of Grouping
• 1. Proximity
• 2. Similarity
• 3. Common fate
• 4. Good continuation
• 5. Closure/ convexity
• 6. Common region
• 7. Connectedness
• 8. Parse regions at deep
concavities
• Common Fate
• http://dragon.uml.edu/psych/commfate.html
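As a rough illustration of the proximity law listed above, the following Python sketch groups dots on a line whenever neighbouring dots are closer than a threshold. The function name `group_by_proximity` and the `max_gap` threshold are assumptions made for this example; it is not a model of how the visual system actually computes grouping.

```python
def group_by_proximity(positions, max_gap):
    """Split 1-D dot positions into groups wherever the gap exceeds max_gap."""
    groups = []
    for x in sorted(positions):
        # Start a new group for the first dot or when the gap is too large.
        if not groups or x - groups[-1][-1] > max_gap:
            groups.append([x])
        else:
            groups[-1].append(x)
    return groups


dots = [0, 1, 2, 10, 11, 12, 30]
groups = group_by_proximity(dots, max_gap=3)
# groups == [[0, 1, 2], [10, 11, 12], [30]]
```

Dots that sit close together end up in the same group, which is the sense in which nearby elements are perceived as belonging together.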
Figure 1. A: Kanizsa figure. B: Tse’s volumetric worm. C: Idesawa’s spiky
sphere. D: Tse’s sea monster
Gestalt Laws
Laws of Grouping
Closure/ convexity
The Myth of vision as a passive process
• The Grand illusion of complete perception
– (1) Vision is not rich in detail
• the size of a thumbnail at arm’s length is all that gets
processed
– (2) Attention is limited: the law of ONEs
• vision sees one object, one event, one location
• These two factors are illustrated by
– Impossible triangle
– Escher drawings
– Bistable images
Brains construct a well-behaved 3-D world so we cannot experience a
world that is not. Here we see an ordinary triangle and building with
normal corners and angles instead of the shocking reality. Why?
A perceptually ambiguous wire
cube
How many different
interpretations can you see?
Go to:
http://mindbluff.com/necker.htm
Figure 1.5. “Subjective” perceptions are not necessarily “arbitrary”
perceptions
Why do brains see only two
interpretations instead of
all of them?
Humans bring shared
assumptions to the vision
project, (1) that objects
are generally convex, (2)
that straight lines in a
picture represent straight
edges in an object, and
(3) that three-edge
junctions are generally
right-angled corners.
Bi-stable Images
Law of One in Audition
• Shepard Tone
• http://www.youtube.com/watch?v=DfJa3IC1txI
• Each square in the figure indicates a tone,
any set of squares in vertical alignment
together making one Shepard tone. The
color of each square indicates the loudness
of the note, with purple being the quietest
and green the loudest. Overlapping notes
that play at the same time are exactly one
octave apart, and each scale fades in and
fades out so that hearing the beginning or
end of any given scale is impossible.
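The construction described above can be sketched in Python: each Shepard tone is a stack of components exactly one octave apart, with loudness fading toward both ends of the range. The base frequency (27.5 Hz), the number of octaves, and the raised-cosine loudness envelope are assumptions chosen for illustration, not values taken from the figure.

```python
import math


def shepard_components(step, steps_per_octave=12, n_octaves=8, base_hz=27.5):
    """Return (frequency, loudness) pairs for one Shepard tone.

    Components sit exactly one octave apart. Loudness follows a raised-cosine
    envelope over log-frequency, so the lowest and highest components are
    nearly silent.
    """
    components = []
    fraction = (step % steps_per_octave) / steps_per_octave
    for octave in range(n_octaves):
        # Position of this component within the whole range, from 0 to 1.
        position = (octave + fraction) / n_octaves
        freq = base_hz * 2 ** (octave + fraction)
        loudness = 0.5 * (1 - math.cos(2 * math.pi * position))
        components.append((freq, loudness))
    return components
```

Because `step` wraps around after one octave, step 12 produces the same components as step 0, which is why the scale can appear to rise forever without a clear beginning or end.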
Demos
• Charlie Chaplin mask demo
– http://www.youtube.com/watch?v=QbKw0_v2clo&feature=related
• Visual Illusions
– http://www.michaelbach.de/ot/
• Moving random dot stereogram
– http://dragon.uml.edu/psych/commfate.html
• Spinning silhouette
– http://www.youtube.com/watch?v=uBTvKboX84E
• Gestalt Illusions
– http://www.opprints.co.uk/gallery.php
Object Recognition
• Mike the blind guy given sight
• http://www.youtube.com/watch?v=VVgfC_FV2hI&feature=PlayList&p=32BC95C9D7E5959C&index=1
Object Recognition
(Called Pattern Recognition in Book)
• How do you solve problem of Object
Constancy?
– How does the brain know the objects are the same
despite change in perspective?
What letter are these, and how do you
know?
A
A
A
A
Object Recognition
Receptive Fields of cortical neurons—
Primary Visual cortex
• 1. Simple Cells
--respond to points of light or
bars of light in a particular
orientation
• 2. Complex cells
--respond to bars of light in a
particular orientation moving in a
specific direction.
• 3. Hypercomplex cells
--respond to bars of light in a
particular orientation, moving in
a specific direction, & of a
specific line length.
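To make orientation selectivity concrete, here is a minimal Python sketch of a simple-cell-like unit. The 3x3 weight pattern (excitatory centre column, inhibitory flanks) and the names `VERTICAL_FIELD` and `simple_cell_response` are hypothetical, chosen only to show how a weighted sum over a receptive field can favour one orientation; it is not a physiological model.

```python
# Hypothetical 3x3 receptive field tuned to a vertical bar:
# excitatory centre column (+2), inhibitory flanks (-1).
VERTICAL_FIELD = [
    [-1, 2, -1],
    [-1, 2, -1],
    [-1, 2, -1],
]


def simple_cell_response(patch, field=VERTICAL_FIELD):
    """Weighted sum of light in the patch, rectified so firing is never negative."""
    total = sum(
        field[r][c] * patch[r][c]
        for r in range(3)
        for c in range(3)
    )
    return max(0, total)


vertical_bar = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
horizontal_bar = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]
# simple_cell_response(vertical_bar) == 6
# simple_cell_response(horizontal_bar) == 0
```

The unit responds strongly to a bar in its preferred orientation and not at all to the orthogonal bar or to uniform illumination, because excitation and inhibition cancel in those cases.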
What is the organization of the visual
cortex?
• Hubel & Wiesel found
that the visual cortex is
organized into columns.
• Location specific: For
each place on the retina
there is a column of cells
in cortex.
• Two columns next to one another in the cortex respond
to stimulation of two adjacent points on the retina.
Spatial Frequency
• These grids are low
to high spatial
frequencies.
• Many light bars /
square = High S.F.
• Few light bars /
square = Low S.F.
• Part of vision’s
organization
Spatial Frequency
• By playing with spatial
frequency, you can
induce the intense
luminance perception of
a bright sun.
Spatial Frequencies Work Together
• Low S.F. gives you outlines; high S.F. gives you details.
• A broad spectrum gives you both local and global features.
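The split between outlines and details can be illustrated with a one-dimensional Python sketch. A moving average stands in for a low spatial frequency channel and the residual stands in for a high spatial frequency channel; the `radius` parameter and the function names are assumptions for this example, and real spatial frequency channels are not simple moving averages.

```python
def low_pass(signal, radius=2):
    """Moving average: keeps slow (low spatial frequency) changes."""
    smoothed = []
    for i in range(len(signal)):
        window = signal[max(0, i - radius): i + radius + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def high_pass(signal, radius=2):
    """Residual after smoothing: keeps fast (high spatial frequency) changes."""
    return [s - l for s, l in zip(signal, low_pass(signal, radius))]
```

Adding the two outputs back together recovers the original signal, which mirrors the point that the channels work together: one carries the coarse layout, the other the fine detail.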
Bottom-Up Processing
– Perception comes from the stimuli in
the environment
– Parts are identified, put together, and
then recognition occurs
– Context does not matter
Gibson’s Direct Perception
(Bottom-Up)
• All the information needed to form a perception
is available in the environment
• Perception is immediate and spontaneous
• Affordances and attunements
– Perception and action cannot be separated
– Action defines the meaningful parameters of
perception and provides new ways of perceiving
Top-down Processing
• Perception is not
automatic from raw
stimuli
• Context is needed to
build perception
• Meaning is constructed
by making inferences,
guessing from
experience, and basing
one perception on
another
Context helps us to be able to recognize letters in many different
styles.
Template Theory:
Perception as a Cookie Cutter
• Basics of template theory
– Multiple templates are held in memory
– Compare stimuli to templates in memory for one
with greatest overlap until a match is found
Search memory for a match
See stimuli
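The matching step described above can be sketched in a few lines of Python. The tiny 3x3 letter grids and the names `overlap`, `match_template`, and `TEMPLATES` are hypothetical, made up for this illustration of comparing a stimulus against stored templates and choosing the one with the greatest overlap.

```python
def overlap(stimulus, template):
    """Count grid cells where stimulus and template agree."""
    return sum(
        s == t
        for s_row, t_row in zip(stimulus, template)
        for s, t in zip(s_row, t_row)
    )


def match_template(stimulus, templates):
    """Return the label of the stored template with the greatest overlap."""
    return max(templates, key=lambda label: overlap(stimulus, templates[label]))


# Hypothetical templates held in memory ('#' = ink, '.' = blank).
TEMPLATES = {
    "T": ["###", ".#.", ".#."],
    "L": ["#..", "#..", "###"],
}

noisy_t = ["###", ".#.", "..."]
# match_template(noisy_t, TEMPLATES) == "T"
```

Because overlap is computed cell by cell, this sketch only works when the stimulus and the template share the same size, position, and orientation.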
Template Theory
• Weakness of theory
– Problem of imperfect matches
– Cannot account for the flexibility of pattern
recognition system
– More problems…
Search for match in memory
See stimuli
No perfect match in memory
Template Theory
• More Weaknesses of theory
– Comparison requires identical orientation, size,
position of template to stimuli
– Does not explain how two patterns differ
• e.g., there’s something wrong with this, but I can’t put
my finger on it – AHA! I see!
Feature Theories
• Recognize objects on the basis of a small
number of characteristics (features)
– Detect specific elements and assemble them into
more complex forms
– Brain cells that respond to specific features, such as
lines and angles, are referred to as “feature
detectors”
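A feature-based recognizer can be sketched as follows, in contrast to whole-pattern matching. The feature inventory in `LETTER_FEATURES` and the scoring rule are assumptions invented for this example; which features the visual system actually uses is an empirical question.

```python
# Hypothetical feature inventory for three letters.
LETTER_FEATURES = {
    "A": {"left_diagonal", "right_diagonal", "horizontal_bar"},
    "H": {"left_vertical", "right_vertical", "horizontal_bar"},
    "L": {"left_vertical", "bottom_bar"},
}


def recognize(detected):
    """Pick the letter whose feature set best matches the detected features."""
    def score(letter):
        wanted = LETTER_FEATURES[letter]
        # Reward shared features, penalise missing or extra ones.
        return len(wanted & detected) - len(wanted ^ detected)
    return max(LETTER_FEATURES, key=score)


# recognize({"left_diagonal", "right_diagonal", "horizontal_bar"}) == "A"
# recognize({"left_vertical", "right_vertical"}) == "H"
```

Since only the presence of features is scored, the same letter is recognized regardless of its size or exact position, and a partly missing feature set can still produce the best available match.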
Two Feature Theories of Object Recognition
• Recognition By Components (Biederman; Marr)
vs.
• View-Based Recognition (Tarr; Bülthoff)
Superquadrics (Pentland, 1986)
Geons (Biederman, 1987)
Generalized Cylinders
(Binford, 1971; Marr, 1982)
• Recognition By
Components (Biederman)
– Basic set of geometrical
shapes
• Geons (“geometric” + “ions”)
• Distinguishable from almost
any viewing angle
• Recognizable even with
occlusion
– “Grammatical” relationship
b/w parts
– Part-whole hierarchies
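One way to picture the “grammatical” relationship between parts is as a structural description: a set of geons plus how each one attaches. The Python sketch below uses the commonly cited cup and pail contrast, where the same two geons yield different objects depending on where the handle attaches. The `Part` type, the relation labels, and the `OBJECTS` table are hypothetical simplifications for illustration.

```python
from typing import NamedTuple


class Part(NamedTuple):
    geon: str
    relation: str  # how the part attaches to the object


# Hypothetical structural descriptions: same geons, different relations.
OBJECTS = {
    "cup": frozenset({Part("cylinder", "body"),
                      Part("curved handle", "side-attached")}),
    "pail": frozenset({Part("cylinder", "body"),
                       Part("curved handle", "top-attached")}),
}


def identify(parts):
    """Return the object whose structural description equals the given parts."""
    wanted = frozenset(parts)
    for name, description in OBJECTS.items():
        if description == wanted:
            return name
    return None
```

For example, `identify([Part("cylinder", "body"), Part("curved handle", "top-attached")])` returns `"pail"`, while the same geons with a side-attached handle return `"cup"`.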
Evidence of Geons
• Biederman (1987)
Can you identify these objects?
These objects have been rendered unidentifiable
because their geons are nonrecoverable
Evidence of Geons
• Biederman (1987)
• Can you identify these objects?
The same amount of each object has been removed here, but
because the geons can still be recovered, the objects can
still be identified
Testing Biederman
• Objects are
decomposed
– Omitting Vertices
– Retaining Vertices
• In accordance with
theory, easier to
identify object with
vertices
Object Recognition
• Pros
– Explains why it can be hard to recognize familiar
objects from highly unusual perspectives
• Cons
– Absence of physiological evidence
– Does not explain expert discriminations or quirks
of facial recognition
Marr’s Computational Approach
• Primal Sketch: 2-D description includes
changes in light intensity, edges, contours,
blobs
• 2½-D Sketch: Includes information about
depth, motion, shading. Representation is
observer-centered
• 3-D Representation: A representation of
objects and their relationships, observer-independent.
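The first stage, finding changes in light intensity, can be sketched with a one-dimensional example in Python. The function name `find_edges` and the fixed `threshold` are assumptions for illustration; Marr's own proposal used more elaborate filtering, which this sketch does not reproduce.

```python
def find_edges(intensities, threshold=50):
    """Return indices where intensity jumps by more than threshold between neighbours."""
    return [
        i
        for i in range(1, len(intensities))
        if abs(intensities[i] - intensities[i - 1]) > threshold
    ]


scan_line = [10, 12, 11, 200, 198, 20, 22]
# find_edges(scan_line) == [3, 5]
```

The small fluctuations are ignored and only the two sharp intensity changes are marked, which is the kind of raw edge information a primal sketch would pass on to later stages.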
View-Based Recognition
• Tarr; Bülthoff
– Multiple stored views of objects
– Viewer-centered frame of reference
– Specific views correspond to specific patterns of
neural activation (possibly involves “place
neurons”)
– Match b/w current and stored pattern of activation
– Interpolating (“educated guessing” or impletion)
b/w seen and stored views
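Interpolation between stored views can be sketched as follows. Here each stored view is reduced to a viewing angle and a short list of numbers standing in for a pattern of activation; both the representation and the use of linear interpolation are simplifying assumptions for this example, not the mechanism proposed by Tarr or Bülthoff.

```python
def interpolate_view(stored_views, angle):
    """Linearly interpolate a feature vector between the two nearest stored views.

    stored_views: list of (angle_in_degrees, feature_vector), sorted by angle.
    """
    for (a0, f0), (a1, f1) in zip(stored_views, stored_views[1:]):
        if a0 <= angle <= a1:
            w = (angle - a0) / (a1 - a0)
            return [(1 - w) * x0 + w * x1 for x0, x1 in zip(f0, f1)]
    raise ValueError("angle lies outside the range of stored views")


# Hypothetical stored views of one object at 0 and 90 degrees.
views = [(0, [1.0, 0.0]), (90, [0.0, 1.0])]
# interpolate_view(views, 45) == [0.5, 0.5]
```

A novel view at 45 degrees has never been stored, but an estimate of what it should look like can be built from the two neighbouring views and compared with the current input.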
The End
Opponent Process in a Movement
Illusion: Waterfall Effect
• http://video.google.com/videoplay?docid=6294268981850523944&ei=r5PRSNGPD6fcqAPS48y6Ag&q=spiral+visual+illusion&vt=lf&hl=en
• http://video.google.com/videoplay?docid=2927422796086500362&vt=lf&hl=en
Cognition and Perception
• The finished files are the result of years of
scientific study combined with the experience
of many years.
Two Visual Systems
What your hands see differs from what the eyes see
• Ventral ‘What’ system
• Dorsal ‘Where/ How’ system
• Brain lesions
– Ventral lesions: patients cannot name telephone
but mime using it
– Dorsal lesions: can name it, but reach in wrong
direction for it
• Roelofs Effect
Top-Down & Bottom-Up
Orientation & Ocular Dominance
columns in Primary Visual Cortex
Simple Cells
Complex Cells
What is a receptive field of retinal
ganglion cells?
• The receptive field for these cells is the
region of the retina that, when stimulated,
excites or inhibits the cell’s firing pattern.
The Visual cortex has a retinotopic
map
• Visual cortex has a map of the retina’s surface.
• More cortical neurons are devoted to fovea of
retina.
• Because the fovea contains only cones, cones are widely mapped
on the cortex’s surface.
• The reason: cones allow us to see detail & color.
Spatial Frequency in Action
• http://www.metacafe.com/watch/1749277/animated_optical_illusions/
Top-down Processing Evidence
• Context effects
Context helps us to be able to recognize letters in many different
styles.
Theories
• Template Matching
• Prototype
• Feature Matching
• Object-Based
• Viewer-Based
Change Blindness
• Counter experiment: http://www.youtube.com/watch?v=mAnKvo-fPs0
• Campus door demo: http://viscog.beckman.uiuc.edu/flashmovie/12.php
• Construction door: http://viscog.beckman.uiuc.edu/flashmovie/10.php
• Gradual change: http://viscog.beckman.uiuc.edu/flashmovie/1.php
Prototype Theories
• Modification of template matching (flexible
templates)
• A prototype possesses the average of each individual
characteristic across the examples encountered
• No match is perfect; a criterion for matching is
needed
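These two ideas, averaging characteristics and matching against a criterion, can be sketched in Python. Representing each example as a list of numeric characteristics, using Euclidean distance, and the names `make_prototype` and `matches` are all assumptions made for this illustration.

```python
import math


def make_prototype(exemplars):
    """Average each characteristic across the exemplars."""
    n = len(exemplars)
    return [sum(values) / n for values in zip(*exemplars)]


def matches(stimulus, prototype, criterion):
    """Accept the stimulus if it lies within `criterion` of the prototype."""
    return math.dist(stimulus, prototype) <= criterion


# Hypothetical faces described by two measurements each.
faces = [[1.0, 2.0], [3.0, 4.0]]
prototype = make_prototype(faces)
# prototype == [2.0, 3.0]
```

Note that the prototype `[2.0, 3.0]` is not one of the stored examples, yet it is the closest possible match to itself, so a stimulus near it passes the criterion even if it was never seen.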
Prototype Evidence
• Franks & Bransford (1971)
– Presented objects based on prototypes
– Prototype not shown
– Yet participants are confident they had seen
prototype
– Suggests existence of prototypes
Prototype Evidence
• Solso & McCarthy (1981)
– Participants were shown a
series of faces
– Later, a recognition test was
given with some old faces, a
prototype face, and some new
faces that differed in degree
from prototype
Solso & McCarthy (1981) Results
• The red arrow
notes that
participants were
more confident
they had seen the
prototype than
actual items they
had seen
Research on Prototypes
• Researchers have found that participants rate prototypical
faces as more attractive
• Halberstadt & Rhodes (2000)
– Examined the impact of prototypes of dogs,
wristwatches, and birds on attractiveness of the
stimuli
– Results indicate a strong relationship between
averageness and attractiveness of the dogs, birds,
and wristwatches
Feature Evidence
• Hubel & Wiesel (1979) using single cell
technique
– Simple cells detect bars or edges of particular
orientation in particular location
– Complex cells detect bars or edges of particular
orientation, exact location abstracted
– Hypercomplex cells detect particular colors (simple
and complex cells), bars, or edges of particular length
or moving in a particular direction
• Selfridge’s (1959) Pandemonium Model of visual
word perception where “R” is the target letter.
• Feature net model by
Rumelhart and
McClelland (1987). This
is an Interactive
Activation Model, which
means lower and higher
layers can both inhibit
and excite each other,
providing a mechanism
for both top-down and
bottom-up effects.
• Biederman: Stage 1, extract appropriate geon from image, and
stage 2, match to similar representation stored in long-term
memory.
• Biederman proposed that certain properties of 2-D images are
non-accidental, representing real properties in the world.
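The interactive activation idea described above, in which lower and higher layers excite and inhibit each other, can be illustrated with a toy two-layer network in Python. The two-word vocabulary, the update rule, and the `rate` parameter are assumptions invented for this sketch; they are not the equations of the published model.

```python
WORDS = ["CAT", "CAR"]


def step(letter_act, word_act, rate=0.2):
    """One update cycle: letters drive words (bottom-up), words drive letters (top-down)."""
    new_words = {}
    for word in WORDS:
        # Bottom-up: a word is excited by letters in the matching positions.
        support = sum(letter_act.get((i, ch), 0.0) for i, ch in enumerate(word))
        # Words inhibit each other within their layer.
        rivals = sum(a for w, a in word_act.items() if w != word)
        new_words[word] = max(0.0, word_act[word] + rate * (support - rivals))
    new_letters = {}
    for (pos, ch), act in letter_act.items():
        # Top-down: consistent words excite a letter, inconsistent words inhibit it.
        feedback = sum(a if w[pos] == ch else -a for w, a in word_act.items())
        new_letters[(pos, ch)] = max(0.0, act + rate * feedback)
    return new_letters, new_words


# Ambiguous third letter: slightly more evidence for T than for R.
letters = {(0, "C"): 1.0, (1, "A"): 1.0, (2, "T"): 0.6, (2, "R"): 0.4}
words = {"CAT": 0.0, "CAR": 0.0}
for _ in range(3):
    letters, words = step(letters, words)
```

After a few cycles the word CAT is more active than CAR, and its top-down feedback has raised the activation of T while lowering that of R, showing how context can sharpen an ambiguous letter.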
Viewer Based Recognition
• Physiological evidence
• Explains behavioral evidence
• Does not explain how novel objects are learnt