Cognitive Processes PSY 334

Chapter 2 – Perception
Object Recognition
 Two stages:


Early phase – shapes and objects are
extracted from background.
Later phase – shapes and objects are
categorized, recognized, named.
Disruptions of Perception
 Visual agnosias – impairment of ability to
recognize objects.

Demonstrate that shape extraction and shape
recognition are separate processes.
 Apperceptive agnosia (lateral) – problems with
early processing (shape extraction).
 Associative agnosia (bilateral) – problems with
later processing (recognition).
 Prosopagnosia – visual agnosia for faces.
Tests for Apperceptive Agnosia
Some patients would have
trouble drawing this chair due to
the missing contours.
Some patients would have
trouble recognizing a chair
from this perspective.
Tests for Associative Agnosia
The subject can copy the
anchor accurately (as
shown) but then cannot tell
you what it is.
Early Visual Processing
 Parts of the eye
 Two kinds of photoreceptors:


Rods respond to motion, light & dark
Cones respond to color, shape, detail
 Fovea is the area of the retina with
highest resolution – best for seeing
detail.

We move our eyes so light hits the fovea.
The Eye
Later Visual Processing
 Neural pathways from the eyes to the
visual cortex split at the optic chiasm.


Info from the left visual field goes to the
right hemisphere.
Info from the right visual field goes to the
left hemisphere.
 Two pathways from the visual cortex:


“Where” pathway
“What” pathway
Pathways to the Visual Cortex
Pathways Forward
Information Coding
 On-off cells in LGN feed into edge and
bar detectors in the visual cortex.
 Edge detectors – respond positively to
light on one side of a line, negatively on
the other side of the line.
 Bar detectors – respond maximally to a
bar of light covering their center.
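The difference between the two detector types can be sketched as a weighted sum over a small patch of light intensities. This is a toy illustration only: the one-dimensional receptive fields, the weights, and the names `EDGE_WEIGHTS`, `BAR_WEIGHTS`, and `response` are assumptions made for the example, not values from the lecture.

```python
# Toy 1-D receptive fields. The weights are illustrative assumptions,
# not measurements from real neurons.
EDGE_WEIGHTS = [1, 1, 1, -1, -1, -1]   # positive on one side, negative on the other
BAR_WEIGHTS = [-1, -1, 2, 2, -1, -1]   # positive center, negative flanks


def response(weights, patch):
    """Weighted sum of the light intensities falling on the receptive field."""
    return sum(w * x for w, x in zip(weights, patch))


PATCHES = {
    "uniform": [1, 1, 1, 1, 1, 1],   # evenly lit
    "edge":    [1, 1, 1, 0, 0, 0],   # light on the left, dark on the right
    "bar":     [0, 0, 1, 1, 0, 0],   # bar of light over the center
}

edge_responses = {name: response(EDGE_WEIGHTS, p) for name, p in PATCHES.items()}
bar_responses = {name: response(BAR_WEIGHTS, p) for name, p in PATCHES.items()}

print(edge_responses)  # the edge detector responds only to the edge patch
print(bar_responses)   # the bar detector responds only to the bar patch
```

With these assumed weights, each detector gives zero response to even illumination and its largest response to its preferred pattern.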
Edge and Bar Detectors
Edge & Bar Detectors (Cont.)
Computer Edge Detection
Feature Maps
 In addition to edges, lines, bars, other
information is extracted from the visual
signal:


Color
Motion
 These aspects, called “features,” are
represented in feature maps located in
different areas of the brain.
Depth Perception
 Our eyes turn a three-dimensional world
into a two-dimensional image on the
retina.
 Our cortex turns that two-dimensional
image back into three dimensions
(depth).
 Cues are used to infer distance.


Cues must be learned through experience.
Depth cues in art: http://psych.hanover.edu/KRANTZ/art/cues.html
Depth Cues
Optic flow
Nearer things move faster, farther things move slower
Size Constancy is Mental
The same photo
The same photo again
Marr
 Depth cues (texture gradient, stereopsis,
motion parallax) – where are edges in
space?
 How are visual cues combined to form
an image with depth?


2-1/2 D sketch – identifies where visual
features are in relation to observer.
3-D model – refers to the representation of
the objects in a scene.
Pattern Recognition
 Classification and recognition of objects
occurs through processes of pattern
recognition.
 Bottom-up processes – feature detection
 Top-down processes -- conceptually
driven processing
Top-Down Processing
Why do we see an H in the first word but an A in the second word?
Gestalt Principles
 Wertheimer, Koffka, Kohler.
 Form perception – segregation of a
display into objects and background.
 Principles of perceptual organization
allow us to see “wholes” (gestalts)
formed of parts.

We do not recognize objects by identifying
individual features.
Five Principles
 Proximity
 Similarity
 Good continuation
 Closure
 Common fate

Elements that move together group
together.
 These will be on the midterm.
Examples (Fig 2-13)
proximity
similarity
good continuation
closure
Examples
• Gestalt principles of organization:
http://psych.hanover.edu/Krantz/sen_tut.html
• Illusory contours:
http://psych.hanover.edu/JavaTest/Media/Chapter5/MedFig.IllusoryContour.html
• Reversible figures:
http://www.psy.ritsumei.ac.jp/~akitaoka/reversiblee.html
• Apparent motion demos:
http://psy.ucsd.edu/~sanstis/SACamov.html
http://www.michaelbach.de/ot/mot_biomot/index.html
http://www.lifesci.sussex.ac.uk/home/George_Mather/BM_ECVP_2006.htm
Law of Pragnanz
 Of all the possible interpretations, we will
select the one that yields the simplest or
most stable form.
 Simple, symmetrical forms are seen
more easily.
 In compound letters, the larger figure
dominates the smaller ones.
Law of Pragnanz
People are more likely to see (b) and (c), not (d) or (e), in
figure (a).
Visual Illusions
 Depend on experience.

Influenced by culture.
 Illustrate normal perceptual processes.

These are not errors but rather failures of
perception in unusual situations.
 Try some yourself:

http://www.michaelbach.de/ot/
Visual Pattern Recognition
 Bottom-up approaches:



Template-matching
Feature analysis
Recognition by components
Template-Matching
 A retinal image of an object is compared
directly to stored patterns (templates).


The object is recognized as the template
that gives the best match.
Used by computers to recognize patterns.
 Evidence shows human recognition is
more flexible than template-matching:

Items that vary in size, place, orientation, or shape, or that
are blurred or broken (ambiguous or degraded), are
easily recognized by people.
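A minimal sketch of template-matching, assuming tiny 3x3 binary images and a simple cell-by-cell agreement score. The templates and the names `TEMPLATES`, `match_score`, and `recognize` are hypothetical and chosen only for illustration. It also shows the inflexibility noted above: an upside-down T fits the T template no better than it fits the L template.

```python
# Hypothetical 3x3 templates: '#' is a filled cell, '.' is an empty cell.
TEMPLATES = {
    "T": ["###",
          ".#.",
          ".#."],
    "L": ["#..",
          "#..",
          "###"],
}


def match_score(image, template):
    """Count the cells where the image and the template agree (maximum 9)."""
    return sum(a == b
               for img_row, tpl_row in zip(image, template)
               for a, b in zip(img_row, tpl_row))


def recognize(image):
    """Return the label of the template with the best cell-by-cell match."""
    return max(TEMPLATES, key=lambda label: match_score(image, TEMPLATES[label]))


degraded_t = ["###",
              ".#.",
              "..."]   # a T with one cell missing
inverted_t = [".#.",
              ".#.",
              "###"]   # the same T turned upside down

print(recognize(degraded_t))                     # still the best match for "T"
print(match_score(inverted_t, TEMPLATES["T"]))   # 5 of 9 cells agree
print(match_score(inverted_t, TEMPLATES["L"]))   # also 5 of 9 cells agree
```

A small amount of degradation is tolerated, but a change of orientation removes the advantage of the correct template entirely.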
Example from the Internet
Feature Analysis
 Stimuli are combinations of elemental
features.


Features are recognized and combined.
Features are like output of edge detectors.
 Features are simpler, so problems of
orientation, size, etc., can be solved.
 Relationships among features are
specified to define the pattern.
Features of Letters
Evidence for Feature Analysis
 Confusions – people make more errors
when letters presented at brief intervals
contain similar features:

G misclassified: as C (21), as O (6), as B
(1), as 9 (1)
 When a retinal image is held constant,
the parts of the object disappear:


Whole features disappear.
The remaining parts form new patterns.
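The letter-confusion evidence above can be illustrated by treating each letter as a set of features and measuring overlap. This is only a sketch: the feature lists in `LETTER_FEATURES` are hypothetical, picked for the example rather than taken from the lecture or from any published feature inventory, and the overlap measure (Jaccard similarity) is an assumption.

```python
# Hypothetical feature lists, chosen only for illustration.
LETTER_FEATURES = {
    "G": {"curve", "opening on right", "horizontal bar"},
    "C": {"curve", "opening on right"},
    "O": {"curve", "closed"},
    "B": {"curve", "closed", "vertical line"},
}


def similarity(a, b):
    """Jaccard overlap: shared features divided by all features of either letter."""
    fa, fb = LETTER_FEATURES[a], LETTER_FEATURES[b]
    return len(fa & fb) / len(fa | fb)


def likely_confusions(target):
    """Other letters ordered from most to least feature overlap with the target."""
    others = [letter for letter in LETTER_FEATURES if letter != target]
    return sorted(others, key=lambda letter: similarity(target, letter), reverse=True)


ranking = likely_confusions("G")
print(ranking)  # → ['C', 'O', 'B']
```

Under these assumed features, the letter sharing the most features with G is C, which is also the most frequent misclassification of G in the confusion data.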
Object Recognition
 Biederman’s recognition-by-components:



Parts of the larger object are recognized as
subobjects.
Subobjects are categorized into types of
geons – geometric ions.
The larger object is recognized as a
pattern formed by combining geons.
 Only edges are needed to recognize
geons.
Sample Geons
Biederman’s Stimuli
Tests of Biederman’s Theory
 Object recognition should be mediated
by recognition of object components.
 Two types of degraded figures presented
for brief intervals:


Components (geons) missing
Line segments missing
 At fast intervals (65-100 ms) subjects
could not recognize components when
segments were missing.
Biederman’s Results
Face Recognition
 Prosopagnosia – inability to recognize
familiar faces.
 Are faces special?




Thatcher effect
Damage to fusiform gyrus causes
prosopagnosia.
The area may also be used for fine-grained
distinctions needed to recognize faces but
also other objects.
Bird, car & greeble experts all use it.
The Fusiform Face Area: Identification of Faces and Members of Categories
Prosopagnosia
http://www.psy.vanderbilt.edu/faculty/gauthier/picts/mona_lisa.jpg
Thatcher Illusion (without Thatcher)
Thatcher Illusion (Cont.)
Why did it look more normal when viewed upside down?
Greebles & Faces
Figure 4.24 (a) Greeble stimuli used by Gauthier. Participants were trained to name each different Greeble. (b)
Brain responses to Greebles and faces before and after Greeble training. (a: From Figure 1a, p. 569, from
Gauthier, I., Tarr, M. J., Anderson, A. W., Skudlarski, P. L., & Gore, J. C. (1999). Activation of the middle fusiform
“face area” increases with experience in recognizing novel objects. Nature Neuroscience, 2, 568-573.)
Speech Recognition
 The physical speech signal is not broken
up into parts that correspond to
recognizable units of speech.



Undiminished sound energy at word
boundaries – gaps are illusory.
Cessation of speech energy in the middle
of words.
Word boundaries cannot be heard in an
unfamiliar language.
Phoneme Perception
 No one-to-one letter-to-sound
correspondence.
 Speech is continuous – phonemes are
not discrete (separate) but run together.
 Speakers vary in how they produce the
same phoneme.
 Coarticulation – phonemes overlap.

The sound produced depends on the
sound immediately preceding it.
Feature Analysis of Speech
 Features of phonemes appear to be:



Consonantal feature (consonant vs vowel).
Voicing – do vocal cords vibrate or not.
Place of articulation – where the vocal
tract is constricted (where the tongue is
placed).
 The phoneme heard by listeners
changes as you vary these features.

Sounds with similar features are confused.
Categorical Perception
 For speech, perception does not change
continuously but abruptly at a category
boundary.
 Categorical perception – failure to
perceive gradations among stimuli within
a category.

Pairs of [b]’s or [p]’s sound alike despite
differing in voice-onset times.
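The abrupt change at a category boundary can be sketched with a steep logistic identification function over voice-onset time. The boundary location (`BOUNDARY_MS`) and steepness (`SLOPE`) are assumed values picked for illustration, not figures from the lecture, and the logistic form itself is a modeling assumption.

```python
import math

BOUNDARY_MS = 25.0   # assumed /b/-/p/ boundary in voice-onset time (ms)
SLOPE = 0.5          # assumed steepness of the identification curve


def prob_p(vot_ms):
    """Probability of labeling a sound /p/, as a logistic function of voice-onset time."""
    return 1.0 / (1.0 + math.exp(-SLOPE * (vot_ms - BOUNDARY_MS)))


def label(vot_ms):
    """Category heard by the listener: 'p' above the boundary, 'b' below it."""
    return "p" if prob_p(vot_ms) >= 0.5 else "b"


# Two pairs of sounds, each differing by the same 10 ms of voice-onset time.
within_category = (0.0, 10.0)    # both fall on the /b/ side
across_boundary = (20.0, 30.0)   # straddle the assumed boundary

for low, high in (within_category, across_boundary):
    change = prob_p(high) - prob_p(low)
    print(f"{low:.0f} ms -> {label(low)}, {high:.0f} ms -> {label(high)}, "
          f"change in P(/p/) = {change:.3f}")
```

The same 10 ms physical difference barely changes the response within a category but flips the label when it crosses the boundary.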
Perception of /b/ versus /p/
/b/ versus /p/
Phoneme Perception Results
Notice the abrupt shift from /b/ to /p/.
Two Views of Categorical Perception
 Weak view – stimuli are grouped into
recognizable categories.
 Strong view – we cannot discriminate
among items within such a category.
 Massaro – people can discriminate
within category but have a bias to say
items are the same despite differences.
 Category boundaries can be shifted by
fatiguing the feature detectors.
Top Down Processing
 General knowledge (context, high-level
thinking) combines with interpretation of
low-level perceptual units (features).
 Context limits the possibilities so fewer
features must be processed:


Word superiority effect – D or K vs WORD
or WORK – words do 10% better.
To xllxstxatx, I cxn rxplxce xvexy txirx
lextex of x sextexce xitx an x, anx yox stxll
xan xanxge xo rxad xt wixh sxme xifxicxltx.
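The degraded sentence above can be reproduced approximately with a few lines of code. This sketch assumes that only letters are counted (spaces and punctuation are skipped) and that every third letter is replaced. The function name `replace_every_third_letter` is hypothetical, and the slide's own text may not follow this counting rule at every position.

```python
def replace_every_third_letter(text, filler="x"):
    """Replace every third letter with the filler, leaving other characters alone."""
    out = []
    letters_seen = 0
    for ch in text:
        if ch.isalpha():
            letters_seen += 1
            # Letters 3, 6, 9, ... are replaced; all others are kept.
            out.append(filler if letters_seen % 3 == 0 else ch)
        else:
            out.append(ch)   # spaces and punctuation do not count as letters
    return "".join(out)


print(replace_every_third_letter("To illustrate"))  # → To xllxstxatx
```

Because a third of the letters are gone, readers must rely on word and sentence context to fill in the missing information.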
Word Superiority Effect
K
WORK
OWRK
K
D
Subjects seeing the letter K in the
context of a word did 10% better
than the other conditions.
Context and Speech
 Phoneme restoration effect:




It was found that the *eel was on the axle.
It was found that the *eel was on the shoe.
It was found that the *eel was on the
orange.
It was found that the *eel was on the table.
 The identification of the missing word
depends on what happens after it.
Models of Object Perception
 Two competing models explain how
context and feature information are
combined:


Massaro’s FLMP (fuzzy logic model of
perception) -- Context and detail are two
independent sources of information.
McClelland & Rumelhart’s PDP model –
connectionist model in which both sources
of information interact.
Testing the FLMP Model
 Four kinds of stimuli:




Only an e can make a real word.
Only a c can make a real word.
Both letters can make a word.
Neither letter can make a word.
 Within each group, stimuli go from e to c.
 Subjects saw each stimulus word briefly
and had to identify the letter, e or c.
Testing the Fuzzy Logic Model
FLMP Results
 Observed frequencies for naming a letter
e increase as it has more e features, but
also as the context demands an e.
 Bayes' theorem gives a formula for
combining the independent contributions
of two sources of information.
 Massaro's results conform to predictions
of Bayes' theorem, suggesting that the
information sources must be
independent of each other.
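One common way to state the combination of two independent sources for a two-choice decision is to multiply the support each source gives to "e" and normalize against the product of the support for "c". The sketch below assumes that form. The function name `combine` and the numeric support values are hypothetical and are not Massaro's data.

```python
def combine(feature_support, context_support):
    """Combine two independent supports for 'e' (each between 0 and 1).

    The supports are multiplied and then normalized against the product
    of the supports for the alternative, 'c'. The result is undefined
    when the two sources flatly contradict each other (1 and 0).
    """
    for_e = feature_support * context_support
    for_c = (1.0 - feature_support) * (1.0 - context_support)
    return for_e / (for_e + for_c)


# Assumed values: 0.5 means the source is neutral between e and c.
print(combine(0.5, 0.5))   # both neutral: 0.5
print(combine(0.5, 0.9))   # ambiguous letter, context favors e: about 0.9
print(combine(0.8, 0.9))   # features and context both favor e: about 0.97
print(combine(0.2, 0.9))   # features favor c, context favors e: about 0.69
```

The probability of reporting "e" rises with either source alone, and each source matters most when the other is ambiguous.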
Testing the PDP Model
 Activation spreads from features to
excite letters and from letters to excite
words (bottom up processing).
 Activation also spreads from words to
the component letters (top-down
processing).
 The more activation, the more likely the
correct letter will be identified:

TRAP vs TRIP
Faces and Scenes
 When parts are presented in isolation,
more feature information is needed to
recognize them.


Face parts are recognized with less detail
when in the context of a face.
Subjects are better able to identify objects
when they are part of coherent novel
scenes rather than jumbled scenes.
Jumbled Scenes
The same details are in both stimuli but people identify more
objects when the overall scene makes sense.
Change Blindness
 People cannot keep track of all of the
information in a complex scene.
 If change occurs during a scene-cut or
eye movement and it fits the context, it
may be missed.



Large changes can be overlooked.
7 of 15 participants noticed that the person
changed entirely while giving directions
Demo: http://viscog.beckman.illinois.edu/djs_lab/demos.html
Change Blindness
Marr
 Depth cues (texture gradient, stereopsis)
– where are edges in space?
 How are visual cues combined to form
an image with depth?



Primal sketch – extracts features.
2-1/2 D sketch – identifies where visual
features are in relation to observer (depth).
3-D model – refers to the representation of
the objects in a scene, combines context.
Stages in Marr’s Model
Putting it All Together
 The output of these stages is a
representation of an object and its
location.
 This output is used as input to higher-level cognitive processes.
 Conscious awareness (a higher-level
process) involves the recognition stage,
but lots of processing occurs first.