NS2_LowAndMidLevelVi.. - Center for Neural Science

Sensory and Motor Systems (G80.2202)
Psychophysics of Early & Mid-level Vision
Instructor: Nava Rubin
Early Psychophysics (history-wise and level-wise)
Weber’s Law (1834)
let i denote stimulation intensity, and
let Δi denote the minimal increase in intensity that an
observer can detect; the following holds:
Δi / i = constant
Fechner’s insight (1860): this corresponds to ‘constant
increments of sensation’, Δs. Therefore:
Δs = k Δi / i
From here we can deduce the form of s(i): integrate,
s = k log(i) + C
[Figure: s = k log(i) + C plotted against i (0 to 10,000); equal steps Δs on the sensation axis correspond to increasingly large steps Δi on the intensity axis.]
Weber–Fechner law: s = k ln(i)
[sensation ∝ log(intensity of stimulation)]
Example of a behavioral derivation of a neurally-based law
(measured physiologically only later*)
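The step from Weber's law to a logarithmic sensation scale can be checked numerically. The sketch below counts how many just-noticeable differences (JNDs) fit between two intensities, assuming a fixed Weber fraction. The function name `count_jnds` and the 10% Weber fraction are illustrative assumptions, not values from the lecture.

```python
import math

def count_jnds(i_start, i_end, weber_fraction):
    """Count the just-noticeable steps needed to climb from i_start to i_end.

    Each step raises the intensity by the smallest detectable increment,
    delta_i = weber_fraction * i (Weber's law).
    """
    steps = 0
    i = i_start
    while i * (1.0 + weber_fraction) <= i_end:
        i *= 1.0 + weber_fraction
        steps += 1
    return steps

# Illustrative Weber fraction (an assumption, not a measured value).
W = 0.1

# Equal intensity ratios span equal numbers of JNDs ...
steps_low = count_jnds(10.0, 100.0, W)
steps_high = count_jnds(100.0, 1000.0, W)

# ... and the JND count tracks k * ln(ratio) with k = 1 / ln(1 + W).
k = 1.0 / math.log(1.0 + W)
predicted = k * math.log(1000.0 / 100.0)

print(steps_low, steps_high, round(predicted, 2))
```

If each JND is taken as one unit of sensation, the count grows with the logarithm of intensity, which is the content of Fechner's integration step.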
Ernst Weber (1795–1878)
Gustav Fechner (1801–1887)
Luminance Gain Control
(aka “light adaptation”)
[Figure (schematic): sensitivity and normalized RGC response plotted against luminance in the RF center, with one curve per background luminance; receptive-field “center” and “surround” labeled.]
Since luminance can vary over an extremely wide range, the visual
system (specifically, the retinal ganglion cells, RGCs) adjusts its
sensitivity to match the locally prevalent luminance.
This is done by roughly dividing the (within-RF) luminance by the local mean
luminance of the immediate surround (a few degrees outside the RF).
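A minimal sketch of this divisive scheme, assuming the surround is summarized by a plain average of a few luminance samples. The function name `luminance_gain_response` and the sample values are hypothetical, chosen only to illustrate the idea.

```python
def luminance_gain_response(center_lum, surround_lums):
    """Divide the within-RF luminance by the mean luminance of the surround."""
    local_mean = sum(surround_lums) / len(surround_lums)
    return center_lum / local_mean

# The same 20% increment over the background, at two very different light levels.
dim = luminance_gain_response(120.0, [100.0, 100.0, 100.0, 100.0])
bright = luminance_gain_response(12000.0, [10000.0, 10000.0, 10000.0, 10000.0])

print(dim, bright)
```

Because only the ratio to the local mean survives, the output stays in the same range whether the scene is dim or bright.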
Contrast Gain Control
Contrast gain control begins in the retina and is strengthened at
subsequent stages of the visual system. It roughly divides the
responses by a measure that grows with the locally prevalent
root-mean-square (r.m.s.) contrast, i.e., the standard deviation of
the stimulus luminance divided by the mean luminance.
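The r.m.s. contrast definition and the divisive step can be sketched as follows. The divisor form `sigma + contrast` and the constant `sigma` are assumptions made for illustration; the lecture only says that the divisor grows with local r.m.s. contrast.

```python
import statistics

def rms_contrast(luminances):
    """Standard deviation of luminance divided by mean luminance."""
    return statistics.pstdev(luminances) / statistics.mean(luminances)

def contrast_gain_response(linear_drive, luminances, sigma=0.1):
    """Divide a linear response by a term that grows with local r.m.s. contrast.

    sigma is a hypothetical constant that keeps the divisor above zero
    for a uniform patch.
    """
    return linear_drive / (sigma + rms_contrast(luminances))

low_contrast_patch = [90.0, 110.0]    # r.m.s. contrast 0.1
high_contrast_patch = [50.0, 150.0]   # r.m.s. contrast 0.5

# The same linear drive is attenuated more in the high-contrast surround.
print(contrast_gain_response(1.0, low_contrast_patch))
print(contrast_gain_response(1.0, high_contrast_patch))
```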
http://www.uni-mannheim.de/fakul/psycho/irtel/cvd/C4700.html
Young & Helmholtz:
Experiments in Additive Colors
(schematic)
“The Trichromatic Theory of Color”
Color metamers:
Two different spectral distributions that produce the
same perceived color (in a given observer)
i.e.,
Two different spectral distributions that produce the
same stimulation of the L, M and S cones (of a given
observer)
Example: yellow (monochromatic light at ~570 nm, or a mixture of red and green light)
[Figure: spectral sensitivity curves of the ‘S’, ‘M’ and ‘L’ cones]
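The metamer definition can be made concrete with a toy calculation. The cone sensitivities below are hypothetical numbers over four wavelength bins, chosen for easy arithmetic; real cone sensitivities are smooth, overlapping curves. The point is only that two physically different spectra can yield identical L, M and S responses.

```python
# Toy cone sensitivities over four wavelength bins (hypothetical values).
CONES = {
    "S": [1.0, 0.0, 0.0, 0.0],
    "M": [0.0, 1.0, 1.0, 0.0],
    "L": [0.0, 0.0, 1.0, 1.0],
}

def cone_responses(spectrum):
    """Each cone response is the sensitivity-weighted sum of the spectrum."""
    return {
        name: sum(s * p for s, p in zip(sensitivity, spectrum))
        for name, sensitivity in CONES.items()
    }

spectrum_a = [1.0, 1.0, 2.0, 1.0]
spectrum_b = [1.0, 2.0, 1.0, 2.0]  # a physically different light

print(cone_responses(spectrum_a))
print(cone_responses(spectrum_b))
```

With only three cone classes sampling a spectrum that has more than three degrees of freedom, such pairs always exist.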
Ewald Hering (1834-1918):
-Why does red produce a greenish after-effect?
(and vice versa)
-Why does yellow produce a bluish after-effect?
(and vice versa)
-Why do we perceive the superposition of ‘basic’
colors as “white”?
-What does ‘white’ mean? (Is it a property of the
‘outside’ world, or a property of our perceptual machinery?)
Hering’s Theory:
-The visual system generates color signals in opponent pairs
(yellow-blue, red-green, white-black).
-At the time, it was seen by many to compete with the trichromatic
theory, but Hering held that both theories could be valid.
-We now know he was correct: the two theories simply describe
visual processes that occur at different levels. But it was not until
much later in the twentieth century that neural experiments
proved him correct.
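One way to see how the two theories can coexist at different levels is to treat the opponent pairs as sums and differences of the three cone signals. The sketch below uses a common textbook simplification; the specific weights are assumptions and are not taken from the lecture.

```python
def opponent_signals(l_cone, m_cone, s_cone):
    """Recombine three cone signals into three opponent channels.

    The weights are a simplified, hypothetical choice: each chromatic
    channel is zero when all three cones are equally excited.
    """
    return {
        "red_green": l_cone - m_cone,
        "yellow_blue": (l_cone + m_cone) / 2.0 - s_cone,
        "white_black": l_cone + m_cone + s_cone,
    }

# Equal excitation of all cones: no chromatic signal, only white-black.
print(opponent_signals(1.0, 1.0, 1.0))

# Extra L-cone excitation pushes the red-green channel toward "red".
print(opponent_signals(2.0, 1.0, 1.0))
```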
Color adaptation (‘after-effects’): a demo
The Atomistic Approach to Psychophysics:
the search for “atoms” of perception
Wilhelm Wundt
(1832–1920)
Limitations of the “atomistic” approach: Color
Color Constancy
Color Constancy: the tendency of surfaces to preserve
their perceived color even when the spectrum of the light
they send to the eye changes dramatically (because of a
change in the spectrum of the light illuminating them)
Limitations of the “atomistic” approach: Brightness
E.H. Adelson, MIT
The Atomistic Approach, take 2:
explaining visual perceptual phenomena
with independent filters / channels
Fergus Campbell
(1924-1993)
Detection Thresholds and Linear Systems Analysis
Graham N & Nachmias J (1971), Detection of grating patterns containing two
spatial frequencies: a comparison of single-channel and multiple-channels
models. Vision Research 11(3) 251-9.
(script 1)
[Figure panels: components; ‘single channel’ prediction; ‘multi-channel’ prediction]
Results:
• The appeal of the ‘multiple channel’ approach is tightly
linked to the expectation that the response of the system
to a ‘compound’ stimulus could be predicted from its
response to the constituent components (e.g., in the case
of a visual pattern, from the response to its Fourier
components).
• Such a system is called a linear system:
R(A + B + C …) = R(A) + R(B) + R(C) + …
• Another way to put it is that the channels are expected to
be non-interacting, i.e. that the response of one channel
(to its own component) does not depend on the input to
the other channels.
• How valid is this expectation [assumption] for sensation
and perception?
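The superposition condition R(A + B) = R(A) + R(B) can be tested directly on any candidate response function. The sketch below compares a weighted-sum channel with a channel that squares its output. Both functions, their names and the numbers are illustrative assumptions, not models from the lecture.

```python
def linear_channel(stimulus, weights):
    """Weighted sum of the stimulus samples."""
    return sum(w * x for w, x in zip(weights, stimulus))

def squaring_channel(stimulus, weights):
    """The same weighted sum followed by a squaring nonlinearity."""
    return linear_channel(stimulus, weights) ** 2

def obeys_superposition(response, a, b, weights, tol=1e-9):
    """Check whether R(a + b) equals R(a) + R(b) for one pair of stimuli."""
    compound = [x + y for x, y in zip(a, b)]
    predicted = response(a, weights) + response(b, weights)
    return abs(response(compound, weights) - predicted) <= tol

component_a = [1.0, 0.0, 2.0]
component_b = [0.0, 3.0, 1.0]
weights = [0.5, -1.0, 2.0]

print(obeys_superposition(linear_channel, component_a, component_b, weights))
print(obeys_superposition(squaring_channel, component_a, component_b, weights))
```

Passing the check for one pair does not prove linearity, but a single failure is enough to rule it out.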
Limitations of Linear Systems Approach:
the role of relative phase
Piotrowski LN & Campbell FW (1982), A demonstration of the visual importance
and flexibility of spatial-frequency amplitude and phase.
Perception 11(3) 337-46.
Why is relative phase so crucial to appearance?
Hint: what is the Fourier spectrum of an edge?
(and of 1/f noise?...)
And why is threshold-detection nonetheless linear???
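One way to explore the hint: a square-wave edge is built from odd harmonics whose amplitudes fall as 1/n and whose phases are aligned at the edge. The sketch below keeps that amplitude spectrum fixed and changes only the phases. The function name, the 50 harmonics and the 90-degree shift are illustrative choices.

```python
import math

def harmonic_sum(x, phase_shift, n_harmonics=50):
    """Sum odd harmonics with amplitudes 1/n; only the phases depend on phase_shift."""
    total = 0.0
    for j in range(n_harmonics):
        n = 2 * j + 1
        total += math.sin(n * x + phase_shift) / n
    return total

# One period sampled at 200 points.
xs = [2.0 * math.pi * t / 200 for t in range(200)]

aligned = [harmonic_sum(x, 0.0) for x in xs]          # phases aligned: square wave with sharp edges
shifted = [harmonic_sum(x, math.pi / 2) for x in xs]  # every component shifted by 90 degrees

# Same amplitude spectrum, very different waveforms.
print(round(aligned[50], 3), round(shifted[50], 3))
```

The two signals contain exactly the same power at every frequency, so any measure based on amplitude alone cannot tell them apart, yet only the phase-aligned one contains an edge.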
(script 2)
Different levels of visual processing
[(i) the boundaries are not 100% sharp; (ii) there is no universal agreement on definitions]
Low-level: processes that are achieved by an array of filters
that have relatively small receptive fields, tile the visual field
(w/ overlap), and are non- or minimally-interacting.
Examples: center-surround contrast detection in LGN;
orientation selectivity in V1.
Mid-level: processes that (i) group visual information about
surface fragments that are disjoint in space and/or time;
(ii) segment visual information into separate spatial and
temporal entities.
Example . . . . .
High-level: visual recognition processes; rely on prior
knowledge of specific objects or classes of objects (their
visual properties, semantic and/or lexical knowledge).
One Object or Two Sets of Lines?
More lines
(Adapted from Lorenceau and Shiffrar 1992)
Show All?
Will dots help?
Mid-level visual processing: revisit definition
Low-level: processes that are achieved by an array of filters that have relatively
small receptive fields, tile the visual field (w/ overlap), and are non- or minimally-interacting.
Examples: center-surround contrast detection in LGN; orientation selectivity in V1.
Mid-level: processes that (i) group visual information about
surface fragments that are disjoint in space and/or time*;
(ii) segment visual information into separate spatial and
temporal entities.
 Requires compilation of visual information from spatially
and/or temporally disparate sources.
a.k.a: “Perceptual Organization”; “Gestalt processing”; …
* Note: at the earliest stages of the visual pathway (i.e., the retina), even physically contiguous
surface portions may not be represented as a unitary entity (‘thing’), and therefore
an overall change in neural representation may need to occur in cortex.
High-level: visual recognition processes; rely on prior knowledge of specific
objects or classes of objects (their visual properties, semantic and/or lexical
knowledge).
The Gestalt Psychology Movement
(Wertheimer, Kohler, Koffka, …):
Perceptions are Gestalts*
* -- “a whole that is more than the sum of its parts”
put differently:
PERCEPTION IS FUNDAMENTALLY NON-LINEAR
(the atomistic approach is doomed)
Emphasis on “perceptual organization”
Sensory and Motor Systems (G80.2202)
Psychophysics of Early & Mid-level Vision
Part ii
Motion Integration and Segmentation: Plaids
(Wallach 1935, 1976; Adelson & Movshon 1982; Hupe & Rubin 2003)
Reminder: show diff Alphas
Edges in Motion:
Segmentation & integration in real-world images
(Rubin and Albert , VSS 2001)
Global Motion Processing:
Local velocity measurements are ambiguous …
… and do not convey veridical information about the object’s global motion.
“The aperture problem”
Marr & Ullman (1981)
… is present not only for straight lines:
?
It is really just a subset of …
“The correspondence problem”
Ullman (1979):
“The identity problem”
Wallach (1935, 1976):
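The ambiguity of a local measurement can be sketched in a few lines: for a straight contour seen through a small aperture, only the velocity component perpendicular to the contour is measurable. The function name `visible_motion` and the example velocities are hypothetical.

```python
import math

def visible_motion(velocity, contour_angle_deg):
    """Return the velocity component perpendicular to a straight contour.

    Motion along the contour leaves the image inside a small aperture
    unchanged, so only this normal component can be measured locally.
    """
    theta = math.radians(contour_angle_deg)
    normal = (-math.sin(theta), math.cos(theta))  # unit normal to the contour
    speed = velocity[0] * normal[0] + velocity[1] * normal[1]
    return (speed * normal[0], speed * normal[1])

# Two different object velocities behind a vertical contour (90 degrees).
seen_a = visible_motion((1.0, 0.0), 90.0)
seen_b = visible_motion((1.0, 5.0), 90.0)

print(seen_a, seen_b)
```

Both objects produce the same local measurement, so the true 2D velocity cannot be recovered from this one contour alone.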
1D Cues, 2D cues and 3D motion
1D and 2D motion cues
(From Pack et al. 2003)
Using short-bar stimuli and a reverse-correlation technique, Pack et al.
(2003) showed that the responses of end-stopped cells in V1 reliably signal
the 2D motion direction of a bar’s endpoints, regardless of its orientation
(i.e., these cells do not suffer from “the aperture problem”).
[Figure panels: end-stopped cell; non-end-stopped cell]
Back to Plaid Perception:
Motion Integration: “Coherency”
Motion Segmentation: “Transparency”
2D motion signals
1D motion signals
Back to ‘One Object, or Two Pairs of Lines?’
Scene Segmentation affects the assignment
of local motion cues as ‘intrinsic’ vs. ‘extrinsic’
(After Shimojo, Silverman & Nakayama, VR 1989)
(Demo adapted from Lorenceau & Shiffrar, VR 1992)
More lines
Show All?
Will dots help?