Basic distinctions Definitions Epstein (1965) familiar size experiment

advertisement
Integration of cues
•
•
•
•
•
Quick review of depth cues
Cue combination: Minimum variance
Cue combination: Bayesian
Nonlinear cue combination: Causal models
Statistical decision theory
Basic distinctions
Distance, depth, and 3D shape
cues
• Pictorial depth cues: familiar size, relative size,
brightness, occlusion, shading and shadows,
aerial/atmospheric perspective, linear perspective,
height within image, texture gradient, contour
• Other static, monocular cues: accommodation, blur,
[astigmatic blur, chromatic aberration]
• Motion cues: motion parallax, kinetic depth effect,
dynamic occlusion
• Binocular cues: convergence, stereopsis/binocular
disparity
• Cue combination
Definitions
What is the information?
How could one compute depth from it?
Do we compute depth from it?
What is learned: ordinal, relative, absolute depth,
depth ambiguities
• Distance: Egocentric distance, distance
from the observer to the object
• Depth: Relative distance, e.g., distance
one object is in front of another or in front
of a background
• Surface Orientation: Slant (how much) and
tilt (which way)
• Shape: Intrinsic to an object, independent
of viewpoint
Distance, depth, and 3D shape
cues
Epstein (1965) familiar size
experiment
• Types of depth cues
– Monocular vs. binocular
– Pictorial vs. movement
– Physiological
• Depth cue information
–
–
–
–
• Pictorial depth cues: familiar size, relative size,
[brightness], occlusion, shading and shadows,
aerial/atmospheric perspective, linear perspective,
height within image, texture gradient, contour
• Other static, monocular cues: accommodation, blur,
[astigmatic blur, chromatic aberration]
• Motion cues: motion parallax, kinetic depth effect,
dynamic occlusion
• Binocular cues: convergence, stereopsis/binocular
disparity
How far away
is the coin?
1
Monocular depth cues
Relative size as a cue to depth
Retinal projection
depends on size
and distance
Relative size as a cue to depth
Shading, reflection, and illumination
illumination
occlusion
Occlusion as a cue to depth
Shading – prior of light-from-above
reflection shading
2
Shading (flip the photo upside-down)
Dynamic Cast Shadows
Aerial/Atmospheric Perspective
Cast Shadows
Shading and contour
Geometry of Linear Perspective
Retinal projection
depends on size and
distance:
Size in the world
(e.g., in meters) is
proportional to size in
the retinal image (in
degrees) times the
distance to the object
3
Linear perspective
Texture
Size
constancy
Height Within the Image
1. Density
2. Foreshortening
3. Size
Distance, depth, and 3D shape
cues
• Pictorial depth cues: familiar size, relative size,
brightness, occlusion, shading and shadows,
aerial/atmospheric perspective, linear perspective,
height within image, texture gradient, contour
• Other static, monocular cues: accommodation, blur,
[astigmatic blur, chromatic aberration]
• Motion cues: motion parallax, kinetic depth effect,
dynamic occlusion
• Binocular cues: convergence, stereopsis/binocular
disparity
Monocular Physiological Cues
• Accommodation – estimate depth based
on state of accommodation (lens shape)
required to bring object into focus
• Blur – objects that are further or closer
than the accommodative distance are
increasingly blur
• Astigmatic blur
• Chromatic aberration
4
Distance, depth, and 3D shape
cues
Motion Parallax
• Pictorial depth cues: familiar size, relative size,
brightness, occlusion, shading and shadows,
aerial/atmospheric perspective, linear perspective,
height within image, texture gradient, contour
• Other static, monocular cues: accommodation, blur,
[astigmatic blur, chromatic aberration]
• Motion cues: motion parallax, kinetic depth effect,
dynamic occlusion
• Binocular cues: convergence, stereopsis/binocular
disparity
The Kinetic Depth Effect
Dynamic (Kinetic) Occlusion
Distance, depth, and 3D shape
cues
Vergence Angle As One
Binocular Source
• Pictorial depth cues: familiar size, relative size,
brightness, occlusion, shading and shadows,
aerial/atmospheric perspective, linear perspective,
height within image, texture gradient, contour
• Other static, monocular cues: accommodation, blur,
[astigmatic blur, chromatic aberration]
• Motion cues: motion parallax, kinetic depth effect,
dynamic occlusion
• Binocular cues: convergence, stereopsis/binocular
disparity
5
Vergence Angle As One
Binocular Source
Vergence Angle As One
Binocular Source
Disparity
Vergence Angle As One
Binocular Source
Binocular
disparity
Uncrossed disparity
Disparity
Zero retinal disparity
Crossed disparity
6
How to make a random-dot stereogram
Left eye image
x
A
B
Right eye image
A
y
B
Depth Cue Combination: Issues
1. How do you put all of the depth cue
information together?
2. What do you do when cues disagree?
A little … ?
errors
A lot … ?
3. How much weight should each cue
get?
When cues disagree …
Information Fusion Problem
Multiple sources of information,
possibly in error, possibly
contradictory
1:1
Linear Perspective
2:1
Relative size
6 feet
Accommodation
How combine the information into
a single judgment?
Rashomon
7
Optimal Cue Combination: Minimum Variance
Minimum-Variance Cue Combination
E(X i ) = µ1, E(X 2 ) = µ2
Variances: ! 22 " ! 12
Just use one cue?
Suppose we use a linear cue-combination rule:
X
X
(
)
( )
unbiased
( ) (
)
( )
2
Var X = w 2Var X 1 + 1! w Var X 2
weighted linear
combination
= w1X 1 + w 2 X 2
(
= wX 1 + 1! w X 2
(
)
2
= w 2" 12 + 1! w " 22
)
minimize
E #$ X %& = w1E #$ X 1 %& + w 2E #$ X 2 %& = w1 + w 2 µ
unbiased?
Var ( X ) = w 2! 12 + (1 " w ) ! 22
2
Minimum-Variance Cue Combination
(
)
Var[X]
X = wX 1 + 1! w X 2
( )
( ) (
)
2
( )
Var X = w 2Var X 1 + 1! w Var X 2
Choose w to minimize variance:
w=
w
1/ " 12
1/ " 12 + 1/ " 22
Perturbation Methodology and
Influence Measures
Reparamerization
Define reliability ri = ! i"2
X = w1X 1 + w 2 X 2
w=
1/ ! 12
1/ ! 12 + 1/ ! 22
r = r1 + r2
weight proportional to reliability
=
r1
r1 + r2
reliabilities add
How can we measure the influence of various
cues on perceptual judgments in complex scenes?
Goal: Change the stimulus as little as we possibly
can.
8
Perturbation Method
Example: Texture and Motion
Observer adjusts ...
Two Depth Cues
Perturbation Method
Perturbation Method
Observer adjusts ...
One possibility ...
?
Perturbed
Cue ….
Two Depth Cues
Two Depth Cues
Perturbation Method
Observer is using only
the perturbed cue.
Perturbed
Cue ….
Perturbation Method
One possibility ...
Another possibility ...
Perturbed
Cue ….
Two Depth Cues
Perturbed
Cue ….
Two Depth Cues
9
Perturbation Method
Observer is not using
the perturbed cue at all.
Perturbation Method
Another possibility ...
A final possibility ...
Perturbed
Cue ….
Two Depth Cues
Two Depth Cues
Perturbation Method
We will measure the
influence of the cue
on the observer’s setting.
Perturbed
Cue ….
Influence Measures
Change in
observer’s setting
A final possibility ...
I cue =
Perturbed
Cue ….
Two Depth Cues
Texture and Motion: Data
! setting
! cue
Influence of
the cue
Perturbation of
the cue
Optimal Cue Combination: Bayesian
Compute posterior:
p(depth | x1, x2 ) =
p(x1, x2 | depth)p(depth)
p(x1, x2 )
Assume conditional independence:
p(depth | x1, x2 ) ! p(x1 | depth)p(x2 | depth)p(depth)
If likelihoods and prior are Gaussian, so is
posterior, and means and reliabilities are as in
minimum-variance case. Prior acts like a static cue.
10
Optimal Cue Combination: Bayesian
Optimal Cue Combination
p(depth | x1, x2 ) ! p(x1 | depth)p(x2 | depth)p(depth)
Depending on cost function and priors, choose:
ML: Maximum-likelihood estimator
MAP: Maximum a posteriori estimator
Mean of the posterior
Median of the posterior
Etc.
Rock & Victor (1964)
Visual/Haptic Setup
View object through distorting lens while
exploring object haptically
Irv Rock
Visual
capture
Visually and haptically specified shapes differed.
What shape is perceived?
Visual Capture ?
Why should vision be the “gold standard”
all other modalities are compared to?
Visual Capture ?
Why should vision be the “gold standard”
all other modalities are compared to?
Weights
Weights
!2
wV = 2 H 2
!V +! H
wV =
Variance
SVH = wV SV + w H S H
1
1
1
= 2 + 2
2
! VH
!V ! H
! 2H
! V2 + ! 2H
Variance
SVH = wV SV + w H S H
1
1
1
= 2 + 2
2
! VH
!V ! H
11
Visual Capture ?
2-IFC Task
Why should vision be the “gold standard”
all other modalities are compared to?
Standard
Comparison
Weights
wV =
! 2H
2
! V + ! 2H
Variance
SVH = wV SV + w H S H
no feedback!
Visual-Haptic
1
1
1
= 2 + 2
2
! VH
!V ! H
12
Individual Differences
1
HTE
0% noise
133% noise
Empirical Visual Weight
0.8
RSB
JWW
MOE
0.6
JWW RSB
0.4
KML
LAS
KML
MOE
0.2
HTE
LAS
0
0.2
0.4
0.6
0.8
1
Predicted Visual Weight
Nonlinear Cue Combination: Causal models
Nonlinear Cue Combination: Causal models
Visual Auditory combination (Ventriloquist effect)
Traditional Bayesian model
Infer
Generate
Modeling: Where do cues come
from?
13
What would happen now?
Mixture model
or
Obviously there may be more than one source.
p(causal model)
Independent causes: where is
the auditory stimulus
• Using Bayes rule:
Best estimate
Common cause: where is
the auditory stimulus
Mean squared error estimate
Best estimate
Best estimate
14
Experimental test
Measured gain
Button: common cause
or two
Data
Kording et al
Sato et al, 2007
Wallace et al. 2005; Hairston et al. 2004
Predicting the variance
How can the gain be negative?
perceived
visual
stimulus
Standard Deviation
perceived
auditory
stimulus
10
0
-15
0
15
Spatial Disparity (deg.)
Worse prediction if we assume model selection
Statistical Decision Theory
The Three Elements of SDT
David Blackwell
W
A
John von Neumann
X
Abraham Wald
= {w1,w 2 ,!,w m }
= {a1, a2 ,!, ap }
= {x1, x2 ,!, xn }
possible states of the world
possible actions
possible sensory events
X ! f (x;! )
Oskar Morgenstern
1954
15
The Three Functions of SDT
The Three Functions of SDT
α,
in
ga
α,
in
ga
G(
ho
ho
G(
w|
e li
e li
x)
lik
w|
lik
od
od
decision
A
on
pti
rce )
x
Pe
L(
L(
w)
W
w)
W
X
d (x)
Fig. 1
decision
A
X
d (x)
Fig. 1
The Three Functions of SDT
The Three Functions of SDT
W
nc
ue
G(
in
ga
α,
w)
od
Co
ns
eq
in
ga
α,
G(
ho
Action
Fig. 1
e li
od
A
w|
ho
X
lik
e li
decision
d (x)
L(
w|
lik
A
on
pti
rce )
x
Pe
on
pti
rce )
x
Pe
L(
w)
e
W
decision
X
d (x)
Action
Fig. 1
d:X !A
EG( d,w2 )
Goal: select
1
3
5
7
2
4
6
8
EG( d,w1 )
16
Partial ordering: dominance
1
3
5
7
2
4
1
EG( d,w2 )
EG( d,w2 )
Admissible rules
3
5
7
2
6
4
6
8
8
EG( d,w1 )
EG( d,w1 )
Mixture rules
1
Mixtures of 7 and 8
3
5
7
2
4
EG( d,w2 )
EG( d,w2 )
Mixture rules
1
3
All possible mixtures
rules
All possible admissible
rules
5
7
2
6
4
8
6
8
EG( d,w1 )
EG( d,w1 )
Aside: SDT
Maximin Criterion
EG( d,w2 )
Partial ordering of rules
d: X  A
The Maximin
Rule(s)
EG( d,w1 )
17
Maximin Criterion
Bayesian Decision Theory
W
The Maximin
Rule(s)
! (w )
w)
in
eq
ns
ga
α,
Co
G(
Dr Evil
A
on
d
a ti
oo
o rm
lih
X)
lik e
In f
w|
o ry
L(
EG( d,w2 )
ns
ue
nc
Se
e
prior
decision
X
d (x)
“Cognition”
EG( d,w1 )
Bayesian Decision Theory
The Bayes Rule(s)
n
EBR (d ) = " EG (d ,w i )! (w i )
i =1
The rule d* that maximizes Bayes Risk
Is the Bayes Rule (it need not be unique).
EG( d,G)
We add a prior probability distribution on
The state of the world and define the
Bayes Risk of each rule.
(! (R ),! (G ))
EG( d,R)
Bayesian Decision Theory (Continuous Form)
Maximize expected Bayes gain
EBG (d ) = "" G (d (x ),w ) L (w | x ) ! (w ) dx dw
Bayesian Decision Theory
Geisler (1989)
"" G (d (x ),w ) L (w | x ) ! (w ) dx dw
by choice of a decision rule
d:X !A
18
Bayesian Decision Theory
Geisler (1989)
A choice
Normative
"" G (d (x ),w ) L (w | x ) ! (w ) dx dw
Knill & Richards (1996)
Geisler (1989)
Descriptive (process)
How test?
Perception as Bayesian
Inference
Optimal Cue Combination
Individual Differences
1
HTE
0% noise
133% noise
Empirical Visual Weight
0.8
RSB
JWW
MOE
0.6
JWW RSB
0.4
KML
KML
LAS
MOE
0.2
HTE
LAS
0
0.2
0.4
0.6
0.8
1
Predicted Visual Weight
What is the gain/loss function?
Quadratic loss (least squares)
Process model is weighted
linear combination minimizing
quadratic loss
S = w HSH + wV Sv
A weak test of BDT …
19
Bayesian Decision Theory
Geisler (1989)
"" G (d (x ),w ) L (w | x ) ! (w ) dx dw
loss function?
All that
really
matters …
Knill & Richards (1996)
Can the visuo-motor system,
presented with arbitrary gain
functions, select decision rules
that maximize expected gain?
Direct manipulation of gain/loss
function
loss function?
loss function!
Motor Timing Experiment:
400
450
500
550
600
650
700
Strong test of BDT
Practice phase:
400
450
500
550
600
650
700
Time bar
Target
20
Practice phase:
400
450
500
550
600
Practice phase:
650
700
400
450
500
550
600
650
700
650
700
Produce a movement to target
Of duration 650 msec
Practice phase:
400
450
500
550
600
Practice phase:
650
700
Try again!
400
450
500
550
600
Better!
Calibration SD’s
Experiment: Main Task
Subject HT
SD (ms)
-15
400
450
500
+5
550
600
650
0
700
Make money ….
Reach target time (ms)
21
EG (t0 ) = pG (t0 )G + pR (t0 )R
EG
Experiment: Main Task
How to maximize expected gain?
-15
400
450
+5
500
550
600
650
0
t0 movement strategy S
0
G
700
L
t
EG
EG (t0 ) = pG (t0 )G + pR (t0 )R
EG
EG (t0 ) = pG (t0 )G + pR (t0 )R
0
0
t0
G
L
G
t
L
t
EG (t0 ) = pG (t0 )G + pR (t0 )R
EG
Note SD
t0
Configurations
-15
+5
400
0
t0
450
500
550
0
650
+5
450
500
550
-15
L
600
700
-15
2.
400
G
0
1.
optimum
600
650
700
0 +5
0
3.
400
t
450
500
550
600
0
650
+5
700
0
-15
4.
400
450
500
550
600
650
700
22
HT
5
646 ms
697 ms
0
Expected Gain
Expected Gain
5
HT
646 ms
697 ms
0
699.6 ms
-5
300 400 500 600 700 800 900 1000
-5
300 400 500 600 700 800 900 1000
time (ms)
time (ms)
5
5
HT
646 ms
612 ms
0
Expected Gain
Expected Gain
HT
-5
612 ms
0
613.6 ms
-5
300 400 500 600 700 800 900 1000
300 400 500 600 700 800 900 1000
time (ms)
HT
time (ms)
5
646 ms
666 ms
0
Expected Gain
Expected Gain
5
646 ms
HT
646 ms
666 ms
0
678.2 ms
-5
300 400 500 600 700 800 900 1000
-5
300 400 500 600 700 800 900 1000
time (ms)
time (ms)
23
HT
5
646 ms
Expected Gain
Expected Gain
5
631 ms
0
HT
646 ms
631 ms
0
647.4 ms
-5
300 400 500 600 700 800 900 1000
-5
300 400 500 600 700 800 900 1000
time (ms)
time (ms)
Observed reach time (ms)
Summary
700
Subjects chose movements whose
mean time came close to maximizing
expected gain.
No patterned deviations.
650
600
600
650
700
Optimal reach time (ms)
Bayesian Decision Theory
"" G (d (x ),w ) L (w | x ) ! (w ) dx dw
Bayesian Decision Theory
"" G (d (x ),w ) L (w | x ) ! (w ) dx dw
24
Conclusions
Gain/loss functions are problems posed by
the environment to the organism
They are an useful as independent
variables in exploring visuo-motor function
Their manipulation allows us to test
performance against ideal in a wide range
of economic games
25
Download