Integration of cues
• Quick review of depth cues
• Cue combination: minimum variance
• Cue combination: Bayesian
• Nonlinear cue combination: causal models
• Statistical decision theory

Basic distinctions
• Types of depth cues
  – Monocular vs. binocular
  – Pictorial vs. movement
  – Physiological
• Depth cue information
  – What is the information?
  – How could one compute depth from it?
  – Do we compute depth from it?
  – What is learned: ordinal, relative, or absolute depth, and what depth ambiguities remain

Distance, depth, and 3D shape cues
• Pictorial depth cues: familiar size, relative size, brightness, occlusion, shading and shadows, aerial/atmospheric perspective, linear perspective, height within image, texture gradient, contour
• Other static, monocular cues: accommodation, blur, [astigmatic blur, chromatic aberration]
• Motion cues: motion parallax, kinetic depth effect, dynamic occlusion
• Binocular cues: convergence, stereopsis/binocular disparity
• Cue combination

Definitions
• Distance: egocentric distance, the distance from the observer to the object
• Depth: relative distance, e.g., the distance one object is in front of another or in front of a background
• Surface orientation: slant (how much) and tilt (which way)
• Shape: intrinsic to an object, independent of viewpoint

Epstein (1965) familiar size experiment
• How far away is the coin?

Monocular depth cues (demonstrations)
• Relative size as a cue to depth: retinal projection depends on both size and distance
• Shading, reflection, and illumination (illumination, shading, reflection, occlusion)
• Occlusion as a cue to depth
• Shading, with a prior of light-from-above (flip the photo upside-down and the perceived relief reverses)
• Cast shadows and dynamic cast shadows
• Aerial/atmospheric perspective
• Shading and contour

Geometry of Linear Perspective
Retinal projection depends on size and distance: size in the world (e.g., in meters) is proportional to size in the retinal image (in degrees) times the distance to the object (see the short numerical sketch at the end of this review). Further demonstrations: linear perspective, texture, size constancy, height within the image.

Texture gradient components:
1. Density
2. Foreshortening
3. Size
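As a complement to the linear-perspective geometry above, here is a minimal numerical sketch of the size/distance relation (small-angle approximation) and of how the familiar-size cue inverts it; the function names and the coin/plate example values are illustrative assumptions, not numbers from the slides.

```python
import math

def physical_size(retinal_angle_deg: float, distance_m: float) -> float:
    """Small-angle approximation: size in the world is (approximately) the
    retinal size, converted from degrees to radians, times the distance."""
    return math.radians(retinal_angle_deg) * distance_m

def distance_from_familiar_size(known_size_m: float, retinal_angle_deg: float) -> float:
    """Familiar-size cue: if the object's true size is known, the same
    relation can be inverted to estimate egocentric distance."""
    return known_size_m / math.radians(retinal_angle_deg)

# Two objects subtending the same 0.5-deg visual angle: the object assumed
# to be larger must be farther away.
print(distance_from_familiar_size(0.024, 0.5))  # a coin (~24 mm) -> ~2.7 m
print(distance_from_familiar_size(0.22, 0.5))   # a dinner plate (~22 cm) -> ~25 m
```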
Monocular Physiological Cues
• Accommodation: estimate depth from the state of accommodation (lens shape) required to bring the object into focus
• Blur: objects farther or closer than the accommodative distance are increasingly blurred
• Astigmatic blur
• Chromatic aberration

Motion cues
• Motion parallax
• The kinetic depth effect
• Dynamic (kinetic) occlusion

Binocular cues
• Vergence angle as one binocular source of depth information
• Binocular disparity: uncrossed disparity, zero retinal disparity, crossed disparity

Depth Cue Combination: Issues
When cues disagree, a little or a lot:
1. How do you put all of the depth cue information together?
2. What do you do when cues disagree?
3. How much weight should each cue get?
[Figure: example cue-conflict stimulus; conflicting linear perspective, relative size, and accommodation cues (labels 1:1, 2:1, 6 feet)]

Optimal Cue Combination: Minimum Variance
The information fusion problem: multiple sources of information, possibly in error, possibly contradictory (cf. Rashomon). How should they be combined into a single judgment? Just use one cue?
Suppose both cues provide unbiased estimates of the same quantity, E(X₁) = E(X₂) = μ, with variances σ₂² ≥ σ₁², and we use a linear cue-combination rule:
X = w₁X₁ + w₂X₂, so E[X] = w₁E[X₁] + w₂E[X₂] = (w₁ + w₂)μ,
which is unbiased provided w₁ + w₂ = 1.

Minimum-Variance Cue Combination
Reparameterize as X = wX₁ + (1 − w)X₂. Then
Var(X) = w²Var(X₁) + (1 − w)²Var(X₂) = w²σ₁² + (1 − w)²σ₂².
Choosing w to minimize this variance gives
w = (1/σ₁²) / (1/σ₁² + 1/σ₂²).
Defining the reliability of a cue as rᵢ = 1/σᵢ², the optimal weight is w = r₁/(r₁ + r₂), and the combined reliability is r = r₁ + r₂: reliabilities add.

Cue Combination for Estimation
Each cue's weight is proportional to its reliability.

Combining Sensory Estimates
• Weighted average, e.g., of haptic and visual size estimates, Ŝ = w_H Ŝ_H + w_V Ŝ_V, or of depth maps from several cues (e.g., stereo, motion, texture), D(x,y) = w_s D_s(x,y) + w_m D_m(x,y) + w_t D_t(x,y) + …, with Σᵢ wᵢ = 1
• Optimal weights for independent cues: w_H = r_H/(r_H + r_V), w_V = r_V/(r_H + r_V)
• Combined variance: σ²_HV = σ²_H σ²_V / (σ²_H + σ²_V), i.e., r_HV = r_H + r_V
• The variance of the combined estimate is lower than the variance of either single-cue estimate
[Figure: haptic (Ŝ_H), visual (Ŝ_V), and combined likelihoods over size, 47–53 mm]
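A minimal numerical sketch of the reliability-weighted (minimum-variance) rule above; the function name and the example cue values are illustrative assumptions, not data from the slides.

```python
import numpy as np

def combine_cues(estimates, variances):
    """Reliability-weighted (minimum-variance) linear cue combination.
    Weights are proportional to reliability r_i = 1/sigma_i^2 and sum to 1;
    the combined reliability is the sum of the single-cue reliabilities."""
    estimates = np.asarray(estimates, dtype=float)
    reliabilities = 1.0 / np.asarray(variances, dtype=float)
    weights = reliabilities / reliabilities.sum()
    combined_estimate = np.dot(weights, estimates)
    combined_variance = 1.0 / reliabilities.sum()
    return combined_estimate, combined_variance, weights

# Example: a visual size cue (50 mm, variance 1) and a haptic cue (53 mm, variance 4).
s_hat, var, w = combine_cues([50.0, 53.0], [1.0, 4.0])
print(s_hat)  # 50.6 mm -- pulled toward the more reliable (visual) cue
print(var)    # 0.8 -- lower than either single-cue variance
print(w)      # [0.8, 0.2]
```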
Perturbation Methodology and Influence Measures
How can we measure the influence of various cues on perceptual judgments in complex scenes? Goal: change the stimulus as little as we possibly can.

Perturbation Method (example: texture and motion)
The observer adjusts a two-cue comparison stimulus to match a stimulus in which one cue has been perturbed. Several outcomes are possible:
• One possibility: the observer is using only the perturbed cue.
• Another possibility: the observer is not using the perturbed cue at all.
• A final possibility: the setting lies in between; we measure the influence of the cue on the observer's setting.

An Experimental Paradigm: Perturbation Analysis
The observer's cue weights can be estimated. The stimulus comparison: a consistent-cue stimulus (Cue1 = d, Cue2 = d) is matched against a perturbed stimulus (Cue1 = d₁, Cue2 = d₂ = d₁ + Δd). If the stimuli match when the consistent stimulus is set to depth d, the weight of the perturbed cue is
w = (d − d₁)/(d₂ − d₁) = Δ(setting)/Δ(cue).

Influence Measures
Influence of the cue = change in the observer's setting / perturbation of the cue.
[Figure: texture and motion perturbation data]

Optimal Cue Combination: Bayesian
Compute the posterior:
p(depth | x₁, x₂) = p(x₁, x₂ | depth) p(depth) / p(x₁, x₂).
Assume conditional independence of the cues:
p(depth | x₁, x₂) ∝ p(x₁ | depth) p(x₂ | depth) p(depth).
If the likelihoods and prior are Gaussian, so is the posterior, and the means and reliabilities combine as in the minimum-variance case; the prior acts like a static cue (a small numerical sketch of this Gaussian case appears after the Landy & Kojima results below).
Depending on the cost function and priors, choose:
• ML: maximum-likelihood estimator
• MAP: maximum a posteriori estimator
• Mean of the posterior
• Median of the posterior
• Etc.

Rock & Victor (1964)
View an object through a distorting lens while exploring the object haptically: the visually and haptically specified shapes differed. What shape is perceived? Visual capture: the percept follows vision. But why should vision be the "gold standard" to which all other modalities are compared? The cue-combination account suggests the weights should follow the cue variances (reliabilities) rather than a fixed visual dominance.

2-IFC task
Standard vs. comparison interval, no feedback!
[Figure: visual/haptic setup; stimuli with added visual noise (0% vs. 133%)]

Visual-Haptic Individual Differences
[Figure: empirical vs. predicted visual weight for observers HTE, RSB, JWW, MOE, KML, LAS]

Landy & Kojima (2001)
Task: vernier judgment. Cue 1: spatial frequency. Cue 2: orientation.

Texture: Results and Fitted Reliabilities
[Figures: PSE vs. cue perturbation Δcue (arc min) for observers MJY and MSL, Expts. 1 and 2, with σ_o and σ_f (or σ_c) set to 9 or 36 arc min; fitted reliabilities for each noise condition]
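Referring back to the Bayesian combination rule above, here is a minimal sketch of the Gaussian case in which the prior acts like one more static cue; the function name and the example numbers are illustrative assumptions.

```python
import numpy as np

def gaussian_posterior(cue_means, cue_variances, prior_mean, prior_variance):
    """Bayesian cue combination with conditionally independent Gaussian
    likelihoods and a Gaussian prior. The posterior is Gaussian, its
    reliability is the sum of cue and prior reliabilities, and the prior
    enters exactly like one more (static) cue."""
    means = np.append(np.asarray(cue_means, dtype=float), prior_mean)
    reliabilities = 1.0 / np.append(np.asarray(cue_variances, dtype=float), prior_variance)
    posterior_variance = 1.0 / reliabilities.sum()
    posterior_mean = posterior_variance * np.dot(reliabilities, means)
    return posterior_mean, posterior_variance

# Example: two depth cues plus a broad prior centered on zero depth.
mu, var = gaussian_posterior(cue_means=[10.0, 14.0], cue_variances=[4.0, 8.0],
                             prior_mean=0.0, prior_variance=100.0)
print(mu, var)  # posterior mean (also the MAP estimate) shrunk slightly toward the prior
```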
Demo: Landy/Kojima psychophysical task

Influence of priors: Mamassian & Landy (1998, 2001)
Stimuli: ridge patterns defined by contour and shading, shown at 24 orientations plus mirror reflections (e.g., 0, 15, 30, 45 deg). The proportion of trials on which the narrow strips appear raised (the "thin score"), plotted against stimulus orientation, reveals prior constraints on viewpoint and light direction. Varying the shading contrast (e.g., 0.05–0.2) and the contour contrast (e.g., 0.1–0.4) changes the relative weight given to the two cues and to their associated priors.
[Figures: thin score vs. orientation (deg) for each combination of shading and contour contrast; fitted viewpoint and light-direction constraints]

Cost functions
We have touched on two of the three elements of Bayesian estimation and Bayesian decision-making: the likelihood and the prior. What about the third element, the cost function?

Typical Task for Decision-Making Under Risk
Would you rather have (A) $480, or (B) a 50-50 chance at $1,000? This is a choice between "lotteries", where a lottery is a list of potential outcomes and their respective probabilities of occurrence, e.g., (0.5, $0; 0.5, $1,000). Typically, people choose A, showing risk aversion for gains; they also show risk-seeking behavior for losses, along with many other "sub-optimal" behaviors. That is, they do not simply maximize expected gain.

An Implicit (Motor) Decision Task
World Cup 2002, semifinal: South Korea vs. Germany (Ballack).

Experimental Task (rapid pointing under risk)
• The task: rapidly touch a point on the screen with your fingertip.
• Start of trial: display of a fixation cross (1.5 s).
• Display of the response area (114.2 mm × 80.6 mm), 500 ms before target onset.
• Target display: a green target circle and a red penalty circle; respond within 700 ms.
• If the green target is hit: +100 points. If the red target is hit: −500 points. Scores add if both targets are hit.
• If the screen is hit later than 700 ms after target display ("you are too slow"): −700 points.
• The running score is displayed (e.g., current score: 500), then the trial ends.
• What should you do?
[Figure: example stimulus with overlapping green and red circles, 18 mm]

Thought Experiment
Target: 100 points (2.5 ¢); penalty: −500 points; motor variability σ = 4.83 mm. Where should you aim? One candidate aim point, whose end-point scatter frequently lands in the penalty circle, yields only about −0.3 points per trial; other aim points yield roughly 30.7, 25.5, or 22.6 points per trial. The best aim point is displaced away from the penalty region, but not so far that the target is often missed.
[Figure: simulated end points (x, y in mm) and per-trial scores for several candidate aim points]
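The expected score for a candidate mean end point can be computed by integrating the score over the distribution of end points. Below is a minimal Monte Carlo sketch of that thought experiment; the circle geometry (radius 9 mm, a penalty circle abutting the target) and the candidate shifts are illustrative assumptions, while the point values and σ follow the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

def expected_score(aim, target_center, penalty_center, radius=9.0,
                   sigma=4.83, gain=100, loss=-500, n=200_000):
    """Monte Carlo estimate of expected points per trial for a given mean
    end point, assuming isotropic Gaussian motor noise (sd sigma, in mm).
    Scores add when an end point lands inside both circles."""
    end_points = np.asarray(aim, dtype=float) + rng.normal(0.0, sigma, size=(n, 2))
    in_target = np.linalg.norm(end_points - target_center, axis=1) < radius
    in_penalty = np.linalg.norm(end_points - penalty_center, axis=1) < radius
    return np.mean(gain * in_target + loss * in_penalty)

# Penalty circle centered 9 mm to the left of the target (overlapping circles).
target, penalty = np.array([0.0, 0.0]), np.array([-9.0, 0.0])
for shift in (0.0, 3.0, 6.0, 9.0):  # candidate rightward shifts of the aim point (mm)
    print(shift, expected_score([shift, 0.0], target, penalty))
```

Running the loop shows the trade-off in the slides: aiming at the target center loses points to penalty hits, while an intermediate rightward shift maximizes the expected score before target misses start to dominate.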
Expected gain as a function of mean movement end point (x, y)
[Figure: expected points per trial as a function of the mean movement end point (in mm), for different penalty values]

Experiment: Movement Under Risk
• 4 stimulus configurations (varied within block); target and penalty circle radius R = 9 mm
• 2 penalty conditions: 0 and −500 points (varied between blocks)
• Practice session: 300 trials with a decreasing time limit
• 1 session of data collection: 360 trials, giving 24 data points per condition
• Measured: movement end points in response to novel stimulus configurations
Trommershäuser, Maloney & Landy (2003). JOSA A, 20, 1419.

Results
Comparison with experiment: observed mean end points for the penalty = 0 and penalty = −500 conditions, against the model's optimal end points for penalty = −500 (e.g., subject S5, σ = 2.99 mm).
[Figure: mean end points, experiment vs. model, per configuration and penalty condition]

Summary: Movement Under Risk
• Subjects' movement end points match those that, given their own motor variability and the experimenter-imposed task conditions and risk, maximize expected gain.
• Subjects appear to do this even when confronting novel configurations, from the first trial, with no apparent learning.
• Subjects effectively take their own motor variability into account in planning movements.
Trommershäuser, Maloney & Landy (2003). JOSA A, 20, 1419.

Estimating the human cost function: Körding & Wolpert (2004)
The experiment: the subject points from a starting sphere toward a target line; visual feedback about the hand position (x_hand) is a cloud of dots ("peas") whose distribution is skewed, with the skew controlled by a parameter λ (e.g., λ ≈ 0.2, 0.3, 0.5, 0.8). How the subject places this skewed distribution relative to the target reveals the loss function being minimized.
Possible loss functions: absolute error loss, squared error loss, inverted Gaussian. Each predicts that a different statistic of the feedback distribution should be aligned with the target, and a nonparametric fit recovers the loss function directly from the settings.
[Figures: (A) the pointing task with starting sphere, finger, x_hand, appearing peas, and target line; (B) construction of the skewed feedback distribution (example λ = 0.3); (C) candidate loss functions plotted against error (cm); (D) feedback distributions and their optimal means under each loss function]
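To make the logic of the loss-function estimate concrete, the sketch below (a hedged illustration under assumed values, not the authors' analysis code: the gamma-shaped error distribution and the 0.4 cm width of the inverted Gaussian are arbitrary choices) shows how absolute-error, squared-error, and inverted-Gaussian losses pick out different points of a skewed distribution to align with the target.

```python
import numpy as np

rng = np.random.default_rng(1)

def best_alignment_point(samples, loss, grid=np.linspace(-2.0, 2.0, 401)):
    """Find the point t (in cm) of the feedback distribution that minimizes
    the expected loss E[loss(x - t)]; aiming so that t lands on the target is
    then optimal. Squared error -> mean, absolute error -> median,
    inverted Gaussian -> a robust, mode-like statistic."""
    expected = [np.mean(loss(samples - t)) for t in grid]
    return grid[int(np.argmin(expected))]

# A skewed error distribution (zero mean), standing in for the skewed pea clouds.
samples = rng.gamma(shape=2.0, scale=0.5, size=100_000) - 1.0

losses = {
    "absolute error": np.abs,
    "squared error": np.square,
    "inverted Gaussian (width 0.4 cm)": lambda e: 1.0 - np.exp(-e**2 / (2 * 0.4**2)),
}
for name, loss in losses.items():
    print(name, best_alignment_point(samples, loss))
```

Because the distribution is skewed, the three losses yield clearly different alignment points (mean, median, and a mode-like value), which is exactly what lets the subjects' settings discriminate among candidate cost functions.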