Learning to Perceive Transparency from the Statistics of Natural Scenes
Anat Levin
School of Computer Science and Engineering, The Hebrew University of Jerusalem
Joint work with Assaf Zomet and Yair Weiss

Transparency
Two layers: I = I_1 + I_2.   One layer: I = I_1 + 0.
How does our visual system choose the right decomposition of I(x,y) = I_1(x,y) + I_2(x,y)?
• Why not the "simpler" one-layer solution?
• Which two layers, out of infinitely many possibilities?

Talk Outline
• Motivation and previous work
• Our approach
• Results and future work

Transparency in the real world
"Fashion Planet's photographers have spent the last five years working to bring you clean photographs of the windows on New York, especially without the reflections that usually occur in such photography."
http://www.fashionplanet.com/sept98/features/reflections/home

Transparency and shading
Transparency: the image is a sum of two layers, I(x,y) = I_1(x,y) + I_2(x,y).
Shading: the image is a product of illumination and reflectance, I(x,y) = L(x,y) \cdot R(x,y).

Transparency in human vision
One layer vs. two layers:
• Metelli's conditions (Metelli 74)
• T-junctions, X-junctions, doubly reversing junctions (Adelson and Anandan 90, Anderson 99)
It is not obvious how to apply such "junction catalogs" to real images.

Transparency from multiple frames
• Two frames taken through a polarizer, separated with ICA (Farid and Adelson 99, Zibulevsky 02)
• Multiple frames with specific motions (Irani et al. 94, Szeliski et al. 00, Weiss 01)

Shading from a single frame
I(x,y) = L(x,y) \cdot R(x,y)
• Retinex (Land and McCann 71)
• Color (Drew, Finlayson and Hordley 02)
• Learning approach (Tappen, Freeman and Adelson 02)

Talk Outline (recap) — next: our approach.

Our Approach
Decomposing I(x,y) = I_1(x,y) + I_2(x,y) is an ill-posed problem. We assume probability distributions Pr(I_1) and Pr(I_2) and search for the most probable decomposition (in effect, ICA with a single microphone).

Statistics of natural scenes
[Figure: a natural input image, the histogram of its horizontal derivative (dx), and the corresponding log histogram.]

Statistics of derivative filters
[Figure: log histogram of derivative filter outputs, compared with a Gaussian (-x^2), a Laplacian (-|x|), and -|x|^{1/2}.]
Generalized Gaussian distribution (Mallat 89, Simoncelli 95):
  p(x) \propto e^{-|x/s|^\alpha},  \alpha < 1

Is sparsity enough?
[Figure: the same input shown as a sum of two layers, versus the single-layer alternative.]
No: exactly the same derivatives exist in the single-layer solution as in the two-layer solution (when the layers' edges do not overlap, each pixel's derivative comes entirely from one layer, so both solutions contain the same set of derivative values).

Beyond sparseness
• Higher-order statistics of filter outputs (e.g. Portilla and Simoncelli 2000).
• Marginals of more complicated feature detectors (e.g. Zhu and Mumford 97; Della Pietra, Della Pietra and Lafferty 96).

Corners and transparency
[Figure: a sum of two images creates new corners where the edges of the two layers cross.]
• In typical images, edges are sparse.
• Adding typical images is therefore expected to increase the number of corners.
• This is not true for white noise.

Harris-like operator
  c(x_0, y_0) = \det\!\left( \sum_{x,y} w(x,y) \begin{pmatrix} I_x^2(x,y) & I_x(x,y)\,I_y(x,y) \\ I_x(x,y)\,I_y(x,y) & I_y^2(x,y) \end{pmatrix} \right)

Corner histograms
Fitting generalized Gaussians to the histograms gives typical exponents for natural images:
• Derivative filter:  p(x) \propto e^{-|x/s_1|^\alpha},  \alpha \approx 0.7
• Corner operator:   p(x) \propto e^{-|x/s_2|^\beta},  \beta \approx 0.2
• s_1 / s_2 \approx 1

Simple prior for transparency prediction
The log probability of a decomposition I = I_1 + I_2 is the sum of the two layers' log probabilities:
  \log P(I_x, I_y) = \log P(I_{1x}, I_{1y}) + \log P(I_{2x}, I_{2y})
where, for each layer,
  \log P(I_x, I_y) = -\log Z - \sum_{x,y} \left( \left| \tfrac{I_x(x,y)}{s_1} \right|^\alpha + \left| \tfrac{I_y(x,y)}{s_1} \right|^\alpha + \left| \tfrac{c(x,y)}{s_2} \right|^\beta \right)
with \alpha = 0.7, \beta = 0.2, s_1/s_2 = 1.

Does this predict transparency?
[Figure: example images with the cost of the one-layer interpretation (I = I_1) compared against two-layer interpretations (I = I_1 + I_2) under the prior above.]

How important are the statistics?
The same cost, with \alpha = 0.7, \beta = 0.2, s_1/s_2 = 1.
Is it important that the statistics are non-Gaussian? Would any cost that penalizes high gradients and corners work?
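As a concrete handle on the cost from the preceding slides, here is a minimal sketch (an illustration for these notes, not the authors' code) of evaluating the per-layer cost \sum_{x,y} |I_x|^\alpha + |I_y|^\alpha + |c(x,y)|^\beta for a candidate decomposition. The forward-difference derivatives, the 5x5 averaging window w, and all function names are assumptions made for the sketch.

```python
import numpy as np
from scipy.ndimage import uniform_filter

# Exponents fitted to natural-image statistics in the talk; s1/s2 taken as 1.
ALPHA, BETA = 0.7, 0.2

def derivatives(img):
    """Forward-difference approximations of I_x and I_y (illustrative choice)."""
    ix = np.diff(img, axis=1, append=img[:, -1:])
    iy = np.diff(img, axis=0, append=img[-1:, :])
    return ix, iy

def corner_response(ix, iy, win=5):
    """Harris-like operator: determinant of the locally averaged structure tensor."""
    sxx = uniform_filter(ix * ix, size=win)
    syy = uniform_filter(iy * iy, size=win)
    sxy = uniform_filter(ix * iy, size=win)
    return sxx * syy - sxy ** 2          # det [[sxx, sxy], [sxy, syy]]

def layer_cost(img):
    """Negative log-probability of one layer, up to the constant log Z."""
    ix, iy = derivatives(img)
    c = corner_response(ix, iy)
    return (np.abs(ix) ** ALPHA + np.abs(iy) ** ALPHA).sum() + (np.abs(c) ** BETA).sum()

def decomposition_cost(i1, i2):
    """Cost of explaining I = I1 + I2; a lower cost means a more probable split."""
    return layer_cost(i1) + layer_cost(i2)

# Usage: for an image I and a candidate layer I1, compare
#   decomposition_cost(I, np.zeros_like(I))   # the one-layer interpretation
#   decomposition_cost(I1, I - I1)            # a candidate two-layer split
```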
The importance of being non-Gaussian
[Figure: decompositions I = I_1 + I_2 obtained with the natural-image exponents \alpha = 0.7, \beta = 0.2, compared with those obtained with Gaussian exponents \alpha = 2, \beta = 2.]

The "scalar transparency" problem
Split a scalar: a + b = 1, with a \ge 0, b \ge 0. Consider a prior over positive scalars p(x) \propto e^{f(x)}. For which priors is the MAP solution sparse?
Observation: the MAP solution is a = 0, b = 1 (or a = 1, b = 0) if and only if f(x) = \log P(x) is convex.
• If f\!\left(\tfrac{a+b}{2}\right) \le \tfrac{f(a)+f(b)}{2} (f convex), the MAP solution is a = 0, b = 1.
• If f\!\left(\tfrac{a+b}{2}\right) \ge \tfrac{f(a)+f(b)}{2} (f concave), the MAP solution is a = 0.5, b = 0.5.
(A small numerical sketch of this observation appears after the conclusions.)

The importance of being non-Gaussian (cont.)
The non-Gaussian exponents fitted to natural images (\alpha = 0.7, \beta = 0.2) favor assigning each gradient wholly to one layer, whereas Gaussian exponents (\alpha = \beta = 2) favor splitting every gradient equally between the two layers.
[Figure: the cost along the segment from I_1 = 0 to I_1 = I, and the resulting decompositions, under each choice of exponents.]

Can we perform a global optimization? Conversion to a discrete MRF
[Figure: a grid of nodes g_1, ..., g_15, one node per image location.]
For the decomposition I = I_1 + I_2, define at every location i:
  g_i = (I_{1x}^i, I_{1y}^i) — a discretization of the I_1 gradient at location i
  f_i = (I_x^i, I_y^i) - g_i — the corresponding I_2 gradient at location i
The joint distribution over the gradient field g is
  P(g) = \frac{1}{Z} \prod_i \Psi_i(g_i) \prod_{i,j} \Psi_{i,j}(g_i, g_j) \prod_{i,j,k} \Psi_{i,j,k}(g_i, g_j, g_k)
• Local potential (derivative filters):  \Psi_i(g_i) = e^{-\left( |g_i|^\alpha + |f_i|^\alpha \right)}
• Pairwise potential (a pairwise approximation to the corner operator):
  \Psi_{i,j}(g_i, g_j) = e^{-\left( \left| \det(g_i g_i^T + g_j g_j^T) \right|^\beta + \left| \det(f_i f_i^T + f_j f_j^T) \right|^\beta \right)}
• Triple potential \Psi_{i,j,k}(g_i, g_j, g_k): enforces integrability of the layer gradient field.

Optimizing the discrete MRF
With N locations and a discrete set of candidate gradients per node, there are exponentially many (|g|^N) possible assignments, so exhaustive search is infeasible.
Solution: use max-product belief propagation.
• The MRF has many cycles, but BP works in similar problems (Freeman and Pasztor 99, Frey et al. 2001, Sun et al. 2002).
• It converges to a strong local minimum (Weiss and Freeman 2001).

Drawbacks of BP for this problem
• Large memory and time complexity.
• Convergence depends on the update order.
• Discretization artifacts.

Talk Outline (recap) — next: results and future work.

Results
[Figure: input image, output layer 1, output layer 2.]

Results
[Figure: a second input image with its two output layers.]

Future Work
• Dealing with more complex textures.
[Figure: Original = layer + layer; output of a non-linear filter.]

Future Work (cont.)
• Dealing with more complex textures: a coarse qualitative separation; learn discriminative features automatically.
• Applying other optimization methods.
• Extending to shading and illumination.
• Using application-specific priors (e.g. Manhattan World).

Conclusions
• Natural scene statistics predict the perception of transparency.
• A first algorithm that can decompose a single image into the sum of two images.
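Appendix: numerical sketch of the "scalar transparency" observation
A minimal sketch (added to these notes as an illustration, not part of the talk) of the scalar problem above: maximize \log P(a) + \log P(b) subject to a + b = 1, a, b \ge 0, under the prior p(x) \propto e^{-|x|^\alpha}, by brute force over a grid. The function name and grid resolution are arbitrary choices.

```python
import numpy as np

def map_split(alpha, n=10001):
    """Brute-force MAP for a + b = 1, a, b >= 0 under p(x) ~ exp(-|x|^alpha).

    Maximizes log p(a) + log p(b) = -(|a|^alpha + |b|^alpha) over a grid of
    candidate splits and returns the best (a, b).
    """
    a = np.linspace(0.0, 1.0, n)                         # candidates; b = 1 - a
    objective = -(np.abs(a) ** alpha + np.abs(1.0 - a) ** alpha)
    best = a[np.argmax(objective)]
    return best, 1.0 - best

# Sparse prior (alpha < 1): log p is convex, so the MAP sits at an end point.
print(map_split(0.7))   # -> (0.0, 1.0)
# Gaussian prior (alpha = 2): log p is concave, so the MAP splits the mass equally.
print(map_split(2.0))   # -> (0.5, 0.5)
```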