A Variational Approach to Blind Image Deconvolution
Rob Fergus
Courant Institute of Mathematical Sciences, New York University

Overview
• Much recent interest in blind deconvolution:
  – Levin '06, Fergus et al. '06, Jia et al. '07, Joshi et al. '08, Shan et al. '08, Whyte et al. '10
• This talk:
  – Discuss the Fergus '06 algorithm
  – Try to give some insight into the variational methods used

Removing Camera Shake from a Single Photograph
Rob Fergus, Barun Singh, Aaron Hertzmann, Sam T. Roweis and William T. Freeman
Massachusetts Institute of Technology and University of Toronto

Image formation process
• Blurry image (input to the algorithm) = Blur kernel ⊗ Sharp image (desired output)
• ⊗ is the convolution operator
• The model is an approximation: assume a static scene and constant blur

Why such a hard problem?
• Many different (sharp image, blur kernel) pairs can explain the same blurry image:
  Sharp image ⊗ Blur kernel = Blurry image

Statistics of gradients in natural images
• Image gradient: the difference between spatially adjacent pixel intensities
• Histogram of image gradients (log # pixels): a characteristic distribution with heavy tails

Blurry images have different statistics
• Histogram of image gradients (log # pixels): the distribution differs from the sharp-image case

Parametric distribution
• Use a parametric model of sharp image statistics, fit to the histogram of image gradients

Three sources of information
1. Reconstruction constraint: Estimated sharp image ⊗ Estimated blur kernel = Input blurry image
2. Image prior: distribution of gradients
3. Blur prior: positive & sparse

Notation: y = observed image, b = blur kernel, x = sharp image

Posterior
  p(b, x | y) = k p(y | b, x) p(x) p(b)
  1. Likelihood (reconstruction constraint)  2. Image prior  3. Blur prior

Overview of model
1. Likelihood (reconstruction constraint):
   p(y | b, x) = ∏_i N(y_i | (x ⊗ b)_i, σ²),  where i is the pixel index
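The likelihood term above can be sketched numerically. This is a minimal illustration, not the paper's code: the image is random noise, the 3×3 box kernel and the helper names `convolve2d_valid` and `log_likelihood` are my own assumptions.

```python
import numpy as np

def convolve2d_valid(x, b):
    """'Valid' 2-D convolution of sharp image x with blur kernel b (no padding)."""
    kh, kw = b.shape
    bf = b[::-1, ::-1]  # flip the kernel: convolution, not correlation
    H, W = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    y = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            y[i, j] = np.sum(x[i:i + kh, j:j + kw] * bf)
    return y

def log_likelihood(y, x, b, sigma2):
    """log p(y | b, x) = sum_i log N(y_i | (x ⊗ b)_i, σ²)."""
    r = y - convolve2d_valid(x, b)
    return -0.5 * np.sum(r ** 2) / sigma2 - 0.5 * r.size * np.log(2 * np.pi * sigma2)

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 32))      # stand-in "sharp image" (assumed, not real data)
b = np.full((3, 3), 1.0 / 9.0)     # 3x3 box blur: positive, sums to 1
sigma2 = 0.01
y = convolve2d_valid(x, b) + rng.normal(scale=np.sqrt(sigma2), size=(30, 30))

identity = np.zeros((3, 3)); identity[1, 1] = 1.0   # "no blur" kernel
ll_true = log_likelihood(y, x, b, sigma2)
ll_wrong = log_likelihood(y, x, identity, sigma2)
print(ll_true > ll_wrong)
```

The true kernel explains y up to the observation noise, so its log-likelihood is much higher than that of the wrong ("no blur") kernel; the reconstruction constraint alone, however, cannot distinguish the many (x, b) pairs that convolve to the same y, which is why the priors are needed.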
2. Image prior p(x): a mixture of Gaussians fit to the empirical distribution of image gradients
3. Blur prior p(b): an exponential distribution, keeping the kernel positive & sparse

The obvious thing to do
• Posterior: p(b, x | y) = k p(y | b, x) p(x) p(b)
  – 1. Likelihood (reconstruction constraint)  2. Image prior  3. Blur prior
• Combine the 3 terms into an objective function
• Run conjugate gradient descent
• This is Maximum a-Posteriori (MAP)
• No success!

Variational Independent Component Analysis (Miskin and MacKay, 2000)
• Binary images
• Priors on intensities
• Small, synthetic blurs
• Not applicable to natural images

Variational Bayesian approach
• Keeps track of uncertainty in the estimates of image and blur by using a distribution instead of a single estimate
• Toy illustration: optimization surface (score vs. pixel intensity) for a single variable, comparing Maximum a-Posteriori (MAP) and Variational Bayes

Simple 1-D blind deconvolution example
• y = bx + n
  y = observed image, b = blur, x = sharp image, n = noise ~ N(0, σ²)
• Let y = 2 and σ² = 0.1
• Likelihood: p(y | b, x) = N(y | bx, σ²)
• Image prior (Gaussian distribution): p(x) = N(x | 0, 2)
• Posterior: p(b, x | y) = k p(y | b, x) p(x) p(b)

Marginal distribution p(b|y)
• p(b | y) = ∫ p(b, x | y) dx = k ∫ p(y | b, x) p(x) dx
• [Plot: Bayes p(b|y) vs. b]

MAP solution
• Highest point on the surface: argmax_{b,x} p(x, b | y)
• [Plot: Bayes p(b|y) vs. b, with the MAP solution marked]

Variational Bayes
• True Bayesian approach not tractable
• Approximate the posterior with a simple distribution

Fitting the posterior with a Gaussian
• Approximating distribution q(x, b) is Gaussian
• Minimize KL(q(x, b) || p(x, b | y))
• [Plot: KL(q||p) vs. Gaussian width]

Variational approximation of the marginal
• [Plot: variational approximation, true marginal p(b|y), and MAP solution vs. b]

Try sampling from the model
• Let the true b = 2. Repeat:
  – Sample x ~ N(0, 2)
  – Sample n ~
    N(0, σ²)
  – y = xb + n
  – Compute pMAP(b|y), pBayes(b|y) & pVariational(b|y)
  – Multiply with the existing density estimates (assume iid samples)

Actual setup of the variational approach
• Work in the gradient domain: x ⊗ b = y  →  ∇x ⊗ b = ∇y
• Approximate the posterior p(∇x, b | ∇y) with q(∇x, b)
• Assume q(∇x, b) = q(∇x) q(b)
  – q(∇x) is Gaussian on each pixel
  – q(b) is a rectified Gaussian on each blur kernel element
• Cost function: KL(q(∇x) q(b) || p(∇x, b | ∇y))

Results
• Original photograph, recovered blur kernel, and our output (with close-ups)

Recent comparison paper
• IEEE CVPR 2009 conference
• Percent success (higher is better)

Related problems
• Bayesian Color Constancy (Brainard & Freeman, JOSA 1997)
  – Given color pixels, deduce:
    • Reflectance spectrum of the surface
    • Illuminant spectrum
  – Uses a Laplace approximation, similar to the Gaussian q(·)
• [Figure from Foundations of Vision by Brian Wandell, Sinauer Associates, 1995]

Conclusions
• Variational methods seem to do the right thing for blind deconvolution
• Interesting from an inference point of view: a rare case where Bayesian methods are needed
• Can potentially be applied to other ill-posed problems in image processing
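The 1-D example from the middle of the talk can be reproduced with a few lines of numerical integration. This is a sketch under my own assumptions (a flat prior on b over a finite grid; grid ranges chosen for convenience), using the slide's values y = 2, σ² = 0.1, p(x) = N(0, 2):

```python
import numpy as np

y_obs, sigma2 = 2.0, 0.1                  # observed value and noise variance from the slides
xs = np.linspace(-10.0, 10.0, 4001)       # grid over the 1-D "sharp image" x
bs = np.linspace(0.01, 10.0, 1000)        # grid over the blur b (flat prior assumed)
dx = xs[1] - xs[0]

def norm_pdf(z, mu, var):
    return np.exp(-0.5 * (z - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

# Joint posterior p(x, b | y) ∝ N(y | b*x, σ²) N(x | 0, 2), evaluated on the grid
joint = norm_pdf(y_obs, bs[:, None] * xs[None, :], sigma2) * norm_pdf(xs, 0.0, 2.0)

# Bayes: marginalize x to get p(b | y); MAP-style: profile out x with a max
marginal = joint.sum(axis=1) * dx
profile = joint.max(axis=1)

b_bayes = bs[np.argmax(marginal)]         # peak of the true marginal
b_map = bs[np.argmax(profile)]            # b preferred by the joint-MAP ridge
print(f"marginal peaks at b = {b_bayes:.2f}; joint maximum drifts to b = {b_map:.2f}")
```

With these settings the marginal p(b|y) peaks near b ≈ 1.4 (marginalizing x in closed form gives y|b ~ N(0, 2b² + σ²), maximized when 2b² + σ² = y²), while the joint maximum keeps growing with b by pairing a large blur with a tiny x. That is the talk's point in miniature: joint MAP chases a ridge of the posterior, whereas the marginal, which the variational approximation tracks, behaves sensibly.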