Go With The Flow A New Manifold Modeling and Learning Framework for Image Ensembles Aswin C. Sankaranarayanan Rice University Richard G. Baraniuk Chinmay Hegde Sriram Nagaraj Sensor Data Deluge Concise Models • Efficient processing / compression requires concise representation • Sparsity of an individual image pixels large wavelet coefficients (blue = 0) Concise Models • Efficient processing / compression requires concise representation • Our interest in this talk: Collections of images Concise Models • Our interest in this talk: Collections of image parameterized by q \in Q – translations of an object q: x-offset and y-offset – wedgelets q: orientation and offset – rotations of a 3D object q: pitch, roll, yaw Concise Models • Our interest in this talk: Collections of image parameterized by q \in Q – translations of an object q: x-offset and y-offset – wedgelets q: orientation and offset – rotations of a 3D object q: pitch, roll, yaw • Image articulation manifold Image Articulation Manifold • N-pixel images: • K-dimensional articulation space • Then is a K-dimensional manifold in the ambient space • Very concise model articulation parameter space Smooth IAMs • N-pixel images: • Local isometry: image distance parameter space distance • Linear tangent spaces are close approximation locally articulation parameter space Smooth IAMs • N-pixel images: • Local isometry: image distance parameter space distance • Linear tangent spaces are close approximation locally articulation parameter space Ex: Manifold Learning LLE ISOMAP LE HE Diff. Geo … • K=1 rotation Ex: Manifold Learning • K=2 rotation and scale Theory/Practice Disconnect: Smoothness • Practical image manifolds are not smooth! • If images have sharp edges, then manifold is everywhere non-differentiable [Donoho and Grimes] Tangent approximations ? Isometry ? Theory/Practice Disconnect: Smoothness • Practical image manifolds are not smooth! • If images have sharp edges, then manifold is everywhere non-differentiable [Donoho and Grimes] Tangent approximations ? Isometry ? Failure of Tangent Plane Approx. • Ex: cross-fading when synthesizing / interpolating images that should lie on manifold Input Image Input Image Geodesic Linear path Failure of Local Isometry • Ex: translation manifold all blue images are equidistant from the red image 3.5 • Local isometry – satisfied only when sampling is dense Euclidean distance 3 2.5 2 1.5 1 0.5 0 0 20 40 60 Translation q in [px] 80 100 Tools for manifold processing Algebraic manifolds Data manifolds Geodesics, exponential maps, log-maps, Riemannian metrics, Karcher means, … LLE, kNN graphs Smooth diff manifold Point cloud model Beyond point cloud model for image manifolds The concept of Transport operators I q f I q ref Example I q ( x ) I q ref ( f ( x )) Example: Translation • 2D Translation manifold I1 I0 I 1 ( x ) I 0 ( x 1 ), f1 ( x ) x 1 barring boundary related issues • Set of all transport operators = R 2 • Beyond a point cloud model – Action of the articulation is more accurate and meaningful Optical Flow • Generalizing this idea: Pixel correspondances I1 and I2 • Idea: OF from I1 to I2 OF between two images is a natural and accurate transport operator (Figures from Ce Liu’s optical flow page) Optical Flow Transport • Consider a reference image and a K-dimensional articulation • Collect optical flows from to all images reachable by a K-dimensional articulation IAM Iq0 I q1 Iq2 q0 q1 Articulations q2 Optical Flow Transport • Consider a reference image and a K-dimensional articulation • Collect optical flows from to all images reachable by a K-dimensional articulation • Theorem: Collection of OFs is a smooth, K-dimensional manifold (even if IAM is not smooth) OFM at I q 0 IAM Iq0 I q1 Iq2 q0 q1 Articulations q2 OFM is Smooth Pixel intensity at 3 points Intensity I(θ) Intensity q 60 Optical in pixels vx v(θ) flow Op. flow q 0 0 q 30 q 60 200 150 100 50 0 Flow (nearly linear) q 30 (Rotation) 20 -80 -60 -40 -80 -60 -40 -20 0 20 40 60 80 -20 0 20 Articulation q in [] 40 60 80 Articulation q in [] 10 0 -10 -20 Articulation θ in [⁰] Main results • Local model at each I q OFM at I q 0 0 • Each point on the OFM defines a transport operator – Each transport operator maps I q to one of its neighbors I q1 • For a large class of articulations, OFMs are smooth and locally isometric – Traditional manifold processing techniques work on OFMs IAM Iq0 0 Iq2 q0 q1 Articulations q2 Linking it all together Nonlinear dim. reduction OFM at I q0 e0 e1 IAM Iq0 I q1 Iq2 Articulations The non-differentiablity does not dissappear --- it is embedded in the mapping from OFM to the IAM. However, this is a known map q0 q1 e2 q2 { I ref , f q } I q such that I q ( x ) I ref ( f q ( x )) The Story So Far… Tangent space at IAM Iq0 I q1 OFM at I q Iq0 Iq2 Articulations IAM Iq0 I q1 q0 q1 0 Iq2 q0 q2 q1 Articulations q2 Input Image IAM OFM Input Image Geodesic Linear path OFM Synthesis Manifold Learning 2D rotations ISOMAP embedding error for OFM and IAM Reference image Manifold Learning 2D rotations Embedding of OFM Two-dimensional Isomap embedding (with neighborhood graph). 5 4 Reference image 3 2 1 0 -1 -2 -3 -4 -5 -6 -4 -2 0 2 4 6 OFM Manifold Learning Data 196 images of two bears moving linearly and independently Task Find low-dimensional embedding IAM OFM OFM ML + Parameter Estimation Data 196 images of a cup moving on a plane Task 1 Find low-dimensional embedding Task 2 Parameter estimation for new images (tracing an “R”) IAM OFM Karcher Mean • Point on the manifold such that the sum of geodesic distances to every other point is minimized • Important concept in nonlinear data modeling, compression, shape analysis [Srivastava et al] 10 images from an IAM ground truth KM OFM KM linear KM Manifold Charting • Goal: build a generative model for an entire IAM/OFM based on a small number of base images • Ex: cube rotating about axis. All cube images can be representing using 4 reference images + OFMs IAM Optimal charting q 180 q 90 q 0 q 90 Greedy charting • Many applications – selection of target templates for classification – “next-view” selection for adaptive sensing applications q 180 Summary • IAMs a useful concise model for many image processing problems involving image collections and multiple sensors/viewpoints • But practical IAMs are non-differentiable – IAM-based algorithms have not lived up to their promise • Optical flow manifolds (OFMs) – smooth even when IAM is not – OFM ~ nonlinear tangent space – support accurate image synthesis, learning, charting, … • Barely discussed here: OF enables the safe extension of differential geometry concepts – Log/Exp maps, Karcher mean, parallel transport, … Open Questions • Our treatment is specific to image manifolds under brightness constancy • What are the natural transport operators for other data manifolds? dsp.rice.edu Related Work • Analytic transport operators – transport operator has group structure [Xiao and Rao 07][Culpepper and Olshausen 09] [Miller and Younes 01] [Tuzel et al 08] – non-linear analytics [Dollar et al 06] – spatio-temporal manifolds [Li and Chellappa 10] – shape manifolds [Klassen et al 04] • Analytic approach limited to a small class of standard image transformations (ex: affine transformations, Lie groups) • In contrast, OFM approach works reliably with real-world image samples (point clouds) and broader class of transformations Limitations • Brightness constancy – Optical flow is no longer meaningful • Occlusion – Undefined pixel flow in theory, arbitrary flow estimates in practice – Heuristics to deal with it • Changing backgrounds etc. – Transport operator assumption too strict – Sparse correspondences ? Open Questions • Theorem: random measurements stably embed a K-dim manifold whp [B, Wakin, FOCM ’08] • Q: Is there an analogous result for OFMs? Image Articulation Manifold • Linear tangent space at is K-dimensional Tangent space at – provides a mechanism to transport along manifold – problem: since manifold is non-differentiable, tangent approximation is poor Iq0 IAM Iq0 I q1 Iq2 q0 • Our goal: replace tangent space with new transport operator that respects the nonlinearity of the imaging process q1 Articulations q2 OFM Implementation details Reference Image Pairwise distances and embedding 100 4 200 300 2 400 0 0.5 100 200 300 400 Residual variance -2 0.4 -4 0.3 -5 0.2 0.1 0 0 2 4 6 8 Isomap dimensionality 10 0 5 Flow Embedding 100 200 300 400 100 200 300 400 4 0.5 Residual variance 2 0.4 0 0.3 0.2 -2 0.1 0 0 2 4 6 8 Isomap dimensionality 10 -4 -5 0 5 Occlusion • Detect occlusion using forward-backward flow reasoning Occluded • Remove occluded pixel computations • Heuristic --- formal occlusion handling is hard History of Optical Flow • Dark ages (<1985) – special cases solved – LBC an under-determined set of linear equations • Horn and Schunk (1985) – Regularization term: smoothness prior on the flow • Brox et al (2005) – shows that linearization of brightness constancy (BC) is a bad assumption – develops optimization framework to handle BC directly • Brox et al (2010), Black et al (2010), Liu et al (2010) – practical systems with reliable code