776 Computer Vision
Jared Heinly, Spring 2014
(slides borrowed from Jan-Michael Frahm, Svetlana Lazebnik, and others)

Image alignment
Image from http://graphics.cs.cmu.edu/courses/15-463/2010_fall/

A look into the past
http://blog.flickr.net/en/2010/01/27/a-look-into-the-past/

A look into the past
• Leningrad during the blockade
http://komen-dant.livejournal.com/345684.html

Bing Streetside images
http://www.bing.com/community/blogs/maps/archive/2010/01/12/new-bing-mapsapplication-streetside-photos.aspx

Image alignment: Applications
• Panorama stitching
• Recognition of object instances

Image alignment: Challenges
• Small degree of overlap
• Intensity changes
• Occlusion, clutter

Image alignment
• Two families of approaches:
o Direct (pixel-based) alignment: search for the alignment where most pixels agree
o Feature-based alignment: search for the alignment where extracted features agree; can be verified using pixel-based alignment

2D transformation models
• Similarity (translation, scale, rotation)
• Affine
• Projective (homography)

Let's start with affine transformations
• Simple fitting procedure (linear least squares)
• Approximates viewpoint changes for roughly planar objects and roughly orthographic cameras
• Can be used to initialize fitting for more complex models

Fitting an affine transformation
• Assume we know the correspondences; how do we get the transformation?
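Since the affine fit described on these slides is plain linear least squares, it can be sketched directly in code. The following is a minimal illustration, not part of the original slides; the function name `fit_affine` and the use of NumPy are assumptions.

```python
import numpy as np

def fit_affine(pts, pts_prime):
    """Fit x' = M x + t from n >= 3 point correspondences.

    pts, pts_prime: (n, 2) arrays of matched points (x_i, y_i), (x'_i, y'_i).
    Returns the 2x2 matrix M = [[m1, m2], [m3, m4]] and 2-vector t = (t1, t2).
    """
    n = pts.shape[0]
    A = np.zeros((2 * n, 6))
    b = pts_prime.reshape(-1)      # stack targets as (x'_1, y'_1, x'_2, ...)
    A[0::2, 0:2] = pts             # x'-equation row: [x_i, y_i, 0, 0, 1, 0]
    A[0::2, 4] = 1
    A[1::2, 2:4] = pts             # y'-equation row: [0, 0, x_i, y_i, 0, 1]
    A[1::2, 5] = 1
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    M = params[:4].reshape(2, 2)
    t = params[4:]
    return M, t
```

Each match contributes two rows of A, so three or more matches give an over-determined system that `lstsq` solves in the least-squares sense.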
Fitting an affine transformation
• The affine model maps (x_i, y_i) to (x'_i, y'_i):
\[ \begin{pmatrix} x'_i \\ y'_i \end{pmatrix} = \begin{pmatrix} m_1 & m_2 \\ m_3 & m_4 \end{pmatrix} \begin{pmatrix} x_i \\ y_i \end{pmatrix} + \begin{pmatrix} t_1 \\ t_2 \end{pmatrix} \]
• Rearranged as a linear system in the unknown parameters:
\[ \begin{pmatrix} x_i & y_i & 0 & 0 & 1 & 0 \\ 0 & 0 & x_i & y_i & 0 & 1 \end{pmatrix} \begin{pmatrix} m_1 \\ m_2 \\ m_3 \\ m_4 \\ t_1 \\ t_2 \end{pmatrix} = \begin{pmatrix} x'_i \\ y'_i \end{pmatrix} \]
• Linear system with six unknowns
• Each match gives us two linearly independent equations: we need at least three matches to solve for the transformation parameters

Fitting a plane projective transformation
• Homography: plane projective transformation (a transformation taking one quad to another arbitrary quad)

Homography
• The transformation between two views of a planar surface
• The transformation between images from two cameras that share the same center

Application: Panorama stitching
Source: Hartley & Zisserman

Fitting a homography
• Recall homogeneous coordinates: converting to homogeneous image coordinates, (x, y) → (x, y, 1); converting from homogeneous image coordinates, (x, y, w) → (x/w, y/w)
• Equation for the homography (the scale is unknown: x'_i and H x_i are scaled versions of the same vector):
\[ \lambda \begin{pmatrix} x'_i \\ y'_i \\ 1 \end{pmatrix} = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \begin{pmatrix} x_i \\ y_i \\ 1 \end{pmatrix}, \qquad \text{i.e.} \qquad \mathbf{x}'_i \times H \mathbf{x}_i = \mathbf{0} \]
• Writing \(\mathbf{h}_1^T, \mathbf{h}_2^T, \mathbf{h}_3^T\) for the rows of H, the cross product expands to
\[ \begin{pmatrix} \mathbf{0}^T & -\mathbf{x}_i^T & y'_i \mathbf{x}_i^T \\ \mathbf{x}_i^T & \mathbf{0}^T & -x'_i \mathbf{x}_i^T \\ -y'_i \mathbf{x}_i^T & x'_i \mathbf{x}_i^T & \mathbf{0}^T \end{pmatrix} \begin{pmatrix} \mathbf{h}_1 \\ \mathbf{h}_2 \\ \mathbf{h}_3 \end{pmatrix} = \mathbf{0} \]
where each element is a 3-vector and the rows of H are stacked into a 9×1 column vector
• 3 equations per match, but only 2 linearly independent

Direct linear transform
• Stacking the two independent equations from each of n matches gives Ah = 0:
\[ \begin{pmatrix} \mathbf{0}^T & -\mathbf{x}_1^T & y'_1 \mathbf{x}_1^T \\ \mathbf{x}_1^T & \mathbf{0}^T & -x'_1 \mathbf{x}_1^T \\ \vdots & \vdots & \vdots \\ \mathbf{0}^T & -\mathbf{x}_n^T & y'_n \mathbf{x}_n^T \\ \mathbf{x}_n^T & \mathbf{0}^T & -x'_n \mathbf{x}_n^T \end{pmatrix} \mathbf{h} = \mathbf{0} \]
• H has 8 degrees of freedom (9 parameters, but the scale is arbitrary)
• One match gives us two linearly independent equations
• Four matches are needed for a minimal solution (null space of the 8×9 matrix)
• More than four matches: homogeneous least squares

Robust feature-based alignment
• So far, we've assumed that we are given a set of "ground-truth" correspondences (x_i, y_i) ↔ (x'_i, y'_i) between the two images we want to align
• What if we don't know the correspondences?
• Extract features
• Compute putative matches

Alignment as fitting
• Least squares
• Total least squares

Least squares line fitting
• Data: (x_1, y_1), …, (x_n, y_n)
• Line equation: y_i = m x_i + b
• Find (m, b) to minimize \( E = \sum_{i=1}^n (y_i - m x_i - b)^2 \)

Problem with "vertical" least squares
• Not rotation-invariant
• Fails completely for vertical lines

Total least squares
• Distance between point (x_i, y_i) and the line ax + by = d (with a² + b² = 1, unit normal N = (a, b)): |a x_i + b y_i − d|
• Find (a, b, d) to minimize the sum of squared perpendicular distances \( E = \sum_{i=1}^n (a x_i + b y_i - d)^2 \)

Least squares: Robustness to noise
• Least squares fit to the red points works well
• Least squares fit with an outlier is pulled far off — problem: squared error heavily penalizes outliers

Robust estimators
• General approach: find the model parameters θ that minimize \( \sum_i \rho(r_i(x_i, \theta); \sigma) \)
o r_i(x_i, θ) – residual of the ith point w.r.t. the model parameters θ
o ρ – robust function with scale parameter σ
• The robust function ρ behaves like squared distance for small values of the residual u but saturates for larger values of u

Choosing the scale σ
• Just right: the effect of the outlier is minimized
• Too small: the error value is almost the same for every point and the fit is very poor
• Too large: behaves much the same as least squares

RANSAC
• Robust fitting can deal with a few outliers – what if we have very many?
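The total least squares line fit described above has a closed form: setting dE/dd = 0 gives d = a·x̄ + b·ȳ, and the optimal unit normal (a, b) is the eigenvector of the centered data's scatter matrix with the smallest eigenvalue. A minimal sketch, not part of the original slides (the function name `fit_line_tls` and the use of NumPy are assumptions):

```python
import numpy as np

def fit_line_tls(pts):
    """Total least squares line fit: minimize sum_i (a*x_i + b*y_i - d)^2
    subject to a^2 + b^2 = 1.  pts: (n, 2) array.  Returns (a, b, d)."""
    centroid = pts.mean(axis=0)
    # dE/dd = 0  =>  d = a*x_bar + b*y_bar, so work with centered data.
    centered = pts - centroid
    # The optimal normal (a, b) is the eigenvector of the 2x2 scatter
    # matrix with the smallest eigenvalue (direction of least variance).
    scatter = centered.T @ centered
    eigvals, eigvecs = np.linalg.eigh(scatter)
    a, b = eigvecs[:, 0]           # eigh returns eigenvalues in ascending order
    d = a * centroid[0] + b * centroid[1]
    return a, b, d
```

Unlike "vertical" least squares, this handles vertical lines: points on x = 3 yield a normal of (±1, 0) rather than a failure.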
• Random sample consensus (RANSAC): a very general framework for model fitting in the presence of outliers
• Outline:
o Choose a small subset of points uniformly at random
o Fit a model to that subset
o Find all remaining points that are "close" to the model and reject the rest as outliers
o Do this many times and choose the best model
M. A. Fischler and R. C. Bolles. Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography. Comm. of the ACM, Vol. 24, pp. 381–395, 1981.

RANSAC for line fitting example
• A least-squares fit to all points is skewed by the outliers; RANSAC instead iterates:
1. Randomly select a minimal subset of points
2. Hypothesize a model
3. Compute the error function
4. Select points consistent with the model
5. Repeat the hypothesize-and-verify loop
• An uncontaminated sample yields the correct line
Source: R. Raguram

RANSAC for line fitting
• Repeat N times:
o Draw s points uniformly at random
o Fit a line to these s points
o Find inliers to this line among the remaining points (i.e., points whose distance from the line is less than t)
o If there are d or more inliers, accept the line and refit using all inliers

Choosing the parameters
• Initial number of points s
o Typically the minimum number needed to fit the model
• Distance threshold t
o Choose t so the probability for an inlier is p (e.g. 0.95)
o Zero-mean Gaussian noise with std. dev. σ: t² = 3.84σ²
• Number of samples N
o Choose N so that, with probability p, at least one random sample is free from outliers (e.g. p = 0.99), given outlier ratio e:
\[ \left(1 - (1 - e)^s\right)^N = 1 - p \quad\Rightarrow\quad N = \frac{\log(1 - p)}{\log\left(1 - (1 - e)^s\right)} \]

    N for p = 0.99, by sample size s and proportion of outliers e:
    s     5%   10%   20%   25%   30%   40%    50%
    2      2    3     5     6     7    11     17
    3      3    4     7     9    11    19     35
    4      3    5     9    13    17    34     72
    5      4    6    12    17    26    57    146
    6      4    7    16    24    37    97    293
    7      4    8    20    33    54   163    588
    8      5    9    26    44    78   272   1177
Source: M. Pollefeys

Adaptively determining the number of samples
• The inlier ratio is often unknown a priori, so pick the worst case, e.g. 50%, and adapt if more inliers are found; e.g. 80% inliers would yield e = 0.2
• Adaptive procedure:
o N = ∞, sample_count = 0
o While N > sample_count:
• Choose a sample and count the number of inliers
• If the inlier ratio is the highest found so far, set e = 1 − (number of inliers)/(total number of points)
• Recompute N from e: N = log(1 − p) / log(1 − (1 − e)^s)
• Increment sample_count by 1
Source: M. Pollefeys

RANSAC pros and cons
• Pros
o Simple and general
o Applicable to many different problems
o Often works well in practice
• Cons
o Lots of parameters to tune
o Doesn't work well for low inlier ratios (too many iterations, or can fail completely)
o Can't always get a good initialization of the model based on the minimum number of samples

Alignment as fitting
• Previous lectures: fitting a model to features in one image — find the model M that minimizes \( \sum_i \text{residual}(x_i, M) \)
• Alignment: fitting a model to a transformation between pairs of features (matches) x_i ↔ x'_i in two images — find the transformation T that minimizes \( \sum_i \text{residual}(T(x_i), x'_i) \)

Robust feature-based alignment
• Extract features
• Compute putative matches
• Loop:
o Hypothesize transformation T
o Verify transformation (search for other matches consistent with T)

RANSAC
• The set of putative matches contains a very high percentage of outliers
• RANSAC loop:
1. Randomly select a seed group of matches
2. Compute the transformation from the seed group
3. Find inliers to this transformation
4. If the number of inliers is sufficiently large, recompute the least-squares estimate of the transformation on all of the inliers
• Keep the transformation with the largest number of inliers

RANSAC example: Translation
• Putative matches
• Select one match, count inliers (repeated for several matches)
• Select the translation with the most inliers

Summary
• Detect feature points (to be covered later)
• Describe feature points (to be covered later)
• Match feature points (to be covered later)
• RANSAC
o Transformation model (affine, homography, etc.)

Advice
• Use the minimum model necessary to define the transformation on your data:
o Prefer affine over homography
o Prefer homography over a 3D transform (not discussed yet)
o Make the selection based on your data
• Benefits:
o Fewer RANSAC iterations
o Fewer degrees of freedom → less ambiguity or sensitivity to noise

Generating putative correspondences
• To decide whether two interest points could match, we need to compare the feature descriptors of the local patches surrounding them

Feature descriptors
• Simplest descriptor: a vector of raw intensity values
• How to compare two such vectors u and v?
o Sum of squared differences: \( \text{SSD}(u, v) = \sum_i (u_i - v_i)^2 \) — not invariant to intensity change
o Normalized correlation: \( \rho(u, v) = \frac{\sum_i (u_i - \bar{u})(v_i - \bar{v})}{\sqrt{\sum_j (u_j - \bar{u})^2}\,\sqrt{\sum_j (v_j - \bar{v})^2}} \) — invariant to affine intensity change

Disadvantage of intensity vectors as descriptors
• Small deformations can affect the matching score a lot

Feature descriptors: SIFT
• Descriptor computation:
o Divide the patch into 4x4 sub-patches
o Compute a histogram of gradient orientations (8 reference angles) inside each sub-patch
o Resulting descriptor: 4x4x8 = 128 dimensions
• Advantages over raw vectors of pixel values:
o Gradients are less sensitive to illumination change
o Pooling of gradients over the sub-patches achieves robustness to small shifts, but still preserves some spatial information
David G. Lowe. "Distinctive image features from scale-invariant keypoints." IJCV 60 (2), pp. 91–110, 2004.

Feature matching
• Generating putative matches: for each patch in one image, find a short list of patches in the other image that could match it based solely on appearance

Feature space outlier rejection
• How can we tell which putative matches are more reliable?
• Heuristic: compare the distance to the nearest neighbor with the distance to the second nearest neighbor
o The ratio of closest distance to second-closest distance will be high for features that are not distinctive
o A threshold of 0.8 provides good separation
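The ratio test above can be sketched with a brute-force nearest-neighbor search over descriptors. This is a minimal illustration, not part of the original slides; the function name `ratio_test_matches` and the use of NumPy are assumptions.

```python
import numpy as np

def ratio_test_matches(desc1, desc2, ratio=0.8):
    """Putative matches filtered by the nearest-neighbor ratio test.

    desc1: (n1, d) and desc2: (n2, d) descriptor arrays (e.g. 128-D SIFT),
    with n2 >= 2.  A match is kept only when the nearest neighbor in desc2
    is clearly closer than the second nearest: d_1st < ratio * d_2nd.
    Returns a list of (index_in_desc1, index_in_desc2) pairs.
    """
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)   # brute-force distances
        nn, second = np.argsort(dists)[:2]          # two closest candidates
        if dists[nn] < ratio * dists[second]:
            matches.append((i, nn))
    return matches
```

A descriptor that is nearly equidistant to its two closest candidates is not distinctive and gets rejected, which is exactly the behavior the 0.8 threshold is chosen to enforce.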