Recognition and Matching based on local invariant features
Cordelia Schmid, INRIA, Grenoble
David Lowe, Univ. of British Columbia

Introduction
Local invariant photometric descriptors
[Figure: local descriptor]
• Local : robust to occlusion/clutter + no segmentation
• Photometric : distinctive
• Invariant : to image transformations + illumination changes

History - Matching
Matching based on line segments
Not very discriminant
Solution : matching with interest points & correlation
[ A robust technique for matching two uncalibrated images through the recovery of the unknown epipolar geometry, Z. Zhang, R. Deriche, O. Faugeras and Q. Luong, Artificial Intelligence 1995 ]

Approach
• Extraction of interest points with the Harris detector
• Comparison of points with cross-correlation
• Verification with the fundamental matrix

Harris detector
Interest points extracted with Harris (~ 500 points)

Cross-correlation matching
Initial matches (188 pairs)

Global constraints
Robust estimation of the fundamental matrix
99 inliers, 89 outliers

Summary of the approach
• Very good results in the presence of occlusion and clutter
  – local information
  – discriminant greyvalue information
  – robust estimation of the global relation between images
  – for limited view point changes
• Solution for more general view point changes
  – wide baseline matching (different viewpoint, scale and rotation)
  – local invariant descriptors based on greyvalue information

History - Recognition
Color histogram [Swain 91]
Each pixel is described by a color vector (r, g, b)
Distribution of color vectors is described by a histogram
=> not robust to occlusion, not invariant, not distinctive

History - Recognition
Eigenimages [Turk 91]
• Each face vector is represented in the eigenimage space
  – eigenvectors with the highest eigenvalues = eigenimages
[Figure: eigenimage space with axes v1, v2, v3]
• The new image is projected into the eigenimage space
  – determine the closest face
=> not robust to occlusion, requires segmentation, not invariant, discriminant

History - Recognition
Geometric invariants [Rothwell 92]
• Function with a value independent of the transformation
$$ f(x, y) = f(x', y') \quad \text{where} \quad (x', y')^t = T\,(x, y)^t $$
• Invariant for image rotation : distance of two points
• Invariant for planar homography : cross-ratio
=> local and invariant, not discriminant, requires sub-pixel extraction of primitives

History - Recognition
Problems : occlusion, clutter, image transformations, distinctiveness
Solution : recognition with local photometric invariants
[ Local greyvalue invariants for image retrieval, C. Schmid and R. Mohr, PAMI 1997 ]

Approach
[Figure: local descriptor]
1) Extraction of interest points (characteristic locations)
2) Computation of local descriptors
3) Determining correspondences
4) Selection of similar images

Interest points
Geometric features repeatable under transformations
2D characteristics of the signal
high informational content
Comparison of different detectors [Schmid98]
Harris detector

Harris detector
Based on the idea of auto-correlation
Important difference in all directions => interest point

Harris detector
Auto-correlation function for a point $(x, y)$ and a shift $(\Delta x, \Delta y)$
$$ f(x, y) = \sum_{(x_k, y_k) \in W} \bigl( I(x_k, y_k) - I(x_k + \Delta x,\, y_k + \Delta y) \bigr)^2 $$
Discrete shifts can be avoided with the auto-correlation matrix
with
$$ I(x_k + \Delta x,\, y_k + \Delta y) \approx I(x_k, y_k) + \begin{pmatrix} I_x(x_k, y_k) & I_y(x_k, y_k) \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} $$
$$ f(x, y) = \sum_{(x_k, y_k) \in W} \left( \begin{pmatrix} I_x(x_k, y_k) & I_y(x_k, y_k) \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} \right)^2 $$

Harris detector
$$ f(x, y) = \begin{pmatrix} \Delta x & \Delta y \end{pmatrix}
\begin{pmatrix}
\sum_{(x_k, y_k) \in W} \bigl( I_x(x_k, y_k) \bigr)^2 & \sum_{(x_k, y_k) \in W} I_x(x_k, y_k)\, I_y(x_k, y_k) \\
\sum_{(x_k, y_k) \in W} I_x(x_k, y_k)\, I_y(x_k, y_k) & \sum_{(x_k, y_k) \in W} \bigl( I_y(x_k, y_k) \bigr)^2
\end{pmatrix}
\begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} $$
The 2x2 matrix in the middle is the auto-correlation matrix.

Harris detection
• Auto-correlation matrix
  – captures the structure of the local neighborhood
  – measure based on eigenvalues of this matrix
    • 2 strong eigenvalues => interest point
    • 1 strong eigenvalue => contour
    • 0 eigenvalue => uniform region
• Interest point detection
  – threshold on the eigenvalues
  – local maximum for localization

Local descriptors
[Figure: local descriptor]
Descriptors characterize the local neighborhood of a point

Local descriptors
Greyvalue derivatives
$$ v(x, y) = \begin{pmatrix}
I(x, y) * G(\sigma) \\
I(x, y) * G_x(\sigma) \\
I(x, y) * G_y(\sigma) \\
I(x, y) * G_{xx}(\sigma) \\
I(x, y) * G_{xy}(\sigma) \\
I(x, y) * G_{yy}(\sigma)
\end{pmatrix} $$
$$ I(x, y) * G(\sigma) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} G(x', y', \sigma)\, I(x - x',\, y - y')\, dx'\, dy' $$
$$ G(x, y, \sigma) = \frac{1}{2 \pi \sigma^2} \exp\!\left( -\frac{x^2 + y^2}{2 \sigma^2} \right) $$

Local descriptors
Invariance to image rotation : differential invariants [Koen87]
$$ \begin{pmatrix}
L \\
L_i L_i \\
L_i L_{ij} L_j \\
L_{ii} \\
L_{ij} L_{ij} \\
\varepsilon_{ij} \,( L_{jkl} L_i L_k L_l - L_{jkk} L_i L_l L_l ) \\
L_{iij} L_j L_k L_k - L_{ijk} L_i L_j L_k \\
-\varepsilon_{ij} L_{jkl} L_i L_k L_l \\
L_{ijk} L_i L_j L_k
\end{pmatrix} $$
The first five components written out in x and y:
$$ \begin{aligned}
L_i L_i &= L_x L_x + L_y L_y \\
L_i L_{ij} L_j &= L_{xx} L_x L_x + 2 L_{xy} L_x L_y + L_{yy} L_y L_y \\
L_{ii} &= L_{xx} + L_{yy} \\
L_{ij} L_{ij} &= L_{xx} L_{xx} + 2 L_{xy} L_{xy} + L_{yy} L_{yy}
\end{aligned} $$
where $\varepsilon_{ij}$ is the antisymmetric epsilon tensor

Local descriptors
Robustness to illumination changes
In case of an affine transformation $I_1(\mathbf{x}) = a\, I_2(\mathbf{x}) + b$
$$ \begin{pmatrix}
\dfrac{L_i L_{ij} L_j}{(L_i L_i)^{3/2}} \\[2ex]
\dfrac{L_{ii}}{(L_i L_i)^{1/2}} \\[2ex]
\dfrac{L_{ij} L_{ji}}{L_i L_i} \\[2ex]
\dfrac{\varepsilon_{ij} \,( L_{jkl} L_i L_k L_l - L_{jkk} L_i L_l L_l )}{(L_i L_i)^2} \\[2ex]
\dfrac{L_{iij} L_j L_k L_k - L_{ijk} L_i L_j L_k}{(L_i L_i)^2} \\[2ex]
\dfrac{-\varepsilon_{ij} L_{jkl} L_i L_k L_l}{(L_i L_i)^2} \\[2ex]
\dfrac{L_{ijk} L_i L_j L_k}{(L_i L_i)^2}
\end{pmatrix} $$

Local descriptors
Robustness to illumination changes
In case of an affine transformation $I_1(\mathbf{x}) = a\, I_2(\mathbf{x}) + b$
or normalization of the image patch with mean and variance

Determining correspondences
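Correspondences are found by comparing descriptor vectors. The following is a minimal sketch of such a comparison with the Mahalanobis distance, assuming the standard definition (square root of the quadratic form with the inverse covariance matrix). The names `mahalanobis_distance`, `nearest_descriptor` and `cov` are hypothetical and chosen for illustration; they do not come from the slides.

```python
import numpy as np

def mahalanobis_distance(p, q, cov):
    """Mahalanobis distance between descriptor vectors p and q.

    cov is assumed to be the covariance matrix of the descriptor
    components (symmetric, positive definite).
    """
    d = np.asarray(p, dtype=float) - np.asarray(q, dtype=float)
    # Solve cov * z = d instead of forming the inverse explicitly.
    return float(np.sqrt(d @ np.linalg.solve(cov, d)))

def nearest_descriptor(query, candidates, cov):
    """Return (index, distance) of the candidate closest to query."""
    dists = [mahalanobis_distance(query, c, cov) for c in candidates]
    best = int(np.argmin(dists))
    return best, dists[best]
```

With `cov = np.diag([4.0, 1.0])`, the candidates `(0, 2)` and `(2, 0)` are equally far from the origin in Euclidean terms, but the second is closer under the Mahalanobis distance because its difference lies along the component with the larger variance.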
Vector comparison using the Mahalanobis distance
$$ \mathrm{dist}_M(\mathbf{p}, \mathbf{q}) = \sqrt{ (\mathbf{p} - \mathbf{q})^T \Lambda^{-1} (\mathbf{p} - \mathbf{q}) } $$
where $\Lambda$ is the covariance matrix of the descriptor components

Selection of similar images
• In a large database
  – voting algorithm
  – additional constraints
• Rapid access with an indexing mechanism

Voting algorithm
[Figure: vector of local characteristics; database images I1, I2, ..., In]

Voting algorithm
[Figure: votes for database images I1, I2, ..., In]
I1 is the corresponding model image

Additional constraints
• Semi-local constraints
  – neighboring points should match
  – angles, length ratios should be similar
[Figure: semi-local constraints]
• Global constraints
  – robust estimation of the image transformation (homography, epipolar geometry)

Results
database with ~1000 images
[Figures: results]

Summary of the approach
• Very good results in the presence of occlusion and clutter
  – local information
  – discriminant greyvalue information
  – invariance to image rotation and illumination
• Not invariant to scale and affine changes
• Solution for more general view point changes
  – local descriptors invariant to scale and rotation
  – extraction of invariant points and regions

Approach for Matching and Recognition
• Detection of interest points/regions
  – Harris detector (extension to scale and affine invariance)
  – Blob detector based on Laplacian
• Computation of descriptors for each point
• Similarity of descriptors
• Semi-local constraints
• Global verification

Approach for Matching and Recognition
• Detection of interest points/regions
• Computation of descriptors for each point
  – greyvalue patch, diff. invariants, steerable filter, SIFT descriptor
• Similarity of descriptors
  – correlation, Mahalanobis distance, Euclidean distance
• Semi-local constraints
• Global verification

Approach for Matching and Recognition
• Detection of interest points/regions
• Computation of descriptors for each point
• Similarity of descriptors
• Semi-local constraints
  – geometrical or statistical relations between neighborhood points
• Global verification
  – robust estimation of geometry between images

Overview
8:30-8:45    Scale invariant interest points
8:45-9:00    SIFT descriptors
9:00-9:25    Affine invariance of interest points + applications
9:25-9:45    Evaluation of interest points + descriptors
9:45-10:15   Break

Overview
10:15-11:15  Object recognition system, demo, applications
11:15-11:45  Recognition of textures and object classes
11:45-12:00  Future directions + discussion
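
As a small illustration of the Harris detection step described in these slides (auto-correlation matrix, eigenvalue-based measure, threshold, local maximum), here is a minimal sketch in Python with NumPy. Several choices are assumptions, not taken from the slides: the window W is a square box of radius `r`, derivatives come from `np.gradient` without Gaussian smoothing, the measure is the smaller eigenvalue of the auto-correlation matrix, and the threshold is relative to the maximum response. The function names are hypothetical.

```python
import numpy as np

def box_sum(a, r):
    """Sum of a over a (2r+1) x (2r+1) window around each pixel (zero padded)."""
    h, w = a.shape
    padded = np.pad(a, r, mode="constant")
    out = np.zeros((h, w), dtype=float)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += padded[dy:dy + h, dx:dx + w]
    return out

def min_eigenvalue_response(img, r=2):
    """Smaller eigenvalue of the auto-correlation matrix at every pixel.

    Assumption: the slides only say the measure is based on the
    eigenvalues; using the smaller one is one possible choice.
    """
    img = np.asarray(img, dtype=float)
    iy, ix = np.gradient(img)          # derivatives along rows (y) and columns (x)
    sxx = box_sum(ix * ix, r)          # sum over W of Ix^2
    syy = box_sum(iy * iy, r)          # sum over W of Iy^2
    sxy = box_sum(ix * iy, r)          # sum over W of Ix * Iy
    # Closed form for the smaller eigenvalue of [[sxx, sxy], [sxy, syy]].
    return 0.5 * (sxx + syy - np.sqrt((sxx - syy) ** 2 + 4.0 * sxy ** 2))

def interest_points(response, rel_threshold=0.1):
    """Pixels above a relative threshold that are 3x3 local maxima."""
    h, w = response.shape
    padded = np.pad(response, 1, mode="constant", constant_values=-np.inf)
    is_max = np.ones((h, w), dtype=bool)
    for dy in range(3):
        for dx in range(3):
            if dy == 1 and dx == 1:
                continue
            is_max &= response >= padded[dy:dy + h, dx:dx + w]
    mask = is_max & (response > rel_threshold * response.max())
    return np.argwhere(mask)           # array of (row, column) positions
```

On a synthetic image containing a bright square on a dark background, the response is zero in uniform regions and along straight edges (at most one strong eigenvalue) and peaks near the four corners of the square, where both eigenvalues are strong.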