From Pixels to Features: Review of Part 1
COMP 4900C Winter 2008

Topics in Part 1 – from pixels to features
• Introduction: what is computer vision? Its applications.
• Linear Algebra: vectors, matrices, points, linear transformations, eigenvalues, eigenvectors, least-squares methods, singular value decomposition.
• Image Formation: camera lenses, the pinhole camera, perspective projection.
• Camera Model: coordinate transformations, homogeneous coordinates, intrinsic and extrinsic parameters, the projection matrix.
• Image Processing: noise, convolution, filters (average, Gaussian, median).
• Image Features: image derivatives, edges, corners, lines (Hough transform), ellipses.

General Methods
• Mathematical formulation: camera model, noise model.
• Treat images as functions: I = f(x, y).
• Model intensity changes with derivatives: ∇f = [I_x, I_y]^T.
• Approximate derivatives with finite differences.
• First-order approximation: I(i+u, j+v) ≈ I(i, j) + I_x u + I_y v = I(i, j) + [u, v]·∇f.
• Parameter fitting – solving an optimization problem.

Vectors and Points
We use vectors to represent points in 2 or 3 dimensions. For points P(x1, y1) and Q(x2, y2), the vector from P to Q is
  v = Q − P = [x2 − x1, y2 − y1]^T
The distance between the two points is
  D = ||Q − P|| = sqrt((x2 − x1)² + (y2 − y1)²)

Homogeneous Coordinates
Go one dimension higher:
  [x, y]^T → [wx, wy, w]^T
  [x, y, z]^T → [wx, wy, wz, w]^T
where w is an arbitrary non-zero scalar; usually we choose w = 1.
From homogeneous coordinates back to Cartesian coordinates:
  [x1, x2, x3]^T → [x1/x3, x2/x3]^T
  [x1, x2, x3, x4]^T → [x1/x4, x2/x4, x3/x4]^T

2D Transformation with Homogeneous Coordinates
2D coordinate transformation (rotation by θ, then translation by (Tx, Ty)):
  p' = R p + T,  R = [cos θ  −sin θ; sin θ  cos θ],  T = [Tx, Ty]^T
The same 2D coordinate transformation using homogeneous coordinates:
  [x'; y'; 1] = [cos θ  −sin θ  Tx; sin θ  cos θ  Ty; 0  0  1] [x; y; 1]

Eigenvalue and Eigenvector
We say that x is an eigenvector of a square matrix A if
  A x = λ x
λ is called an eigenvalue and x is called an eigenvector.
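The defining relation A x = λ x can be checked numerically. A minimal pure-Python sketch with an illustrative 2×2 symmetric matrix (chosen for this example, not taken from the slides):

```python
def matvec(A, x):
    """Multiply a 2x2 matrix by a 2-vector."""
    return [A[0][0]*x[0] + A[0][1]*x[1],
            A[1][0]*x[0] + A[1][1]*x[1]]

# Illustrative matrix: eigenvalues 3 and 1, eigenvectors [1, 1] and [1, -1].
A = [[2, 1],
     [1, 2]]

for lam, v in [(3, [1, 1]), (1, [1, -1])]:
    Av = matvec(A, v)
    # A v equals lam * v: the transformation changes only the magnitude of v.
    assert Av == [lam * v[0], lam * v[1]]
print("both eigenpairs check out")
```

Any vector that is not an eigenvector, e.g. [1, 0], changes direction under A, which is exactly what distinguishes eigenvectors.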
The transformation defined by A changes only the magnitude of the vector x.
Example: with A = [4 1; 2 3],
  [4 1; 2 3] [1; 1] = [5; 5] = 5 [1; 1]  and  [4 1; 2 3] [1; −2] = [2; −4] = 2 [1; −2]
so 5 and 2 are eigenvalues, and [1; 1]^T and [1; −2]^T are eigenvectors.

Symmetric Matrix
We say matrix A is symmetric if A^T = A.
Example: B^T B is symmetric for any B, because (B^T B)^T = B^T (B^T)^T = B^T B.
A symmetric matrix has to be a square matrix.
Properties of a symmetric matrix:
• it has real eigenvalues;
• its eigenvectors can be chosen to be orthonormal;
• B^T B has non-negative eigenvalues.

Orthogonal Matrix
A matrix A is orthogonal if
  A^T A = I  or  A^T = A^(−1)
The columns of A are orthonormal (mutually orthogonal unit vectors).
Example (a rotation matrix):
  A = [cos θ  −sin θ; sin θ  cos θ],  A^(−1) = A^T = [cos θ  sin θ; −sin θ  cos θ]

Least Squares
When m > n for an m-by-n matrix A, the system Ax = b generally has no exact solution. In this case we look for an approximate solution: the vector x that makes ||Ax − b||² as small as possible. This is the least-squares solution.

Least-squares solution of the linear system Ax = b:
Normal equation:  A^T A x = A^T b
A^T A is square and symmetric. The least-squares solution
  x = (A^T A)^(−1) A^T b
makes ||Ax − b||² minimal.

SVD: Singular Value Decomposition
An m×n matrix A can be decomposed into
  A = U D V^T
where U is m×m, V is n×n, and both have orthonormal columns:
  U^T U = I,  V^T V = I
D is an m×n diagonal matrix (of singular values).
Example:
  [2 0 0; 0 0 3; 0 0 0] = [0 1 0; 1 0 0; 0 0 1] · diag(3, 2, 0) · [0 0 1; 1 0 0; 0 1 0]

Pinhole Camera
[figure: pinhole camera geometry]

Why Lenses?
Gather more light from each scene point.

Four Coordinate Frames
• world frame (Xw, Yw, Zw)
• camera frame (Xc, Yc, Zc), with its origin at the optical center
• image-plane frame (x, y), with its origin at the principal point
• pixel frame (xim, yim)
Camera model: the transformation (projection) matrix maps a world point Pw to a pixel point pim.

Perspective Projection
A point P = (X, Y, Z) in the camera frame projects to the image point p = (x, y) through the optical center:
  x = f X / Z,  y = f Y / Z
These equations are nonlinear.
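The normal-equation recipe above can be sketched numerically. A minimal example, assuming NumPy is available; the data values are illustrative (an overdetermined line fit):

```python
import numpy as np

# Overdetermined system: fit y = c0 + c1 * t to four points (m = 4 > n = 2).
t = np.array([0.0, 1.0, 2.0, 3.0])
b = np.array([1.0, 3.0, 5.0, 7.0])          # exactly y = 1 + 2 t here
A = np.column_stack([np.ones_like(t), t])   # m-by-n design matrix

# Normal equation: A^T A x = A^T b  ->  x = (A^T A)^{-1} A^T b
x = np.linalg.solve(A.T @ A, A.T @ b)
print(x)                                    # least-squares estimate [c0, c1]

# An SVD-based solver gives the same least-squares solution.
x_svd, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x, x_svd))
```

In practice the SVD route is preferred over explicitly inverting A^T A, since it stays stable when A^T A is nearly singular.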
[figure: perspective projection geometry, showing the principal axis and image plane]

Using homogeneous coordinates, we have a linear relation:
  [u; v; w] = [f 0 0 0; 0 f 0 0; 0 0 1 0] [X; Y; Z; 1],  x = u/w,  y = v/w

World to Camera
Transformation between the camera and world coordinates:
  Xc = R Xw + T
Using homogeneous coordinates:
  [Xc; Yc; Zc; 1] = [R T; 0 1] [Xw; Yw; Zw; 1]

Image Coordinates to Pixel Coordinates
  x = (xim − ox) s_x,  y = (yim − oy) s_y
where s_x, s_y are the pixel sizes and (ox, oy) is the principal point in pixel coordinates. Inverting:
  [xim; yim; 1] = [1/s_x 0 ox; 0 1/s_y oy; 0 0 1] [x; y; 1]

Put It All Together – World to Pixel
  [x1; x2; x3] = [1/s_x 0 ox; 0 1/s_y oy; 0 0 1] [f 0 0 0; 0 f 0 0; 0 0 1 0] [R T; 0 1] [Xw; Yw; Zw; 1]
        = [f/s_x 0 ox; 0 f/s_y oy; 0 0 1] [R | T] [Xw; Yw; Zw; 1]
        = K [R | T] [Xw; Yw; Zw; 1]
with pixel coordinates xim = x1/x3, yim = x2/x3.

Camera Intrinsic Parameters
  K = [f/s_x 0 ox; 0 f/s_y oy; 0 0 1]
K is a 3×3 upper-triangular matrix, called the camera calibration matrix. There are five intrinsic parameters:
(a) the pixel sizes in the x and y directions, s_x and s_y;
(b) the focal length f;
(c) the principal point (ox, oy), which is the point where the optical axis intersects the image plane.

Extrinsic Parameters
  pim = [x1; x2; x3] = K [R | T] [Xw; Yw; Zw; 1] = M [Xw; Yw; Zw; 1]
[R | T] defines the extrinsic parameters. The 3×4 matrix M = K [R | T] is called the projection matrix.
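The world-to-pixel chain pim = K [R | T] Pw can be sketched numerically. A minimal example, assuming NumPy; the intrinsic values and the identity extrinsics are illustrative, not calibrated:

```python
import numpy as np

# Intrinsics: focal length f (metres), pixel sizes s_x, s_y, principal point.
f, sx, sy, ox, oy = 0.016, 1e-5, 1e-5, 320.0, 240.0   # illustrative values
K = np.array([[f / sx, 0.0,    ox],
              [0.0,    f / sy, oy],
              [0.0,    0.0,    1.0]])

# Extrinsics: here the world and camera frames coincide (R = I, T = 0).
R, T = np.eye(3), np.zeros(3)
M = K @ np.column_stack([R, T])          # 3x4 projection matrix M = K [R | T]

def project(Pw):
    """Map a world point to pixel coordinates via homogeneous coordinates."""
    x1, x2, x3 = M @ np.append(Pw, 1.0)
    return x1 / x3, x2 / x3              # xim = x1/x3, yim = x2/x3

# A point on the optical axis lands exactly on the principal point.
print(project(np.array([0.0, 0.0, 1.0])))   # (320.0, 240.0)
```

Note that the division by x3 is where the nonlinearity of perspective projection reappears after the linear matrix chain.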
Image Noise
Additive and random noise:
  Î(x, y) = I(x, y) + n(x, y)
I(x, y): the true pixel value; n(x, y): the (random) noise at pixel (x, y).

Gaussian Distribution
Single variable:
  p(x) = 1/(√(2π) σ) e^(−(x−μ)² / (2σ²))
Bivariate, with zero means and equal variance σ²:
  G(x, y) = 1/(2πσ²) exp(−(x² + y²) / (2σ²))

Gaussian Noise
Is used to model additive random noise:
• the probability of n(x, y) is proportional to e^(−n² / (2σ²));
• each n(x, y) has zero mean;
• the noise at each pixel is independent.

Impulsive Noise
• Alters random pixels.
• Makes their values very different from the true ones.
Salt-and-Pepper Noise is used to model impulsive noise:
  Isp(h, k) = I(h, k)  if x < l;  imin + y (imax − imin)  otherwise
where x, y are uniformly distributed random variables and l, imin, imax are constants.

Image Filtering
Modifying the pixels in an image based on some function f of a local neighbourhood N(p) of each pixel p, e.g. a 3×3 neighbourhood
  [10 30 10; 20 11 20; 11 9 1]  →  f(p) = 5.7

Linear Filtering – convolution
The output is a linear combination of the neighbourhood pixels:
  I_A(i, j) = (I * A)(i, j) = Σ_(h=−m/2..m/2) Σ_(k=−m/2..m/2) A(h, k) I(i−h, j−k)
The coefficients come from a constant m×m matrix A, called the kernel. This process, denoted by '*', is called (discrete) convolution.
Example: image patch [1 3 0; 2 10 2; 4 1 1] with kernel [1 0 −1; 1 0.1 −1; 1 0 −1] gives the filter output
  1·1 + 3·0 + 0·(−1) + 2·1 + 10·0.1 + 2·(−1) + 4·1 + 1·0 + 1·(−1) = 5

Smoothing by Averaging
Kernel: (1/9) [1 1 1; 1 1 1; 1 1 1]
Convolution can be understood as weighted averaging.

Gaussian Filter
  G(x, y) = 1/(2πσ²) exp(−(x² + y²) / (2σ²))
Discrete Gaussian kernel:
  G(h, k) = 1/(2πσ²) e^(−(h² + k²) / (2σ²))
where G(h, k) is an element of an m×m array.

The Gaussian Kernel is Separable
  (I * G)(i, j) = Σ_h Σ_k G(h, k) I(i−h, j−k)
     = (1/(2πσ²)) Σ_h Σ_k e^(−(h²+k²)/(2σ²)) I(i−h, j−k)
     = (1/(2πσ²)) Σ_h e^(−h²/(2σ²)) [ Σ_k e^(−k²/(2σ²)) I(i−h, j−k) ]
since e^(−(h²+k²)/(2σ²)) = e^(−h²/(2σ²)) · e^(−k²/(2σ²)).
So we can convolve the rows with a 1-D Gaussian kernel to get I_r, then convolve the columns of I_r with the same kernel to get the result. The complexity increases linearly with m instead of with m².
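The separability identity above can be checked directly. A minimal NumPy sketch; the constant factor 1/(2πσ²) is omitted since it only rescales the kernel:

```python
import numpy as np

sigma, m = 1.0, 5                # an m x m kernel
r = np.arange(m) - m // 2        # integer offsets -2 .. 2

# 1-D Gaussian weights, and the 2-D kernel built from the formula G(h, k).
g1 = np.exp(-r**2 / (2 * sigma**2))
G2 = np.exp(-(r[:, None]**2 + r[None, :]**2) / (2 * sigma**2))

# Separability: the 2-D kernel is the outer product of two 1-D kernels,
# because e^{-(h^2+k^2)/2s^2} = e^{-h^2/2s^2} * e^{-k^2/2s^2}.
print(np.allclose(G2, np.outer(g1, g1)))   # True
```

Because the kernel factors this way, filtering rows with g1 and then columns with g1 gives the same image as filtering once with G2, at a cost of 2m multiplications per pixel instead of m².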
Gaussian vs. Average
[figures: Gaussian smoothing vs. smoothing by averaging]

Nonlinear Filtering – median filter
Replace each pixel value I(i, j) with the median of the values found in a local neighbourhood of (i, j).
[figures: salt-and-pepper noise; after median filtering]

Edges in Images
Definition of edges:
• Edges are significant local changes of intensity in an image.
• Edges typically occur on the boundary between two different regions in an image.

Images as Functions
A 2-D image is a function of position, I = f(x, y) (e.g. the red-channel intensity).

Finite Difference – 2D
Continuous function:
  ∂f/∂x = lim_(h→0) (f(x+h, y) − f(x, y)) / h
  ∂f/∂y = lim_(h→0) (f(x, y+h) − f(x, y)) / h
Discrete approximation:
  I_x = ∂f/∂x ≈ f(i+1, j) − f(i, j)
  I_y = ∂f/∂y ≈ f(i, j+1) − f(i, j)
Convolution kernels: [−1 1] for I_x and [−1; 1] for I_y.

Image Derivatives
  I_x = I * [−1 1],  I_y = I * [−1; 1]

Edge Detection Using Derivatives
For a 1-D image f(x), compute the first derivative f'(x) and threshold |f'(x)|; the pixels that pass the threshold are edge pixels.

Image Gradient
  gradient: ∇f = [f_x, f_y]^T
  magnitude: |∇f| = sqrt(f_x² + f_y²)
  direction: θ = arctan(f_y / f_x)

Finite Difference for Gradient
Discrete approximation:
  I_x(i, j) ≈ f(i+1, j) − f(i, j),  I_y(i, j) ≈ f(i, j+1) − f(i, j)
Convolution kernels: [−1 1] and [−1; 1]
  magnitude: G(i, j) = sqrt(I_x(i, j)² + I_y(i, j)²)
  approximate magnitude: G(i, j) ≈ |I_x| + |I_y|
  direction: arctan(I_y / I_x)

Edge Detection Using the Gradient
Properties of the gradient:
• The magnitude of the gradient provides information about the strength of the edge.
• The direction of the gradient is always perpendicular to the direction of the edge.
Main idea:
• Compute derivatives in the x and y directions.
• Find the gradient magnitude.
• Threshold the gradient magnitude.

Edge Detection Algorithm
  I * [−1 1] → I_x,  I * [−1; 1] → I_y,  then threshold sqrt(I_x² + I_y²) → edges

Edge Detection Example
[figures: I, I_x, I_y]
[figures: G(i, j) = sqrt(I_x²(i, j) + I_y²(i, j)); pixels with G(i, j) above the threshold]

Finite differences respond to noise
[figure: with increasing zero-mean additive Gaussian noise, the derivative no longer shows where the edge is]
Solution: smooth first.
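The gradient-threshold detector above can be sketched with forward differences. A minimal NumPy sketch; the step-edge image and the threshold are illustrative:

```python
import numpy as np

def gradient_edges(I, tau):
    """Forward-difference gradient magnitude, thresholded at tau."""
    I = I.astype(float)
    Ix = np.zeros_like(I)
    Iy = np.zeros_like(I)
    Ix[:, :-1] = I[:, 1:] - I[:, :-1]     # f(i+1, j) - f(i, j) along x
    Iy[:-1, :] = I[1:, :] - I[:-1, :]     # f(i, j+1) - f(i, j) along y
    G = np.sqrt(Ix**2 + Iy**2)            # gradient magnitude
    return G > tau                        # edge pixels

# A dark-to-bright vertical step: only the column where the jump occurs
# is marked as an edge.
I = np.array([[0, 0, 10, 10],
              [0, 0, 10, 10],
              [0, 0, 10, 10]])
print(gradient_edges(I, tau=5.0).astype(int))
```

On a noisy image the same code marks many spurious pixels, which is exactly why the slides smooth first.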
Look for peaks in the derivative of the smoothed signal.

Sobel Edge Detector
Approximate the derivative with a central difference:
  I_x(i, j) ≈ f(i+1, j) − f(i−1, j)
Smooth by adding the differences of the three neighbouring columns (or rows), giving more weight to the middle one.
Convolution kernel for I_x:  [−1 0 1; −2 0 2; −1 0 1]
Convolution kernel for I_y:  [1 2 1; 0 0 0; −1 −2 −1]

Sobel Operator Example
For a 3×3 neighbourhood [a1 a2 a3; a4 a5 a6; a7 a8 a9], the approximate gradient at a5 is
  I_x = (a3 − a1) + 2(a6 − a4) + (a9 − a7)
  I_y = (a1 − a7) + 2(a2 − a8) + (a3 − a9)

Sobel Edge Detector
  I * [−1 0 1; −2 0 2; −1 0 1] → I_x,  I * [1 2 1; 0 0 0; −1 −2 −1] → I_y,  then threshold sqrt(I_x² + I_y²) → edges

Edge Detection Summary
Input: an image I and a threshold τ.
1. Noise smoothing: I_s = I * h (e.g. h is a Gaussian kernel).
2. Compute the two gradient images I_x and I_y by convolving I_s with gradient kernels (e.g. the Sobel operator).
3. Estimate the gradient magnitude at each pixel: G(i, j) = sqrt(I_x²(i, j) + I_y²(i, j)).
4. Mark as edges all pixels (i, j) such that G(i, j) > τ.

Corner Feature
Corners are image locations that have large intensity changes in more than one direction. Shifting a window in any direction should give a large change in intensity.

Harris Detector: Basic Idea
• "flat" region: no change in any direction;
• "edge": no change along the edge direction;
• "corner": significant change in all directions.
C. Harris, M. Stephens. "A Combined Corner and Edge Detector". 1988.

Change of Intensity
The intensity change along some shift (u, v) can be quantified by the sum of squared differences (SSD):
  D(u, v) = Σ_(i,j) [I(i+u, j+v) − I(i, j)]²

Change Approximation
If u and v are small, by Taylor's theorem:
  I(i+u, j+v) ≈ I(i, j) + I_x u + I_y v
where I_x = ∂I/∂x and I_y = ∂I/∂y. Therefore
  [I(i+u, j+v) − I(i, j)]² ≈ [I(i, j) + I_x u + I_y v − I(i, j)]² = (I_x u + I_y v)²
  = I_x² u² + 2 I_x I_y u v + I_y² v²
  = [u v] [I_x²  I_x I_y; I_x I_y  I_y²] [u; v]

Gradient Variation Matrix
  D(u, v) = [u v] ( Σ [I_x²  I_x I_y; I_x I_y  I_y²] ) [u; v]
The level curves D(u, v) = const are ellipses.
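The matrix in the quadratic form above can be computed directly over a window. A minimal NumPy sketch confirming the properties the slides state (symmetric, non-negative eigenvalues); the random patch is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
I = rng.random((8, 8))                      # illustrative image window

# Forward-difference gradients over the window.
Ix = np.zeros_like(I)
Iy = np.zeros_like(I)
Ix[:, :-1] = I[:, 1:] - I[:, :-1]
Iy[:-1, :] = I[1:, :] - I[:-1, :]

# Gradient variation matrix: the sum of [Ix^2, IxIy; IxIy, Iy^2].
C = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
              [np.sum(Ix * Iy), np.sum(Iy * Iy)]])

# C is symmetric, so eigvalsh applies; both eigenvalues are non-negative.
lam = np.linalg.eigvalsh(C)
print(lam)
```

For a corner both eigenvalues are large; for an edge one is near zero; for a flat patch both are near zero.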
  C = Σ [I_x²  I_x I_y; I_x I_y  I_y²]
The matrix C characterizes how the intensity changes in a certain direction.

Eigenvalue Analysis
  C = Q^(−1) [λ1 0; 0 λ2] Q
If either eigenvalue λ is close to 0, then this is not a corner, so look for locations where both are large.
Writing A for the matrix whose rows are the gradients [I_x, I_y] over the window, C = A^T A, so:
• C is symmetric;
• C has two non-negative eigenvalues.
The axes of the ellipse D(u, v) = const have half-lengths proportional to (λ_max)^(−1/2) and (λ_min)^(−1/2).

Corner Detection Algorithm

Line Detection
The problem:
• How many lines are there?
• Find the lines.

Equations for Lines
The slope-intercept equation of a line: y = a x + b.
What happens when the line is vertical? The slope a goes to infinity.
A better representation is the polar representation:
  ρ = x cos θ + y sin θ

Hough Transform: line-parameter mapping
A line in the plane maps to a point (ρ, θ) in the ρ-θ (parameter) space.
All lines passing through a point (x, y) map to a sinusoidal curve ρ = x cos θ + y sin θ in the parameter space.

Mapping of points on a line
Points on the same line define curves in the parameter space that pass through a single point.
Main idea: transform edge points in the x-y plane to curves in the parameter space, then find the points in the parameter space that have many curves passing through them.

Quantize Parameter Space
Detect lines by finding maxima / clustering in the quantized parameter space.

Examples
[figures: input image, Hough space, lines detected]
Image credit: NASA Dryden Research Aircraft Photo Archive

Algorithm
1. Quantize the parameter space: int P[0..ρmax][0..θmax]; // accumulators
2. For each edge point (x, y) {
     For (θ = 0; θ <= θmax; θ = θ + Δθ) {
       ρ = x cos θ + y sin θ  // round off to the nearest integer
       (P[ρ][θ])++;
     }
   }
3. Find the peaks in P[ρ][θ].

Equations of Ellipse
  x²/r1² + y²/r2² = 1
General conic form:
  a x² + b x y + c y² + d x + e y + f = 0
Let x = [x², x y, y², x, y, 1]^T and a = [a, b, c, d, e, f]^T. Then the equation is x^T a = 0.

Ellipse Fitting: Problem Statement
Given a set of N image points p_i = [x_i, y_i]^T, find the parameter vector a ≠ 0 such that the ellipse f(p, a) = x^T a = 0 fits the p_i best in the least-squares sense:
  min_a Σ_(i=1..N) D(p_i, a)²
where D(p_i, a) is the distance from p_i to the ellipse.
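The Hough accumulator algorithm above can be sketched in pure Python. The quantization (1-pixel ρ bins, 1-degree θ bins) is an illustrative choice, and negative-ρ cells are skipped for brevity:

```python
import math

def hough_lines(edge_points, rho_max, n_theta=180):
    """Accumulate votes in the quantized (rho, theta) parameter space."""
    P = [[0] * n_theta for _ in range(rho_max + 1)]
    for x, y in edge_points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            if 0 <= rho <= rho_max:
                P[rho][t] += 1           # one vote per (rho, theta) cell
    return P

# Edge points on the vertical line x = 5: every point votes for the
# cell rho = 5, theta = 0, so the peak there collects all ten votes.
pts = [(5, y) for y in range(10)]
P = hough_lines(pts, rho_max=20)
print(P[5][0])   # 10
```

Step 3 of the slide algorithm then amounts to scanning P for cells whose vote count exceeds a chosen peak threshold.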
Euclidean Distance Fit
  D(p_i, a) = ||p_i − p̂_i||
where p̂_i is the point on the ellipse that is nearest to p_i:
  f(p̂_i, a) = 0, and p̂_i − p_i is normal to the ellipse at p̂_i.

Compute Distance Function
Computing the distance function is a constrained optimization problem:
  min ||p̂_i − p_i||²  subject to  f(p̂_i, a) = 0
Using a Lagrange multiplier, define
  L(x, y, λ) = ||p̂_i − p_i||² − 2 λ f(p̂_i, a)
where p̂_i = [x, y]^T. Then the problem becomes min over p̂_i of L(x, y, λ). Setting
  ∂L/∂x = ∂L/∂y = 0
we have
  p̂_i − p_i = λ ∇f(p̂_i, a)

Ellipse Fitting with Euclidean Distance
Given a set of N image points p_i = [x_i, y_i]^T, find the parameter vector a ≠ 0 such that
  min_a Σ_(i=1..N) f(p_i, a)² / ||∇f(p_i, a)||²
This problem can be solved by using a numerical nonlinear optimization method.
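The objective Σ f(p_i, a)² / ||∇f(p_i, a)||² can be evaluated directly for any candidate parameter vector. A minimal pure-Python sketch using the unit circle (a special ellipse with r1 = r2 = 1) as the model; the points are illustrative:

```python
import math

def conic_f(p, a):
    """f(p, a) = A x^2 + B x y + C y^2 + D x + E y + F."""
    x, y = p
    A, B, C, D, E, F = a
    return A*x*x + B*x*y + C*y*y + D*x + E*y + F

def grad_f(p, a):
    """Gradient of f with respect to the point p = (x, y)."""
    x, y = p
    A, B, C, D, E, F = a
    return (2*A*x + B*y + D, B*x + 2*C*y + E)

def objective(points, a):
    """Sum of approximate squared distances f^2 / |grad f|^2."""
    total = 0.0
    for p in points:
        gx, gy = grad_f(p, a)
        total += conic_f(p, a)**2 / (gx*gx + gy*gy)
    return total

# Unit circle x^2 + y^2 - 1 = 0: points on it contribute (nearly) zero.
a = (1.0, 0.0, 1.0, 0.0, 0.0, -1.0)
on = [(1.0, 0.0), (0.0, 1.0), (math.cos(1.0), math.sin(1.0))]
print(objective(on, a))            # ~0 up to floating point
print(objective([(2.0, 0.0)], a))  # 0.5625, vs. true squared distance 1
```

A nonlinear optimizer would repeatedly evaluate this objective while adjusting a; the example also shows that f/||∇f|| only approximates the true Euclidean distance away from the curve.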