Brightness and geometric transformations
Václav Hlaváč
Czech Technical University in Prague, Center for Machine Perception (bridging groups of the) Czech Institute of Informatics, Robotics and Cybernetics and Faculty of Electrical Engineering, Department of Cybernetics
http://people.ciirc.cvut.cz/hlavac, hlavac@ciirc.cvut.cz

Outline of the talk:
• Brightness scale transformation.
• Brightness corrections.
• Geometric transformations.

Image preprocessing, the intro (2/32)
The input is an image, the output is an image too. The image is not interpreted.
The aim:
• To suppress distortion (e.g., correction of the geometric distortion caused by the spherical shape of the Earth in satellite images).
• Noise suppression.
• Contrast enhancement (which is useful only if a human looks at the image).
• Enhancement of some image features needed for further image processing, e.g., edge finding.

Taxonomy of the image preprocessing methods (3/32)
Taxonomy according to the neighborhood size of the current pixel:

Neighborhood           | Example of the operation              | Actually processed neighborhood
Single pixel (point)   | Brightness scale transform            | The current pixel (the same transform for all pixels)
                       | Brightness corrections                | The current pixel
                       | Geometric transforms                  | The current pixel (theoretically), a small neighborhood (practically)
Local                  | Local filtration                      | A small neighborhood
Global                 | Fourier transform, image restoration  | The whole image

Brightness histogram (4/32)
The brightness histogram is an estimate of the probability distribution of the phenomenon that a pixel has a certain brightness.
(Figure: input image and its brightness histogram over the range 0–255.)

Transformation of the gray scale (5/32)
g(i) – input gray scale, f(i) – output gray scale, i = 0, …, imax (the maximal brightness).
(Figure: example transfer curves f(g) over the range 0–255 – contrast enhancement, threshold, negative.)
Notes:
• Some transformations are implemented in HW, e.g., in the display card (e.g., in the VGA mode).
• Thresholding converts a gray-scale image into a binary image. It yields two regions. How to choose an appropriate threshold?

Nonlinearity in the intensity, γ-correction (6/32)
Why do cameras perform a nonlinear brightness transformation, the γ-correction? In cathode-ray tube (vacuum) displays, brightness ≈ U^γ, where U is the grid voltage and γ is usually 2.2. The effect is compensated in cameras by the inverse function to preserve a linear transfer characteristic of the whole transmission chain. Having the input irradiance E and the voltage U produced by the camera, the inverse function is U ≈ E^(1/γ), usually U ≈ E^0.45.
Although the brightness of LCD monitors depends on the input voltage linearly, the γ-correction is still in use to secure backward compatibility.

Example of two thresholds (7/32)
(Figure: a brightness histogram illustrating the choice of two different thresholds.)

Histogram equalization (8/32)
The aim is:
• To enhance the contrast for a human observer by utilizing the gray scale fully.
• To normalize the brightness of the image for automatic brightness comparison.
(Figure: histogram of the input image and histogram after equalization.)

Increased contrast after the histogram equalization (9/32)
(Figure: original image and the image with increased contrast after equalization.)
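A minimal NumPy sketch of this equalization, assuming an 8-bit single-channel image; it applies the cumulative-histogram mapping q = T(p) that is derived on the next two slides (the function and variable names are illustrative only, not part of the lecture).

```python
import numpy as np

def equalize_histogram(img, q0=0, qk=255):
    """Histogram equalization of an 8-bit gray-scale image.

    Implements q = T(p) = (qk - q0) / N^2 * sum_{i <= p} H(i) + q0,
    where N^2 is the total number of pixels and H is the brightness histogram.
    """
    hist, _ = np.histogram(img, bins=256, range=(0, 256))  # H(p)
    cum_hist = np.cumsum(hist)                             # cumulative histogram
    n_pixels = img.size                                    # N^2 in the slide notation
    lut = (qk - q0) * cum_hist / n_pixels + q0             # the mapping T(p) as a lookup table
    lut = np.clip(np.round(lut), 0, 255).astype(np.uint8)
    return lut[img]                                        # apply T pixel-wise

# Usage (hypothetical): out = equalize_histogram(img) for an img of dtype np.uint8.
```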
Derivation of the histogram equalization (10/32)
Input: the histogram H(p) of the input image with the gray scale p = ⟨p0, pk⟩.
Aim: to find a monotonic transform of the gray scale q = T(p) for which the output histogram G(q) will be uniform over the entire output brightness range q = ⟨q0, qk⟩,

Σ_{i=0}^{k} G(q_i) = Σ_{i=0}^{k} H(p_i) .

The equalized histogram ≈ the uniform distribution with the constant value

f = N² / (qk − q0) ,

where N² is the total number of pixels (an N × N image is assumed).

Derivation of the histogram equalization (2) (11/32)
The ideal uniform histogram can be obtained exactly only in the continuous case:

∫_{q0}^{q} G(s) ds = ∫_{p0}^{p} H(s) ds ,
N² ∫_{q0}^{q} ds / (qk − q0) = ∫_{p0}^{p} H(s) ds ,
N² (q − q0) / (qk − q0) = ∫_{p0}^{p} H(s) ds ,
q = T(p) = (qk − q0)/N² ∫_{p0}^{p} H(s) ds + q0 .

In the discrete case, the integral is replaced by the cumulative histogram:

q = T(p) = (qk − q0)/N² Σ_{i=p0}^{p} H(i) + q0 .

Pseudocolor (12/32)
Pseudocolor displays a gray-scale image as a color image by mapping each intensity value to a color according to a lookup table or a function. The mapping table (function) is called the palette. Pseudocolor is often used to display single-channel data such as temperature, distance, etc.
(Figure: example palettes – system (Windows), spectrum, black body, unchanged gray scale.)

Brightness corrections (13/32)
The new brightness f(i, j) depends on the position (i, j) in the input image g(i, j). The multiplicative model of the disturbance is often considered:

f(i, j) = e(i, j) g(i, j) .

Two approaches:
1. The corrective coefficients are obtained by capturing an etalon area with a known brightness c. This is used to compensate irregular illumination. (The AGC in the camera has to be switched off when capturing the etalon area.) The captured image is
fc(i, j) = e(i, j) c  ⇒  e(i, j) = fc(i, j) / c .
2. Approximation of the background by an analytic surface and its subtraction from the original image.

Geometric transform, a motivation in 2D (14/32)
The vector transform T can be split into two components,

x′ = Tx(x, y) ,  y′ = Ty(x, y) .

What are the conditions ensuring a one-to-one mapping?
Geometric transforms are used for
• Size change, shift, rotation, slanting according to a transformation known in advance.
• Undoing geometric distortions. The geometric distortions are often given by examples, i.e., images.
Similar techniques are used in theoretical mechanics, robotics, and computer graphics.

Topographic view of the digital intensity image (15/32)
Recall that we considered the image f(x, y) as a landscape in the introductory lecture. The image intensity (brightness) f(x, y) corresponds to the height at the position (x, y). A digital image provides only samples on a discrete grid.
(Figure: the image function over the grid x, y = 1, …, 8 with the sample f(4, 5) highlighted.)

Topographic view, the geometrically transformed image (16/32)
The digital image provides samples on the discrete grid only. Consider an example: the image is shifted by half a pixel in the x-axis direction.
The issue: there is no image function value f(x + 0.5, y) available.
The solution: interpolate the missing value from the neighboring image function values.
Q: Why neighboring values only?
(Figure: the sampled grid before and after the half-pixel shift; the value at the shifted position is unknown.)

Artistic rendering, maybe of a sampled image (17/32)
Photos are from the exhibition Design Block in Prague in October 2012. The artistic piece of work illustrates our topographic view of the image function f(x, y), which can be represented by a collection of poles.
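Before the two required steps are discussed, a tiny sketch of the half-pixel shift example from slide 16, assuming linear interpolation along the x-axis (the interpolation methods themselves come at the end of the lecture); the sample values are made up for illustration.

```python
import numpy as np

# One image row sampled on the integer grid x = 0, 1, ..., 7 (made-up values).
row = np.array([10.0, 12.0, 20.0, 35.0, 40.0, 38.0, 25.0, 15.0])

# A shift by half a pixel asks for f(x + 0.5, y), which was never sampled.
# Linear interpolation estimates each missing value from its two neighbors:
shifted = 0.5 * row[:-1] + 0.5 * row[1:]   # estimates of f(x + 0.5) for x = 0, ..., 6
print(shifted[4])                          # estimate of f(4.5): 0.5*40 + 0.5*38 = 39.0
```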
Two needed steps due to the discrete raster (18/32)
1. Transformation of pixel coordinates provides the new position of a pixel, which is calculated in continuous coordinates (real numbers) because the outcome can be off the original raster. Issues: part of the transformed image lies off the image; the transformations are not exactly invertible because some approximations were needed.
2. Approximation of the brightness function seeks the integer brightness value at the integer coordinates that best matches the newly calculated position (x′, y′) given as real numbers.

Transformation of pixel coordinates (19/32)
The geometric transformation x′ = Tx(x, y), y′ = Ty(x, y) is often approximated by a polynomial of m-th order,

x′ = Σ_{r=0}^{m} Σ_{k=0}^{m−r} a_{rk} x^r y^k ,   y′ = Σ_{r=0}^{m} Σ_{k=0}^{m−r} b_{rk} x^r y^k .

The formula is linear with respect to the coefficients a_{rk}, b_{rk}. The unknown coefficients can be obtained by solving a system of linear equations in which pairs of corresponding control points (x, y) and (x′, y′) are known. If the system were solved deterministically, only as many pairs of control points as there are unknown parameters would suffice. However, very many control point pairs are used in practice, which leads to an overdetermined system of linear equations. The unknown coefficients a_{rk}, b_{rk} are then estimated by the least-squares method.

A basic set of 2D planar transformations (20/32)
(Figure: a square deformed by translation, Euclidean, similarity, affine, and projective transformations. Inspiration: R. Szeliski.)

Bilinear, affine transformation of coordinates (21/32)
If the geometric transformation changes slowly with respect to the position, then approximating polynomials of lower order, m = 2 or m = 3, suffice.
Bilinear transformation:
x′ = a0 + a1 x + a2 y + a3 x y ,
y′ = b0 + b1 x + b2 y + b3 x y .
Affine transformation, which is even more special, comprises the practically needed rotation, translation and slant:
x′ = a0 + a1 x + a2 y ,
y′ = b0 + b1 x + b2 y .

Homogeneous coordinates ⇒ matrix notation (22/32)
The key idea is to represent the point in a space whose dimension is increased by one. Homogeneous coordinates are common in theoretical mechanics, projective geometry, computer graphics, robotics, etc.
The point [x, y]⊤ is expressed in the 3D vector space as [λx, λy, λ]⊤, where λ ≠ 0. To simplify expressions, often only one of the infinitely many forms is used, [x, y, 1]⊤.

Affine transformation expressed in a matrix form (23/32)
After introducing homogeneous coordinates, the affine mapping is expressed as

[x′]   [a1  a2  a0] [x]
[y′] = [b1  b2  b0] [y]
[1 ]   [ 0   0   1] [1]

Note the relation to the PostScript language.
Notes: complicated geometric transforms (deformations) can be approximated by tiling the image into smaller rectangular areas. A simpler transform, possibly estimated from control points, is used to express the distortion in each tile.

Applying the geometric transformation, dual expression (24/32)
The input image has to be mapped by the transform T to the output image. Two dual approaches:
• Forward mapping: (x′, y′) = T(x, y).
• Backward mapping: (x, y) = T⁻¹(x′, y′).
The difference between the two expressions is due to the needed brightness interpolation.

Forward, backward mapping – comparison (25/32)
Forward mapping:
• Output coordinates (x′, y′) can lie off the raster.
• Some output pixels might not get a value assigned; gaps occur.
• Two or more input pixels can be mapped to the same output pixel.
Backward mapping:
• For each pixel (x′, y′) in the output image, the position (x, y) in the input image, where the gray value should be taken from, is computed using T⁻¹.
• No gaps will appear.
• Problem: T⁻¹ might not always exist.
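A minimal sketch of backward mapping with an affine transform in homogeneous coordinates, using nearest-neighbor sampling for the moment (brightness interpolation is treated on the following slides); the rotation example, the function name, and the gray-scale NumPy array convention are assumptions made for illustration.

```python
import numpy as np

def warp_affine_backward(img, A, out_shape):
    """Backward mapping: for every output pixel (x', y') compute
    (x, y) = A^(-1) (x', y', 1)^T and take the gray value from the input
    image (here with nearest-neighbor rounding), so no gaps appear."""
    A_inv = np.linalg.inv(A)                     # assumes the transform is invertible
    h_out, w_out = out_shape
    out = np.zeros(out_shape, dtype=img.dtype)
    for yp in range(h_out):
        for xp in range(w_out):
            x, y, _ = A_inv @ np.array([xp, yp, 1.0])
            xi, yi = int(round(x)), int(round(y))
            if 0 <= xi < img.shape[1] and 0 <= yi < img.shape[0]:
                out[yp, xp] = img[yi, xi]        # img is indexed as [row, column]
    return out

# Example transform: rotation by 30 degrees about the origin, written in the
# 3x3 homogeneous-coordinate form of slide 23.
t = np.deg2rad(30)
A = np.array([[np.cos(t), -np.sin(t), 0.0],
              [np.sin(t),  np.cos(t), 0.0],
              [0.0,        0.0,       1.0]])
```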
Image function approximation (26/32)
The principally correct answer is provided by approximation theory: the continuous 2D function is estimated from the available samples. Often, the estimated 2D function is approximated locally by a 2D polynomial.
Q: Why is the approximation by polynomials popular?
A1: A polynomial has only a few parameters. The best values of the parameters are sought using an optimization method. Polynomials have continuous derivatives; consequently, the optimum can be derived analytically or computed numerically using the gradient method.
A2: In the case that the approximation polynomial parameters are estimated experimentally, there are usually many more examples than parameters. Having A1 in mind, the needed derivatives can be expressed analytically.

Approximation as 2D convolution (27/32)
Instead of the originally continuous image function f(x, y), its sampled variant fs(l, k) is available. The outcome of the approximation (interpolation) is the brightness fn(x, y), where the index n denotes the individual interpolation method. The brightness can be expressed as a 2D convolution

fn(x, y) = Σ_{l=−∞}^{∞} Σ_{k=−∞}^{∞} fs(l, k) hn(x − l, y − k) .

The function hn is the interpolation kernel. Often, a local approximation kernel covering only a small neighborhood of the current pixel is used to save computations; the kernel values are hn = 0 outside the kernel domain.

The nearest neighbor approximation (28/32)
The point (x, y) is assigned the value of the nearest point gs in the discrete raster,

f1(x, y) = gs(round(x), round(y)) .

(Figure: the 1D interpolation kernel h1, a box of width 1 centered at 0.)

Linear interpolation (29/32)
Explores four points in the neighborhood of (x, y) and combines them linearly. The influence of each point in the linear combination depends on its proximity to the current point,

f2(x, y) = (1 − a)(1 − b) gs(l, k) + a(1 − b) gs(l + 1, k) + b(1 − a) gs(l, k + 1) + a b gs(l + 1, k + 1) ,

where l = ⌊x⌋, a = x − l, k = ⌊y⌋, b = y − k.
(Figure: the 1D interpolation kernel h2, a triangle over the interval (−1, 1).)

Bilinear interpolation (30/32)
(Figure: illustration of bilinear interpolation.)

Bicubic interpolation (31/32)
Interpolates the image function locally by a bicubic polynomial from sixteen points in the neighborhood. The 1D interpolation kernel (shown for simplicity) is

h3(x) = 1 − 2|x|² + |x|³           for 0 ≤ |x| < 1 ,
        4 − 8|x| + 5|x|² − |x|³    for 1 ≤ |x| < 2 ,
        0                          elsewhere.

(Figure: the 1D bicubic kernel h3 on the interval (−2, 2).)

Image enlarging, three methods comparison (32/32)
(Figure: an image enlarged by the three interpolation methods discussed. Courtesy: Richard Alan Peters II, Vanderbilt University.)
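As a closing sketch, image enlargement by backward mapping combined with the linear (bilinear) interpolation formula f2 from slide 29; the function name, the scale factor, and the assumption of a gray-scale NumPy array are illustrative choices, not part of the lecture.

```python
import numpy as np

def enlarge_bilinear(img, scale):
    """Enlarge a gray-scale image using backward mapping and the formula
    f2(x, y) = (1-a)(1-b) g(l, k) + a(1-b) g(l+1, k)
             + b(1-a) g(l, k+1) + a b g(l+1, k+1),
    with l = floor(x), a = x - l, k = floor(y), b = y - k."""
    h, w = img.shape
    g = img.astype(float)
    H, W = int(round(h * scale)), int(round(w * scale))
    out = np.empty((H, W), dtype=float)
    for yp in range(H):
        for xp in range(W):
            # Backward mapping of the output pixel into the input image.
            x = min(xp / scale, w - 1.0)
            y = min(yp / scale, h - 1.0)
            l = min(int(np.floor(x)), w - 2)     # keep l+1, k+1 inside the image
            k = min(int(np.floor(y)), h - 2)
            a, b = x - l, y - k
            out[yp, xp] = ((1 - a) * (1 - b) * g[k, l]
                           + a * (1 - b) * g[k, l + 1]
                           + b * (1 - a) * g[k + 1, l]
                           + a * b * g[k + 1, l + 1])
    return out

# The nearest-neighbor variant would simply round x and y instead of forming the
# weighted sum, which is what produces the blocky result in such comparisons.
```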