Brightness and geometric transformations

Václav Hlaváč
Czech Technical University in Prague
Center for Machine Perception (bridging groups of the)
Czech Institute of Informatics, Robotics and Cybernetics and
Faculty of Electrical Engineering, Department of Cybernetics
http://people.ciirc.cvut.cz/hlavac, hlavac@ciirc.cvut.cz
Outline of the talk:
Brightness scale transformation.
Brightness corrections.
Geometric transformations.
Image preprocessing, the intro
The input is an image, the output is an image too.
The image is not interpreted.
The aim:
• To suppress the distortion (e.g., correction of the geometric distortion caused by the spherical shape of the Earth in images taken from a satellite).
• Noise suppression.
• Contrast enhancement (which is useful only if a human looks at the image).
• Enhancement of some image features needed for further image processing, e.g., edge finding.
Taxonomy of the image preprocessing methods
Taxonomy according to the neighborhood size of the current pixel.

Neighborhood | Example of the operation | Actually processed neighborhood
Single pixel | Brightness scale transform | The same for all pixels
Single pixel | Brightness corrections | The current pixel
Single pixel | Geometric transforms | A current pixel (theoretically), a small neighborhood (practically)
Local | Local filtration | A small neighborhood
Global | Fourier transform | The whole image
Global | Image restoration | The whole image

[Figure: point, local, and global operations, each shown as a mapping from an input image to an output image.]
Brightness histogram
The brightness histogram is the estimate of the probability distribution of the
phenomenon that a pixel has a certain brightness.
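The definition above can be sketched in a few lines of plain Python. The representation of an image as a list of rows of integer brightness values, and the function names, are assumptions made for this illustration only.

```python
def brightness_histogram(image, levels=256):
    """Count how many pixels have each brightness 0..levels-1."""
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1
    return hist


def normalized_histogram(image, levels=256):
    """Relative frequencies: an estimate of the brightness probability."""
    hist = brightness_histogram(image, levels)
    total = sum(hist)
    return [count / total for count in hist]


# Tiny example image with 4 brightness levels.
image = [
    [0, 0, 1, 2],
    [1, 1, 2, 3],
    [3, 3, 3, 3],
]
hist = brightness_histogram(image, levels=4)
probs = normalized_histogram(image, levels=4)
print(hist)
print(probs)
```

Dividing the counts by the number of pixels turns the histogram into relative frequencies that sum to one, which is what makes it usable as a probability estimate.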
[Figure: input image and its brightness histogram.]
Transformation of the gray scale
g(i): input gray scale,
f(i): output gray scale,
$i = 0, \ldots, i_{max}$ (maximal brightness).
[Figure: gray scale transformation curves, output f versus input g, both axes from 0 to 255: contrast enhancement, threshold, negative.]
Notes:
• Some transformations are implemented in HW, e.g., in the display card (e.g. in the VGA mode).
• Thresholding: converts a gray scale image into a binary image. It yields two regions. How to choose an appropriate threshold?
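Because a gray scale transformation depends only on the input brightness, it can be tabulated once and applied as a lookup table. The sketch below is one possible illustration in plain Python; the threshold value 128 and the stretched input range [50, 200] are arbitrary assumptions, not values from the slides.

```python
def make_lut(func, levels=256):
    """Tabulate f(g) for every input brightness g, clipped to the valid range."""
    top = levels - 1
    return [min(top, max(0, int(round(func(g))))) for g in range(levels)]


def apply_lut(image, lut):
    """Replace every pixel g by lut[g]."""
    return [[lut[g] for g in row] for row in image]


negative = make_lut(lambda g: 255 - g)
# Threshold 128 is an arbitrary choice for the illustration.
threshold = make_lut(lambda g: 255 if g >= 128 else 0)
# Linear contrast enhancement of the assumed input range [50, 200] to [0, 255].
stretch = make_lut(lambda g: (g - 50) * 255 / (200 - 50))

image = [[0, 50, 100], [128, 200, 255]]
print(apply_lut(image, negative))
print(apply_lut(image, threshold))
print(apply_lut(image, stretch))
```

The table has only 256 entries, so the cost of evaluating the transformation is independent of the image size; this is also why such transformations are easy to implement in hardware.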
Nonlinearity in the intensity, γ correction
Why do cameras perform a nonlinear brightness transformation, the γ-correction?
In cathode-ray tube (vacuum) displays, brightness ≈ $U^{\gamma}$, where U is the grid voltage and γ is usually 2.2.
The effect is compensated in cameras by the inverse function to preserve the linear transfer characteristic in the whole transmission chain.
Having input irradiance E and the input voltage of the camera U, the inverse function is $U \approx E^{1/\gamma}$, usually $U \approx E^{0.45}$.
Although the brightness of LCD monitors depends on the input voltage linearly, the γ-correction is still in use to secure backward compatibility.
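A minimal sketch of the two power laws, assuming signals normalized to the interval [0, 1] and γ = 2.2. The function names are chosen for this illustration only.

```python
def gamma_encode(e, gamma=2.2):
    """Camera side: U = E^(1/gamma), with E normalized to [0, 1]."""
    return e ** (1.0 / gamma)


def gamma_decode(u, gamma=2.2):
    """Display side: brightness = U^gamma, with U normalized to [0, 1]."""
    return u ** gamma


def gamma_lut(gamma=2.2, levels=256):
    """Lookup table applying the camera-side correction to integer levels."""
    top = levels - 1
    return [round(top * gamma_encode(i / top, gamma)) for i in range(levels)]


encoded = gamma_encode(0.5)
restored = gamma_decode(encoded)
lut = gamma_lut()
print(encoded, restored)
```

Encoding followed by decoding returns the original value, which is the linear transfer characteristic of the whole chain mentioned above.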
Example of two thresholds
[Figure: example image thresholded with two thresholds and the corresponding brightness histogram.]
Histogram equalization
The aim is:
• To enhance the contrast for a human observer by utilizing the gray scale fully;
• To normalize the brightness of the image for the automatic brightness comparison.
[Figure: histogram of the input image and histogram after equalization.]
Increased contrast after the histogram equalization
[Figure: original image and the image with increased contrast.]
Derivation of the histogram equalization
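Input: The histogram H(p) of the input image with the gray scale $p = \langle p_0, p_k \rangle$.
Aim: to find a monotonic transform of the gray scale q = T(p), for which the output histogram G(q) will be uniform for the entire output brightness domain $q = \langle q_0, q_k \rangle$.
Because the transform is monotonic, the number of pixels is preserved:
\[ \sum_{i=0}^{k} G(q_i) = \sum_{i=0}^{k} H(p_i) . \]
The equalized histogram ≈ the uniform distribution with the constant value
\[ f = \frac{N^2}{q_k - q_0} , \]
where $N^2$ is the total number of pixels, i.e. the sum of the histogram.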
Derivation of the histogram equalization (2)
The ideal (exactly uniform) histogram is attainable only in the continuous case.
\[ \int_{q_0}^{q} G(s)\, ds = \int_{p_0}^{p} H(s)\, ds . \]
\[ N^2 \int_{q_0}^{q} \frac{1}{q_k - q_0}\, ds = \int_{p_0}^{p} H(s)\, ds . \]
\[ \frac{N^2 (q - q_0)}{q_k - q_0} = \int_{p_0}^{p} H(s)\, ds . \]
\[ q = T(p) = \frac{q_k - q_0}{N^2} \int_{p_0}^{p} H(s)\, ds + q_0 . \]
Discrete case, cumulative histogram
\[ q = T(p) = \frac{q_k - q_0}{N^2} \sum_{i=p_0}^{p} H(i) + q_0 . \]
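The discrete formula can be turned into a lookup table directly. The sketch below assumes integer brightness values starting at $p_0 = 0$ and an image stored as a list of rows; rounding the result to the nearest integer level is an additional assumption, since the slides do not say how q is quantized.

```python
def equalization_lut(hist, q0=0, qk=255):
    """Lookup table q = T(p) built from the cumulative histogram."""
    total = sum(hist)            # N^2, the number of pixels
    lut, cumulative = [], 0
    for count in hist:
        cumulative += count      # sum of H(i) for i = p0..p
        lut.append(round((qk - q0) * cumulative / total + q0))
    return lut


def equalize(image, levels=256):
    """Equalize an image whose pixels are integers in 0..levels-1."""
    hist = [0] * levels
    for row in image:
        for p in row:
            hist[p] += 1
    lut = equalization_lut(hist, 0, levels - 1)
    return [[lut[p] for p in row] for row in image]


# Tiny 4-level example in which brightness 1 dominates the input.
image = [[0, 0, 1, 1],
         [1, 1, 2, 3]]
equalized = equalize(image, levels=4)
print(equalized)
```

The table is non-decreasing because the cumulative histogram is, which is the monotonicity required of T. In the discrete case the output histogram is only approximately uniform, since pixels sharing one input level cannot be split among several output levels.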
Pseudocolor
Pseudocolor displays a grayscale image as a color image by mapping each intensity value to a color according to a lookup table or a function.
The mapping table (function) is called the palette.
Pseudocolor is often used to display single-channel data such as temperature, distance, etc.
[Figure: the same image displayed with four palettes: Unchanged, System (Windows), Spectrum, Black body.]
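A palette is simply a table with one color per intensity value. The sketch below builds a hypothetical blue-to-green-to-red palette; it is an assumption made for the illustration and does not reproduce any of the palettes named on the slide.

```python
def spectrum_palette(levels=256):
    """Hypothetical palette: blue (low) through green to red (high)."""
    palette = []
    for i in range(levels):
        t = i / (levels - 1)
        if t < 0.5:
            r, g, b = 0.0, 2 * t, 1 - 2 * t
        else:
            r, g, b = 2 * t - 1, 2 - 2 * t, 0.0
        palette.append((round(255 * r), round(255 * g), round(255 * b)))
    return palette


def pseudocolor(image, palette):
    """Map every intensity value to its (R, G, B) palette entry."""
    return [[palette[value] for value in row] for row in image]


palette = spectrum_palette()
colored = pseudocolor([[0, 255]], palette)
print(colored)
```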
Brightness corrections
The new brightness f(i, j) depends on the position (i, j) of the pixel in the input image g(i, j).
The multiplicative model of the disturbance is often considered:
f (i, j) = e(i, j) g(i, j).
Two approaches:
1. The corrective coefficients are obtained by capturing a reference (etalon) area with the known brightness c. This is used to compensate for irregular illumination. (The AGC in the camera has to be switched off when capturing the etalon area.) The captured image is
\[ f_c(i, j) = e(i, j)\, c \quad\Rightarrow\quad e(i, j) = \frac{f_c(i, j)}{c} . \]
2. The approximation of the background by some analytic surface and its
subtraction from the original image.
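The first approach can be sketched as follows, assuming the multiplicative model f(i, j) = e(i, j) g(i, j) and images stored as lists of rows. The function names and the small guard `eps` against division by zero are assumptions added for the illustration.

```python
def estimate_coefficients(reference_image, c):
    """e(i, j) = f_c(i, j) / c from an image of the etalon area of brightness c."""
    return [[fc / c for fc in row] for row in reference_image]


def correct(image, coefficients, eps=1e-12):
    """Recover g(i, j) = f(i, j) / e(i, j); eps guards against division by zero."""
    return [
        [f / max(e, eps) for f, e in zip(f_row, e_row)]
        for f_row, e_row in zip(image, coefficients)
    ]


# Synthetic check: degrade a known image, then undo the degradation.
c = 200.0
true_e = [[1.0, 0.5], [0.8, 0.25]]
true_g = [[100.0, 100.0], [50.0, 200.0]]
reference = [[e * c for e in row] for row in true_e]
captured = [[e * g for e, g in zip(er, gr)] for er, gr in zip(true_e, true_g)]

e_hat = estimate_coefficients(reference, c)
g_hat = correct(captured, e_hat)
print(g_hat)
```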
Geometric transform, a motivation in 2D
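[Figure: the transform T(x, y) maps a region in the coordinates x, y to a region in the coordinates x', y'.]
The vector transform T can be split into two components
\[ x' = T_x(x, y), \qquad y' = T_y(x, y) . \]
What are the conditions ensuring a one-to-one mapping?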
Geometric transforms are used for
• Size change, shift, rotation, slanting according to the transformation
known in advance.
• Undoing geometric distortions. The geometric distortions are often
given by examples, images.
Similar techniques are used in theoretical mechanics, robotics, and computer
graphics.
Topographic view of the digital intensity image
Recall that we considered the image f(x, y) as a landscape in the introduction lecture.
The image intensity (brightness) f(x, y) corresponds to the height at the position x, y.
A digital image provides only samples on a discrete grid illustrated as the orange arrows.
[Figure: the image f(x, y) drawn as a landscape over an 8 × 8 grid in x and y; orange arrows mark the samples, e.g. f(4,5).]
Topographic view, the geom. transformed image
The digital image provides samples in the discrete grid only (top right picture).
Consider the example: The image is shifted half a pixel in the x-axis direction (bottom right picture).
The issue: There is no image function f(x + 0.5, y) value available.
The solution: Interpolate the missing value from the neighboring image function values. Q: Why neighboring values only?
[Figure, top right: samples of the image on the 8 × 8 grid with f(4,5) marked. Bottom right: the grid shifted by half a pixel in x, with the unknown value marked by "?".]
Artistic rendering; maybe of a sampled image
Photos are from the exhibition Design Block in Prague in October 2012.
The artistic piece of work illustrates our topographic view of the image function f(x, y), which can be represented by a collection of poles.
Two needed steps due to the discrete raster
1. Transformation of pixel coordinates provides new positions of a pixel, which is calculated in continuous coordinates (real numbers) because the outcome can be off the original raster.
Issues:
• Part of the new transformed image lies off the image.
• Transformations are not invertible because some approximations were needed.
2. Approximation of the brightness function seeks the integer brightness value for the integer coordinate, which matches best the newly calculated position x', y' given as real numbers.
Transformation of pixel coordinates
The geometric transformation $x' = T_x(x, y)$, $y' = T_y(x, y)$ is often approximated by a polynomial of m-th order.
\[ x' = \sum_{r=0}^{m} \sum_{k=0}^{m-r} a_{rk}\, x^r y^k , \qquad y' = \sum_{r=0}^{m} \sum_{k=0}^{m-r} b_{rk}\, x^r y^k . \]
The formula is linear with respect to the coefficients $a_{rk}$, $b_{rk}$.
The unknown coefficients can be obtained by solving a system of linear equations, in which pairs of corresponding control points (x, y) and (x', y') are known.
If the system of linear equations were solved deterministically, only a few pairs of control points corresponding to the number of unknown parameters would suffice.
However, very many control point pairs are used in practice, which leads to an overdetermined system of linear equations. The unknown coefficients $a_{rk}$, $b_{rk}$ are estimated using the least-squares method.
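The least-squares estimate can be sketched in plain Python by forming the normal equations and solving them with Gaussian elimination. This is an illustration under stated assumptions, not the method used in the lecture: the helper names are hypothetical, the control points are synthetic, and for badly conditioned problems a QR or SVD based solver would be the safer choice.

```python
def monomials(x, y, m):
    """Terms x^r y^k with r + k <= m, in the order of the double sum."""
    return [x ** r * y ** k for r in range(m + 1) for k in range(m - r + 1)]


def solve(a, b):
    """Solve a square linear system by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def fit_transform(src, dst, m=1):
    """Least-squares estimate of a_rk (for x') and b_rk (for y')."""
    rows = [monomials(x, y, m) for x, y in src]
    n = len(rows[0])
    # Normal equations: (A^T A) coefficients = A^T targets.
    ata = [[sum(row[i] * row[j] for row in rows) for j in range(n)]
           for i in range(n)]

    def coefficients(targets):
        atb = [sum(row[i] * t for row, t in zip(rows, targets))
               for i in range(n)]
        return solve(ata, atb)

    return coefficients([p[0] for p in dst]), coefficients([p[1] for p in dst])


# Synthetic control points generated by a known affine transform (m = 1).
src = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 3)]
dst = [(1 + 2 * x + 0.5 * y, -1 - 0.5 * x + 2 * y) for x, y in src]
a, b = fit_transform(src, dst, m=1)
print(a, b)
```

For m = 1 the monomials are ordered as 1, y, x, so the returned lists hold the constant term first, then the coefficient of y, then the coefficient of x. Five control point pairs for three unknowns per coordinate make the system overdetermined, as described above.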
A basic set of 2D planar transformations
[Figure: a basic set of 2D planar transformations applied to an original shape in the x, y plane: translation, Euclidean, similarity, affine, projective. Inspiration: R. Szeliski.]
Bilinear, affine transformation of coordinates
If the geometric transformation changes slowly with respect to the position then approximating polynomials of lower order m = 2 or m = 3 suffice.
Bilinear transformation
\[ x' = a_0 + a_1 x + a_2 y + a_3 x y , \]
\[ y' = b_0 + b_1 x + b_2 y + b_3 x y . \]
Affine transformation, which is even more special, comprises practically needed rotation, translation and slant.
\[ x' = a_0 + a_1 x + a_2 y , \]
\[ y' = b_0 + b_1 x + b_2 y . \]
Homogeneous coordinates ⇒ matrix notation
Homogeneous coordinates are common in theoretical mechanics, projective geometry, computer graphics, robotics, etc.
The key idea is to represent the point in the space of the dimension increased by one.
The point $[x, y]^\top$ is expressed in the 3D vector space as $[\lambda x, \lambda y, \lambda]^\top$, where $\lambda \neq 0$.
To simplify expressions, often only one of the infinitely many forms is used: $[x, y, 1]^\top$.
Affine transformation expressed in a matrix form
The affine mapping is expressed after introducing homogeneous coordinates as
\[ \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & a_0 \\ b_1 & b_2 & b_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} . \]
Notes:
• Relation to the PostScript language.
• The complicated geometric transforms (deformations) can be approximated by tiling the image into smaller rectangular areas. The simpler transform is used to express distortions in each tile, maybe estimated from the control points.
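One practical benefit of the matrix form is that a chain of affine transformations collapses into a single 3 × 3 matrix. The sketch below illustrates this with a rotation followed by a translation; the helper names are chosen for the illustration only.

```python
import math


def matmul(a, b):
    """Product of two 3 x 3 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def rotation(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply(matrix, point):
    """Map the point (x, y) through the matrix using [x, y, 1]."""
    x, y = point
    h = [x, y, 1.0]
    xh, yh, w = (sum(matrix[i][k] * h[k] for k in range(3)) for i in range(3))
    return xh / w, yh / w   # w equals 1 for an affine matrix


# Rotate by 90 degrees about the origin, then shift by (10, 0).
combined = matmul(translation(10, 0), rotation(math.pi / 2))
result = apply(combined, (1, 0))
print(result)
```

The matrix applied first stands on the right of the product. The division by the last homogeneous coordinate is redundant for affine matrices but keeps the function usable for general projective matrices as well.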
Applying the geometric transformation
Dual expression
The input image has to be mapped by the transform T to the output image.
Two dual approaches:
• Forward mapping: $(x', y') = T(x, y)$.
• Backward mapping: $(x, y) = T^{-1}(x', y')$.
The difference between the two expressions is due to the needed brightness interpolation.
[Figure: the forward mapping T from the input image to the output image and the backward mapping $T^{-1}$ in the opposite direction.]
Forward, backward mapping – comparison
Forward mapping
• Output coordinates (x', y') can lie off the raster.
• Two or more input pixels can be mapped to the same output pixel.
• Some output pixels might not get a value assigned. Gaps occur.
Backward mapping
• For each pixel in the output image (x', y'), the position in the input image (x, y), where the gray value should be taken from, is computed using $T^{-1}$.
• No gaps will appear.
• Problem: $T^{-1}$ might not always exist.
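The difference between the two mappings shows up already on a one-row image enlarged twice. The sketch below assumes that images are lists of rows indexed as image[y][x], that coordinates are rounded to the nearest raster position, and that unassigned output pixels keep the value `None`; these conventions are chosen for the illustration only.

```python
import math


def nearest(value):
    """Round to the nearest integer raster coordinate."""
    return math.floor(value + 0.5)


def warp_forward(image, transform, out_h, out_w, fill=None):
    """Push every input pixel (x, y) to T(x, y); gaps keep the fill value."""
    output = [[fill] * out_w for _ in range(out_h)]
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            xo, yo = transform(x, y)
            xo, yo = nearest(xo), nearest(yo)
            if 0 <= xo < out_w and 0 <= yo < out_h:
                output[yo][xo] = value
    return output


def warp_backward(image, inverse, out_h, out_w, fill=None):
    """Pull a value for every output pixel (x', y') from T^-1(x', y')."""
    in_h, in_w = len(image), len(image[0])
    output = [[fill] * out_w for _ in range(out_h)]
    for yo in range(out_h):
        for xo in range(out_w):
            x, y = inverse(xo, yo)
            x, y = nearest(x), nearest(y)
            if 0 <= x < in_w and 0 <= y < in_h:
                output[yo][xo] = image[y][x]
    return output


# Enlarge a one-row image twice along x: T(x, y) = (2x, y).
image = [[1, 2]]
forward = warp_forward(image, lambda x, y: (2 * x, y), 1, 3)
backward = warp_backward(image, lambda x, y: (x / 2, y), 1, 3)
print(forward)
print(backward)
```

The forward result contains an unassigned pixel between the two mapped values, whereas the backward result has a value at every output position. The backward version needs the inverse transform, which is passed in explicitly here.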
Image function approximation
Approximation theory provides the principled answer. The continuous 2D function is estimated from the available samples.
Often, the estimated 2D function is approximated locally by a 2D polynomial.
Q: Why is the approximation by polynomials popular?
A1: A polynomial has only a few parameters. The best values of the parameters are sought using an optimization method. Polynomials have continuous derivatives. Consequently, the optimum can be derived analytically or can be computed numerically using the gradient method.
A2: In the case that the approximation polynomial parameters are estimated experimentally, there are usually many more examples than parameters. Having in mind A1, the needed derivatives can be expressed analytically.
Approximation as 2D convolution
Instead of the originally continuous image function f(x, y), its sampled variant $f_s(l, k)$ is available.
The outcome of the approximation (interpolation) is the brightness $f_n(x, y)$, where index n denotes individual interpolation methods. The brightness can be expressed as a 2D convolution
\[ f_n(x, y) = \sum_{l=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} f_s(l, k)\, h_n(x - l, y - k) . \]
Function $h_n$ is the interpolation kernel.
Often, the local approximation kernel is used, which covers only a small neighborhood of the current pixel, to save computations. The values of the kernel $h_n = 0$ outside the kernel domain.
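The convolution view can be illustrated in one dimension, where the double sum reduces to a single sum over the samples covered by the kernel. The sketch assumes a box kernel for the nearest neighbor method and a triangle kernel for linear interpolation; the function names and the `radius` parameter, which bounds the kernel domain, are assumptions made for the illustration.

```python
import math


def box(x):
    """Kernel of the nearest neighbor method: 1 on [-0.5, 0.5), else 0."""
    return 1.0 if -0.5 <= x < 0.5 else 0.0


def triangle(x):
    """Kernel of linear interpolation: 1 - |x| on [-1, 1], else 0."""
    return max(0.0, 1.0 - abs(x))


def interpolate(samples, x, kernel, radius):
    """f_n(x) = sum over l of f_s(l) h_n(x - l), restricted to the kernel domain."""
    lo = max(0, math.floor(x - radius))
    hi = min(len(samples) - 1, math.floor(x + radius))
    return sum(samples[l] * kernel(x - l) for l in range(lo, hi + 1))


samples = [0.0, 10.0, 20.0, 30.0]
value_nearest = interpolate(samples, 1.25, box, radius=0.5)
value_linear = interpolate(samples, 1.25, triangle, radius=1.0)
print(value_nearest, value_linear)
```

Only the interpolation kernel changes between the methods; the summation stays the same, which is the point of writing the approximation as a convolution.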
The nearest neighbor approximation
The point (x, y) is assigned the value of the nearest point $g_s$ in the discrete raster.
\[ f_1(x, y) = g_s(\mathrm{round}(x), \mathrm{round}(y)) . \]
[Figure: 1D interpolation kernel $h_1$, nonzero for x between -0.5 and 0.5.]
Linear interpolation
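Explores 4 points in the neighborhood of (x, y) and combines them linearly. The influence of each point in the linear combination depends on the proximity to the current point.
\[ f_2(x, y) = (1 - a)(1 - b)\, g_s(l, k) + a(1 - b)\, g_s(l + 1, k) + b(1 - a)\, g_s(l, k + 1) + ab\, g_s(l + 1, k + 1) , \]
where
\[ l = \mathrm{floor}(x) , \quad a = x - l , \qquad k = \mathrm{floor}(y) , \quad b = y - k . \]
The floor function keeps a and b in the interval [0, 1), so that all four weights are non-negative and sum to one.
[Figure: 1D interpolation kernel $h_2$, nonzero for x between -1 and 1.]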
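A minimal sketch of the four-point formula, assuming the image is stored as a list of rows so that image[k][l] holds the sample $g_s(l, k)$, and that (x, y) lies inside the raster. The integer parts are taken with floor so that the fractional parts a and b lie in [0, 1). Clamping the neighbor index at the last row and column is an assumption added to keep the example safe at the border.

```python
import math


def bilinear(image, x, y):
    """Linear interpolation from the 4 samples around (x, y)."""
    height, width = len(image), len(image[0])
    l, k = math.floor(x), math.floor(y)
    a, b = x - l, y - k
    # Clamp the right and bottom neighbors at the image border.
    l1, k1 = min(l + 1, width - 1), min(k + 1, height - 1)
    return ((1 - a) * (1 - b) * image[k][l]
            + a * (1 - b) * image[k][l1]
            + b * (1 - a) * image[k1][l]
            + a * b * image[k1][l1])


image = [[0.0, 10.0],
         [20.0, 30.0]]
center = bilinear(image, 0.5, 0.5)
edge = bilinear(image, 0.5, 0.0)
corner = bilinear(image, 1.0, 0.0)
print(center, edge, corner)
```

At integer coordinates the weights reduce to a single 1, so the function returns the sample itself.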
Bilinear interpolation
Bicubic interpolation
Interpolates locally the image function by the bicubic polynomial from sixteen points in the neighborhood.
1D interpolation kernel (for simplicity):
\[ h_3 = \begin{cases} 1 - 2|x|^2 + |x|^3 & \text{for } 0 \le |x| < 1 , \\ 4 - 8|x| + 5|x|^2 - |x|^3 & \text{for } 1 \le |x| < 2 , \\ 0 & \text{otherwise.} \end{cases} \]
[Figure: plot of the 1D kernel $h_3$ for x from -2.0 to 2.0.]
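The 1D kernel can be used in 2D by applying it along both axes, which yields the sixteen-point neighborhood. The sketch below assumes this separable use of $h_3$, an image stored as a list of rows with image[k][l] holding the sample at (l, k), and clamping of indices at the border; none of these choices is prescribed by the slide.

```python
import math


def h3(x):
    """1D cubic interpolation kernel from the slide."""
    x = abs(x)
    if x < 1:
        return 1 - 2 * x ** 2 + x ** 3
    if x < 2:
        return 4 - 8 * x + 5 * x ** 2 - x ** 3
    return 0.0


def bicubic(image, x, y):
    """Weighted sum over the 4 x 4 neighborhood of (x, y)."""
    height, width = len(image), len(image[0])
    l0, k0 = math.floor(x), math.floor(y)
    value = 0.0
    for k in range(k0 - 1, k0 + 3):
        kc = min(max(k, 0), height - 1)      # clamp row index at the border
        for l in range(l0 - 1, l0 + 3):
            lc = min(max(l, 0), width - 1)   # clamp column index at the border
            value += image[kc][lc] * h3(x - l) * h3(y - k)
    return value


constant = [[7.0] * 4 for _ in range(4)]
ramp = [[10.0 * k + l for l in range(4)] for k in range(4)]
print(bicubic(constant, 1.5, 1.5))
print(bicubic(ramp, 1, 2))
```

The kernel equals 1 at x = 0 and 0 at the other integers, so the interpolated function passes through the samples. It is negative for 1 < |x| < 2, which means the result can overshoot the range of the input values near sharp edges.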
Image enlarging, three methods comparison
Courtesy: Richard Alan Peters II, Vanderbilt University