Abstract
This project investigates a stitching algorithm that would make it possible to mosaic a
set of images. These images form a part of the output of a portable scanner. The project
involves reduction of noise, extraction of features, identification of matching features
(feature correspondence) and estimating the transformation parameters that describe
the motion in the image. This algorithm uses a local structure matrix based operator to
extract the points of interest or features. For noise reduction the algorithm employs a
smoothing spatial filter. From the extracted features a singular value decomposition
based technique has been applied to identify the corresponding points in the sequence
of images. Corresponding points had to be identified in order to estimate the
transformation parameters from these matching points. This project discusses a
procedure to obtain initial estimates of the transformation parameters; these
estimates could then be refined by applying a least squares solution. For this
project a data set library containing various images has been compiled which, to some
extent, replicates the possible distortions that might occur in such applications.
Observations made while implementing this algorithm reveal that the methodology
employed in order to extract features and establish a correspondence between those
features works considerably well for images subjected to small distortions. Screen
shots have been provided where applicable to assist in visualizing the implementation
of the algorithm.
Chapter 1: Algorithm Specifications
1.1 Introduction
The main aim of the project was to investigate an algorithm that would make it
possible to stitch (mosaic) a set of images. The process of stitching is considered to be
complex due to its dependency on other issues such as noise (chapter 4), feature
extraction (chapter 5), feature correspondence (chapter 6), and warping and
interpolation (chapter 7). The structures of these images are considered to be transformed under rigid
motion and these images form a part of the output of a portable scanner. The
following chapters shall describe the issues mentioned above as well as a few
additional issues in order to maintain comprehensiveness. The methodologies adopted
in developing this algorithm are based on suggestions from other researchers and are
referenced where applicable.
1.2 Computation Environment and Algorithm Validation
The implementation of the algorithm has been done using Matlab® (Release-13)
Development Environment. This is because of its features such as the Image
Processing toolbox, data visualisation (graphics) etc., which assist in speeding up the
process of algorithm development. Validation of the algorithm was done using sample
data (images) because it has not been possible to obtain real data at the time of
drafting this dissertation. This sample data is a compilation from various sources such
as the internet, flatbed scanners, and synthetic data created using image editing tools
(e.g. MS Paint).
The images under consideration are in the 24-bit colour format (.bmp). The test
criterion for the algorithm is visual interpretation, i.e. the results of the particular
operations are displayed as markers or lines superimposed on the input image.
1.3 Assumptions
1.3.1 Noise
Noise in images can interfere or occur in many forms. Some of these are salt &
pepper noise, random noise etc. Due to the unavailability of original data it has not
been possible to estimate the types of noise which may occur in this particular
application. Hence, after considering similar applications and acquisition
methods [1, 6], it is assumed that random noise is more likely in such
applications, and consequently a suitable filter is applied to reduce this form of noise
(chapter 4).
1.3.2 Image Transformations
Considering that the end application or device allows three degrees of freedom, the
acquired images could be distorted by translations, rotations or a combination of
both. Acquired data of this form is analogous to a rigid body under motion, and can
be analysed using the theory of kinematics of rigid
bodies [18].
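As a minimal sketch of this assumption, a point in one image can be mapped into the other by a rotation through an angle theta followed by a translation (tx, ty); in the image plane these three values are the three degrees of freedom. The function name and the sample values below are illustrative assumptions and are not taken from the project's Matlab source.

```python
import math

def rigid_transform(points, theta_deg, tx, ty):
    """Rotate 2-D points by theta_deg about the origin, then translate by (tx, ty)."""
    theta = math.radians(theta_deg)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

# Illustrative values only: a 90 degree rotation followed by a shift of (5, 0).
corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
moved = rigid_transform(corners, 90.0, 5.0, 0.0)
```

Because the motion is rigid, distances between points are unchanged: the corners remain one unit from the transformed origin.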
1.3.3 Images under consideration
The acquired data is stored in the memory in a 24-bit map format and hence all
sample images considered for the validation of the algorithm are in the same format
(i.e. 24-bit true colour bitmap, .bmp). For this project, the scaling factor has been
assumed to be unity (one), and the images being processed are assumed to be similar in
intensity. The maximum number of images being processed is limited to two, in
order to prove the algorithm. This algorithm could be used to process more than two
images with minor changes to the source code.
1.4 Methodology
The time spent in studying some of the IEEE publications, and other documentation
related to this subject, has provided comprehension of the project and the challenges
involved in it. With due consideration to the research done on this topic and the
suggestions from the references, a flow diagram (figure 1) has been proposed which
describes the method which is adopted in implementing the algorithm. This flow
diagram is an illustration of the main tasks involved in the algorithm. A more detailed
description is provided in the later chapters of this document, which explain the
technique and mathematics involved in developing the algorithm. In figure 1, I1 and
I2 are the input images.
1.5 Flow Diagram
I1 -> NOISE REDUCTION -> FEATURE EXTRACTION --+
                                              +--> FEATURE CORRESPONDENCE -> ALIGNMENT -> WARPING & INTERPOLATION
I2 -> NOISE REDUCTION -> FEATURE EXTRACTION --+
Figure 1, Flow diagram
1.6 Document Structure
This document is organised according to the flow diagram (figure 1), with additional
chapters that cover the development environment (chapter 3) and the
end application. Chapter 4 discusses some of the types of noise that normally occur in
images and the possibility of them interfering in the current application. Chapter 5
covers the feature extraction part of this algorithm, whereas Chapter 6 focuses on
the method of establishing the correspondence among the extracted features. Chapter 6
also discusses the image transformations and the estimation of the
transformation parameters, considering the initial assumption of rigid motion.
Warping and interpolation, which result in the final stitched image, are discussed briefly
in Chapter 7. Details of the literature referred to are provided in the References section
of the thesis. At the end of the document, appendices have been provided which may
assist with some of the applied mathematical concepts.
Chapter 8: Conclusion and Future Work
8.1 Conclusion
This project has investigated an algorithm that would make it possible to stitch
(mosaic) a set of images. The proposed algorithm reduces the noise content in the
acquired images (based on the initial assumptions, section 1.3), identifies features
from a given set of images (chapter 5), establishes a correspondence between features
from different images (chapter 6), and gives an initial estimate of the motion
parameters (section 6.2) that assist in the alignment and stitching process. These
processes have been validated using images from the data set library compiled for this
purpose (section 2.2). It is observed that the operator employed to extract features
from the images works considerably well for small rotations, i.e. for angles < 10°
(Table 1, Figures 22 – 24, 34), but its performance varies as the angle of rotation
increases, i.e. for angles > 10° (Table 1, Figures 26, 28, 29, 30, 33, 35). It has been
observed that the number of features extracted is at its maximum when the angle of
rotation is 45° (Table 1). The influence of noise on the feature extraction process has
been observed and illustrated by a bar graph (Figure 21). It was also observed that the
computation time increases as the number of features extracted increases (Table 2).
This is due to the participation of a higher number of image elements in the
correspondence establishment process.
8.2 Future Work
Although a possible algorithm has been proposed, it is required that more validation
be done for each block in Figure 1. Due to time constraints, only a few images have
been considered as test images.
The optimality of this algorithm could be confirmed by extending the validation
procedure over a wide variety of images and also to original scanner output images.
As a part of the future work, experiments can be conducted on additional techniques
for each block of the algorithm's flow diagram (Figure 1); these are
discussed in the following sections.
8.2.1 Noise Reduction
Instead of the Gaussian smoothing filter, experiments can be conducted with other
spatial filtering techniques such as the alpha-trimmed mean filter, which has the
advantage of being useful in situations where images contain more than one type of
noise, for example a combination of salt & pepper and Gaussian noise [1].
Also, adaptive filtering could be applied to reduce local noise while extracting points
of interest (features).
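The following is a minimal sketch of an alpha-trimmed mean filter as described in [1]: in each window the d/2 lowest and d/2 highest grey levels are discarded and the remainder are averaged. The function name, the 3x3 window, the choice d = 2 and the sample patch are assumptions made for illustration; the filter has not been evaluated as part of this project.

```python
def alpha_trimmed_mean(image, d=2, size=3):
    """Apply a size x size alpha-trimmed mean filter to a 2-D list of grey levels.

    In each window the d // 2 lowest and d // 2 highest values are discarded
    and the rest are averaged. Border pixels are left unchanged in this sketch.
    """
    rows, cols = len(image), len(image[0])
    half = size // 2
    trim = d // 2
    out = [row[:] for row in image]
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            window = sorted(
                image[rr][cc]
                for rr in range(r - half, r + half + 1)
                for cc in range(c - half, c + half + 1)
            )
            kept = window[trim:len(window) - trim]
            out[r][c] = sum(kept) / len(kept)
    return out

# A flat patch of grey level 10 with one "salt" pixel (255) at the centre.
noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
cleaned = alpha_trimmed_mean(noisy, d=2)
```

With d = 0 the filter reduces to the arithmetic mean, and with d = 8 on a 3x3 window it reduces to the median, which is why it can cope with mixed noise types.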
8.2.2 Feature Extraction
It has been observed that the Harris operator tends to identify a larger number of features
when images are subject to a rotation of about 45°. In order to confirm this, experiments
were initially conducted on images that were distorted using image editing tools, but
these tools induced unwanted artefacts such as extra edges and image borders. Hence,
to avoid these artefacts, sample documents were scanned using a HP-Precision Pro
Flatbed scanner while intentionally disturbing the scanning
procedure. The observations are as shown in Figures 24 – 35 (Chapter 5). Further
work can be carried out in this area in order to identify methods that can extract
features from images that are subjected to a higher angle of rotation.
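For reference, the Harris operator [6] scores each pixel from a local structure matrix built from the image derivatives Ix and Iy, using the response det(M) - k * trace(M)^2. The sketch below is a simplified illustration and not the project's implementation: it assumes central-difference gradients, an unweighted 3x3 summation window (a Gaussian window is more usual) and the commonly used value k = 0.04.

```python
def harris_response(image, k=0.04):
    """Harris corner response for a 2-D list of grey levels.

    Gradients use central differences and the structure matrix is summed over
    an unweighted 3x3 window. Border pixels keep a response of zero.
    """
    rows, cols = len(image), len(image[0])
    ix = [[0.0] * cols for _ in range(rows)]
    iy = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            ix[r][c] = (image[r][c + 1] - image[r][c - 1]) / 2.0
            iy[r][c] = (image[r + 1][c] - image[r - 1][c]) / 2.0
    response = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            a = b = d = 0.0  # entries of M = [[a, b], [b, d]]
            for rr in range(r - 1, r + 2):
                for cc in range(c - 1, c + 2):
                    a += ix[rr][cc] ** 2
                    b += ix[rr][cc] * iy[rr][cc]
                    d += iy[rr][cc] ** 2
            response[r][c] = (a * d - b * b) - k * (a + d) ** 2
    return response

# 9x9 test image: a bright quadrant whose corner sits at row 4, column 4.
image = [[10 if r >= 4 and c >= 4 else 0 for c in range(9)] for r in range(9)]
resp = harris_response(image)
```

In this synthetic image the response is positive at the corner (row 4, column 4), negative along the straight edge (for example row 6, column 4) and zero in the flat region, which is the behaviour used to separate corners from edges.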
In addition to the above-mentioned extension of the current work on feature
extraction, another area worth experimenting with is the derivative operator. For this
project a normal derivative operator has been applied (Figure 2); a Sobel operator
(Figure 3) could be applied instead. A possible advantage of the Sobel derivative
operator is its weight of 2 in the centre row or column, which provides some
smoothing by giving more importance to the centre pixel [1].
Normal derivative operator
   -1   0   1        -1  -1  -1
   -1   0   1         0   0   0
   -1   0   1         1   1   1
      (a)                (b)
Figure 2, (a), (b) x & y - directional derivative masks, respectively
Sobel operator
   -1   0   1        -1  -2  -1
   -2   0   2         0   0   0
   -1   0   1         1   2   1
      (a)                (b)
Figure 3, (a), (b) x & y - directional Sobel masks, respectively
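The difference between the two x-directional masks can be illustrated by applying both to a small synthetic step edge. This is a sketch only: the names PLAIN_X, SOBEL_X and apply_mask, and the test image, are assumptions introduced for the example.

```python
PLAIN_X = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]  # Figure 2 (a)
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]  # Figure 3 (a)

def apply_mask(image, mask):
    """Correlate a 3x3 mask with a 2-D list image; border pixels are left at 0."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r][c] = sum(
                mask[i][j] * image[r - 1 + i][c - 1 + j]
                for i in range(3)
                for j in range(3)
            )
    return out

# Vertical step edge: dark columns on the left, bright columns on the right.
step = [[0, 0, 10, 10] for _ in range(4)]
gx_plain = apply_mask(step, PLAIN_X)
gx_sobel = apply_mask(step, SOBEL_X)
```

Both masks respond to the edge, but the Sobel mask weights the centre row twice as heavily, so the row containing the pixel of interest contributes more to the result than its neighbours.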
8.2.3 Feature Correspondence
In chapter 6 a methodology to identify corresponding points from a set of features has
been discussed and the implementation procedure presented. This method works well
only for small rotations (< 10°). This might be due to the fact that some features
that are not in the overlapping region (outliers) also participate in the matching
process. For the current project, rogue features or outliers (i.e. features that do not
correspond to anything) have not been handled; hence work could be carried out in
order to identify and minimize the participation of outliers in feature matching. This
may be useful in optimizing the stitching process.
The literature by Huynth.D.Q et al and Press.W.H et al provides information about
detecting and minimizing errors induced by these outliers [14, 17].
As an alternative method of identifying possible 1:1 correspondences between the
features extracted from different images, the Random Sample Consensus (RANSAC)
algorithm could be implemented. RANSAC can be used for robust fitting of
models in the presence of many outliers [10].
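A minimal sketch of how RANSAC might be applied to candidate matches under the rigid-motion assumption is given below. It repeatedly fits a rotation and translation to two randomly chosen matches and keeps the model that agrees with the most matches. All names, the tolerance, the iteration count and the synthetic data are assumptions for illustration; this has not been implemented or validated in the project.

```python
import math
import random

def fit_rigid(pair_a, pair_b):
    """Rotation angle and translation taking two source points towards their targets."""
    (p1, q1), (p2, q2) = pair_a, pair_b
    dpx, dpy = p2[0] - p1[0], p2[1] - p1[1]
    dqx, dqy = q2[0] - q1[0], q2[1] - q1[1]
    theta = math.atan2(dpx * dqy - dpy * dqx, dpx * dqx + dpy * dqy)
    c, s = math.cos(theta), math.sin(theta)
    tx = q1[0] - (c * p1[0] - s * p1[1])
    ty = q1[1] - (s * p1[0] + c * p1[1])
    return theta, tx, ty

def residual(model, pair):
    """Distance between the transformed source point and its claimed match."""
    theta, tx, ty = model
    (x, y), (u, v) = pair
    c, s = math.cos(theta), math.sin(theta)
    return math.hypot(c * x - s * y + tx - u, s * x + c * y + ty - v)

def ransac_rigid(pairs, iterations=100, tol=0.5, seed=0):
    """Return the two-point model with the largest set of agreeing matches."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        a, b = rng.sample(pairs, 2)
        model = fit_rigid(a, b)
        inliers = [p for p in pairs if residual(model, p) < tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers

# Synthetic matches (illustrative): six follow a 5 degree rotation plus a
# shift of (20, -3); the last two are deliberately wrong matches.
c5, s5 = math.cos(math.radians(5.0)), math.sin(math.radians(5.0))
src = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 2), (3, 8)]
pairs = [((x, y), (c5 * x - s5 * y + 20.0, s5 * x + c5 * y - 3.0)) for x, y in src]
pairs += [((2, 2), (40.0, 40.0)), ((8, 1), (-15.0, 30.0))]
model, inliers = ransac_rigid(pairs)
```

The inlier set returned by such a procedure could then be passed to a least squares refinement, so that the outliers no longer influence the estimated motion parameters.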
8.2.4 Motion Parameter Estimation
Section 6.2 has described a method by which initial estimates of the motion
parameters can be obtained. Since this is only an initial estimate of the motion
parameters, a least squares approximation can be applied to minimize the error in the
estimation. This minimization could be given by equation 8.1 [15]:
$$E = \sum_{i=1}^{n} \left\| X'_i - R X_i - t \right\|^2 \qquad (8.1)$$
where,
X'_i, feature in I2,
X_i, feature in I1,
R, rotation matrix and
t, translation vector.
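For two-dimensional rigid motion, the error in equation 8.1 has a closed-form minimiser: subtract the centroids of the two point sets, take the rotation angle from the summed cross and dot products of the centred points, and then recover the translation from the centroids. The sketch below illustrates this under the assumption that the correspondences are already known and free of outliers; the function names and sample data are hypothetical.

```python
import math

def least_squares_rigid(src, dst):
    """Closed-form 2-D minimiser of sum ||dst_i - R src_i - t||^2."""
    n = len(src)
    cx = sum(x for x, _ in src) / n
    cy = sum(y for _, y in src) / n
    dx = sum(u for u, _ in dst) / n
    dy = sum(v for _, v in dst) / n
    dot = cross = 0.0
    for (x, y), (u, v) in zip(src, dst):
        x, y, u, v = x - cx, y - cy, u - dx, v - dy
        dot += x * u + y * v
        cross += x * v - y * u
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    t = (dx - (c * cx - s * cy), dy - (s * cx + c * cy))
    return theta, t

def alignment_error(src, dst, theta, t):
    """Evaluate E from equation 8.1 for a given rotation angle and translation."""
    c, s = math.cos(theta), math.sin(theta)
    return sum(
        (u - (c * x - s * y + t[0])) ** 2 + (v - (s * x + c * y + t[1])) ** 2
        for (x, y), (u, v) in zip(src, dst)
    )

# Illustrative data: features in I1 and the same features in I2 after an
# 8 degree rotation and a shift of (12, -4).
c8, s8 = math.cos(math.radians(8.0)), math.sin(math.radians(8.0))
features_i1 = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 2)]
features_i2 = [(c8 * x - s8 * y + 12.0, s8 * x + c8 * y - 4.0) for x, y in features_i1]
theta, t = least_squares_rigid(features_i1, features_i2)
error = alignment_error(features_i1, features_i2, theta, t)
```

With noise-free correspondences the recovered parameters reproduce the applied motion and E is zero to within rounding; with noisy correspondences the same procedure returns the parameters that minimise E.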
In addition to the above-mentioned extensions to the current project work, frequency
domain techniques could be explored in order to reduce the overall computational
time. Reddy and Chatterji [16] have described an FFT-based technique to estimate the
motion parameters, which is claimed to be computationally less expensive. Other
frequency domain techniques, such as the Discrete Cosine Transform (DCT) and wavelet
transforms, could also be used to reduce the overall computational cost of the algorithm.
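The translation part of FFT-based registration rests on phase correlation: a shift between two signals appears as a linear phase difference between their spectra, and the inverse transform of the normalised cross-power spectrum peaks at the shift. The sketch below shows this idea in one dimension with a direct DFT so that it needs no external library. It is an illustration of the principle only and does not cover the rotation and scale handling described in [16]; the function names and sample data are assumptions.

```python
import cmath

def dft(signal, inverse=False):
    """Direct discrete Fourier transform (O(n^2)); adequate for a short example."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(
            signal[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
            for m in range(n)
        )
        out.append(total / n if inverse else total)
    return out

def phase_correlation_shift(a, b):
    """Circular shift s with b[m] == a[(m - s) % n], found by phase correlation."""
    fa, fb = dft(a), dft(b)
    cross = []
    for x, y in zip(fa, fb):
        z = x.conjugate() * y
        # Keep only the phase; skip bins whose magnitude is numerically zero.
        cross.append(z / abs(z) if abs(z) > 1e-12 else 0)
    corr = dft(cross, inverse=True)
    return max(range(len(a)), key=lambda m: corr[m].real)

# One image row and the same row circularly shifted by three pixels.
row = [0, 1, 3, 7, 2, 0, 0, 1]
shifted = row[-3:] + row[:-3]
shift = phase_correlation_shift(row, shifted)
```

In practice a fast Fourier transform would replace the direct DFT, which is where the computational saving comes from.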
References:
[1] Gonzalez.R.C and Woods.R.E, “Digital Image Processing”, Prentice Hall, Inc.,
Second Edition, ISBN 0-201-18075-8, 2001
[2] Heung-Yeung Shum and Szeliski.R, “Panoramic image mosaics”, Technical
report, MSR-TR-97-23 (updated), 1997
[3] Eric W. Weisstein. Mathworld--A Wolfram Web Resource,
http://mathworld.wolfram.com/AffineTransformation.html
[4] Hartley.R and Zisserman.A, “Multiple view geometry in computer vision”,
Second Edition. Cambridge University Press, ISBN: 0521540518, March 2004
[5] Zappala.T, et al, University of Cambridge, “Document mosaicing”, BMVC97
proceedings, 1997
[6] Harris.C and Stephens.M, “A Combined Corner and Edge Detector”, Proc.
4th Alvey Vision Conference, Manchester, U.K., pp 147 – 151, 1988.
[7] Ramoser.H et al, “Efficient alignment of finger print images”, Pattern
Recognition, Proceedings. 16th, International Conference on, Volume: 3, 11-15 Pages:
748 - 751 Vol.3., Aug.2002
[8] Scott.G and Longuet-Higgins.H, “An algorithm for associating the features of two
patterns”, In Proc. Royal Society London, volume B244, pages 21-26, 1991.
[9] Wallis.J.W and Miller.T.R, “An Optimal Rotator for iterative Reconstruction”,
IEEE Transactions on medical imaging, Vol. 16, No.1, Feb’1997,
[10] Fischler.M.A and Bolles.R.C, “The Random Sample Consensus set: a paradigm
for model fitting with applications to image analysis and automated cartography”,
Communications of the ACM, 24(6); 381-395, 1981.
[11] Pilu.M, “Uncalibrated Stereo Correspondence by Singular Value
Decomposition”, HP Laboratories Bristol, HPL-97-96, August, 1997.
[12] Denton, J.; Beveridge, J.R.; “Two dimensional projective point matching”, Image
Analysis and Interpretation, 2002. Proceedings. Fifth IEEE Southwest Symposium
on, 7-9 April 2002 Pages: 77 – 81.
[13] Stegmann.M.B, “Image Warping”, Informatics and Mathematical Modeling,
Technical University of Denmark, Richard Petersens Plads, Building 321, DK-2800
Kgs. Lyngby, Denmark, 29th October 2001.
[14] Huynth.D.Q et al, “Outlier Detection in Video sequences under Affine
Projection”, IEEE’2001, Pg. 695 – 701.
[15] Meshoul, S.; Batouche, M.; “A fully automatic method for feature-based image
registration”, Systems, Man and Cybernetics, 2002 IEEE International Conference
on, Volume: 4, 6-9 Oct. 2002, Pages: 5 pp. vol.4.
[16] Reddy, B.S.; Chatterji, B.N; “An FFT-based technique for translation, rotation,
and scale-invariant image registration”, Image Processing, IEEE Transactions on,
Vol.:5, Issue: 8, Aug’1996, Pg: 1266 – 1271.
[17] Press.W.H et al, “Numerical Recipes in C: The Art of Scientific Computing”,
Cambridge University Press, Second Edition, ISBN 0-521-43108-5, 1992.
[18] Huang, T.S.; Netravali, A.N, “Motion and structure from feature
correspondences: A review”, Proceedings of the IEEE, Volume: 82, Issue: 2,
Feb. 1994, Pages: 252 – 268.
[19] Smith.S.W, “The Scientist and Engineer's Guide to Digital Signal Processing”,
Second Edition, California Technical Publishing, ISBN 0-9660176-6-8, 1999.
[20] Rockett.I.P, “Performance Assessment of feature detection algorithms: A
methodology and case study on corner detectors”, IEEE Trans. on Image processing,
Vol.12, No.12, December 2003, Pg. 1668 – 1676.
[21] C.A. Glasbey et al, “A Review of Image warping methods”, Journal of Applied
Statistics, 25, 155-171, 1998.
[22] Matlab Release 13, Documentation, Mathworks Inc., 2002.
[23] Stroud.K.A, “Further Engineering Mathematics”, Palgrave Macmillan, 3rd
edition, ISBN 0333657411, 1996
[24] Chanereley.A, Class handouts, Digital signal processing, School of Computing
and Technology, University of East London,