IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 9, NO. 9, SEPTEMBER 2000
A Wavelet-Based Coarse-to-Fine Image Matching
Scheme in A Parallel Virtual Machine Environment
Jane You, Member, IEEE, and Prabir Bhattacharya, Senior Member, IEEE
Abstract—We present a wavelet-based, high performance, hierarchical scheme for image matching which includes 1) dynamic
detection of interesting points as feature points at different levels
of subband images via wavelet transform, 2) adaptive thresholding
selection based on compactness measures of fuzzy sets in image feature space, and 3) a guided searching strategy for the best matching
from coarse level to fine level. In contrast to the traditional parallel approaches which rely on specialized parallel machines, we explored the potential of distributed systems for parallelism. The proposed image matching algorithms were implemented on a network
of workstation clusters using parallel virtual machine (PVM). The
results show that our wavelet-based hierarchical image matching
scheme is efficient and effective for object recognition.
Index Terms—Feature detection, guided matching, Hausdorff
distance, image matching, image pyramid, parallel algorithms,
parallel virtual machine (PVM), wavelet transform.
I. INTRODUCTION
IMAGE matching, which measures the degree of similarity
between two image sets that are superimposed on one another, plays an important role in many areas such as pattern
recognition, image analysis and computer vision. The images to
be matched are required to go through a number of operations
before the similarity is determined. These operations include
feature extraction, distance transformation, similarity measurement and searching for the best match. Thus, an effective approach to image matching must address the following key issues: What kinds of features are used for matching? What is the
criterion for the best match? How is the best match found? (See
[20] for a survey up to 1993.)
Based on the level of image feature extraction, the matching
algorithms developed in the past can be divided into three categories: pixel-based methods, low-level feature based methods
and high-level feature based methods. At the lowest level, image
pixel gray scale values are directly used by means of correlation
techniques to determine parallaxes or to measure coordinates of
sampled points in remote sensing [12], [19]. At the next level,
Manuscript received November 12, 1998; revised March 15, 2000. This work
was supported by a grant from the U.S. Ballistic Missile Defense Organization
(BMDO) through the DEPSCOR program with matching funds from the University of Nebraska-Lincoln (including the CCIS and CSCE) and by the Australian Research Council under Grant ARC Small 9606. The associate editor
coordinating the review of this manuscript and approving it for publication was
Prof. Kannan Ramchandran.
J. You is with the School of Computing and Information Technology, Griffith University, Queensland, Australia 4111 and also with the Department of
Computing, Hong Kong Polytechnic University, Kowloon, Hong Kong (e-mail:
you@cit.gu.edu.au; csyjia@comp.polyu.edu.hk).
P. Bhattacharya is with the Department of Computer Science and Engineering, University of Nebraska, Lincoln, NE 68588-0115 USA (email:
prabir@cse.unl.edu).
Publisher Item Identifier S 1057-7149(00)06978-5.
image features such as points, lines and regions are extracted
from the image for the matching. Different techniques and
systems have been developed for various applications such as
object/terrain reconstruction [1], [18], moving objects tracking
[16], [34], and automatic map interpretation/revision [14]. At
the highest level more information about the interrelationships
between the features are also used for matching. In general the
topological and geometrical relations between features contain
important information to constrain the large space of possible
mappings between the features in three-dimensional (3-D)
object recognition, location [3], and navigation [35].
Although the pixel-based methods are simple to implement,
they are very sensitive to any changes between images and
cannot identify the same structures in images from
different sensors. On the other hand, the high-level matching
methods are much less sensitive to these changes; however, in most
cases the extraction and representation of the relationships
is itself a difficult problem. In the past, a number of matching algorithms have been developed based on edge detection and the distance transform. Typical examples are Chamfer matching
[2], Borgefors’ hierarchical Chamfer matching [5] and Huttenlocher’s Hausdorff distance matching [22]. All of these algorithms involve the following steps in sequence: detecting edge
points, converting the original gray-scale images to binary edge
images, applying distance transform on the binary images, and
finally measuring the similarity between the template image and
the target image (including part of the target image).
Chamfer matching was first proposed by Barrow et al. [2]
for finding the best fit of edge points from two different images based on distance transform and distance minimization.
This method focused on describing how images featured by
their edge points can be geometrically disordered in relation to
one another by transformation in terms of a set of parametric
equations. Obviously, the application of such a technique is limited because a good start hypothesis of the transformation is required to map the edges. Nevertheless, this method shows its
ability to handle imperfect data and is useful for noisy and disordered images. Consequently, Borgefors [5] extended Chamfer
matching by introducing a new matching criterion and a resolution pyramid for edge matching, where the so-called root mean
square average was proposed as the measure of correspondence
between the objects to be matched. In order to speed up the
operation, the matching is performed in a multi-resolution series of images rather than in the original image resolution so
that the results at the low resolution will guide the computations at finer levels. Although this algorithm can be applied to
different matching tasks such as image to image matching, template to image matching and image to symbolic image matching,
1057–7149/00$10.00 © 2000 IEEE
a drawback is that the object must be represented by a polygonal
sketch so as to conduct matching by distance transform. Huttenlocher et al. [22] studied the possibility of comparing images in
terms of Hausdorff distance, where the Hausdorff distance is
viewed as a function of the translation of a model with respect
to an image. Their test results show that the Hausdorff distance
transform is powerful for shape comparison, more tolerant of
perturbations in the locations of pixel points and applicable to
aerial images. However, the direct comparison method based
on the Hausdorff distance is time-consuming because it considers every possible translation of the model within the given
test image. Even though Huttenlocher et al. [22] have developed some good speed-up techniques for the computation of the
Hausdorff distance in matching, further improvement is highly
desirable. To our knowledge, there has not been a comprehensive wavelet-based image matching scheme so far despite some
work reported in [39].
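To make the edge-based pipeline above concrete, the following is a minimal sketch of a two-pass 3-4 chamfer distance transform followed by a root mean square average of the distance values hit by a template, in the spirit of Borgefors' measure. The function names, the (row, column) point convention and the toy data are assumptions made for illustration; this is not the code used in the cited work.

```python
import math

def chamfer_34(edges):
    """Two-pass 3-4 chamfer distance transform of a binary edge map.

    edges[r][c] is truthy on edge pixels. The result holds, per pixel, an
    integer approximation of 3x the distance to the nearest edge pixel.
    """
    rows, cols = len(edges), len(edges[0])
    big = 10 ** 9
    dist = [[0 if edges[r][c] else big for c in range(cols)] for r in range(rows)]

    def relax(r, c, neighbours):
        for dr, dc, w in neighbours:
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                dist[r][c] = min(dist[r][c], dist[rr][cc] + w)

    # Forward pass uses neighbours already visited in raster order.
    for r in range(rows):
        for c in range(cols):
            relax(r, c, ((0, -1, 3), (-1, 0, 3), (-1, -1, 4), (-1, 1, 4)))
    # Backward pass uses the mirrored neighbours.
    for r in range(rows - 1, -1, -1):
        for c in range(cols - 1, -1, -1):
            relax(r, c, ((0, 1, 3), (1, 0, 3), (1, 1, 4), (1, -1, 4)))
    return dist

def rms_average(dist, template_points, offset):
    """Root mean square average of the distance values hit by the template,
    divided by 3 to undo the unit step of the 3-4 transform."""
    dr, dc = offset
    values = [dist[r + dr][c + dc] for r, c in template_points]
    return math.sqrt(sum(v * v for v in values) / len(values)) / 3.0
```

A lower value indicates a better fit of the template edge points at the given offset; a perfect overlay gives zero.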
Comparing a template image to a larger target image usually
requires that the matching algorithm assigns a value as to how
good the match is for every possible overlay position of the
template image over the target image. The position with the
best matching measurement such as the lowest distance is then
regarded as being the position of the best match. Although
this method can be highly accurate, depending on the adopted
matching algorithm, it is computationally expensive to search
for every possible combination position of the template window
within the target image. In our previous work [45], [47], we
extended the existing image matching algorithms [5], [22]
by introducing a guided searching scheme to quickly lead a
matching process to the most likely portion of the target image.
This scheme can speed up the matching process, especially
when the size ratio between a target image and a template
image is big. Apart from the guided hierarchical searching,
we replaced edge points with interesting points as image
feature points to reduce the redundant information because the
matching speed is proportional to the number of points which
need to be matched. In this paper, we propose a wavelet-based
hierarchical image matching scheme to achieve efficiency and
effectiveness, which includes
1) applying a wavelet transform to decompose an image into
different levels of subband images;
2) detecting interesting points as feature points at different
levels;
3) introducing a guided searching procedure to search for
the best matching from coarse level to fine level.
It should be pointed out that thresholding plays an important role in our approach to detecting interesting points at different
levels of subband images. A number of approaches to thresholding, including global, local and dynamic methods, have been
proposed in the past. However, when the regions in an image are
ill-defined (i.e., fuzzy), some ideas based on fuzzy sets have been
proposed by assuming the segments to be fuzzy subsets of the
image. Fuzzy geometric properties defined by Rosenfeld [40]
are useful for threshold selection. Pal and Rosenfeld [38]
proposed a method to extract the fuzzy segmented version of
an ill-defined image by minimizing compactness and fuzziness
of the image in both the intensity and spatial domains. The advantages of such an algorithm have been demonstrated by considering the ambiguity in grey level through the concepts of index
of fuzziness, entropy and index of nonfuzziness (crispness). We
further extended this fuzzy compactness approach to determine
interesting points in the image feature domain.
The demand for higher performance, lower cost, and sustained productivity has inspired an ever-increasing interest in
parallel processing for both high-performance scientific computing and more “general-purpose” applications. The past several years have witnessed two major developments in parallel
computing: massively parallel processors (MPP’s) and the wide
application of distributed computing. Although MPP’s provide
excellent computational power by combining a few hundred to
a few thousand CPUs in a single large cabinet connected to hundreds of gigabytes of memory, their applications are restricted
because of the high costs, typically more than US$10 million
for large MPP’s. In addition, specific software development
such as new programming paradigms, languages, scheduling
and partitioning techniques, and algorithms is required to fully
exploit the power of these massively parallel machines. In contrast, distributed computing offers a significantly lower-cost
way to solve computationally intensive problems, running them more efficiently
on a cluster of existing computers than on an individual computer. Distributed computing is a process whereby a set of computers connected via a network are used collectively to complete a single complicated task. As advances in
computer networking have made very high-speed networks
(gigabit/s) available, more and more organizations have interconnected
many general-purpose workstations through such networks. It is
uncommon for distributed-computing users to achieve the raw
computational capacity of a large MPP; however, it is feasible to
solve grand challenge problems by combining a variety of distributed computing resources, linked by high-speed networks,
with flexibility, stability, scalability, and at low cost.
In this paper, we explored the potential of parallel vision
computing on a network of workstation clusters. We adopted
a divide-and-conquer policy to implement the proposed hierarchical image matching in a parallel virtual machine (PVM) computing environment [15], where a complex task is divided into
a number of sub-tasks and those sub-tasks are later reorganized
into clusters according to granularity before being mapped on
computers for simultaneous operation. There are several standards for distributed computing including PVM, P4, Express,
MPI, and Linda [13], [15]. A comprehensive summary of the
recent development of distributed system design is presented in
[44], where PVM is viewed as the existing de facto standard
for distributed computing and MPI is being considered as the
future message passing standard. PVM is a popular message-passing
software package. It allows programmers to exploit distributed
computing across a wide variety of computer types, including
MPP’s. In other words, PVM makes a heterogeneous network
of computers appear as one single concurrent computational
engine, a large virtual machine, as its name suggests. Because
of its virtual machine concept and its simple but complete programming interface, the PVM system has been widely accepted
in the high-performance scientific computing community. As
summarized in [15], the PVM computing model is
simple but very general, with a straightforward programming interface that allows simple program structures to be implemented in an intuitive manner. Each application consists of
a collection of cooperating tasks which access PVM resources
through a library of standard interface routines. These routines
allow the initiation and termination of tasks across the network
as well as communication and synchronization between tasks.
The content of this paper is organized as follows: Section II
describes the detection of interesting points as feature points
through a dynamic thresholding scheme based on the compactness measures of fuzzy sets. Section III details an image
feature hierarchy by means of the wavelet transform and Section IV
highlights a guided image matching scheme based on the Hausdorff distance. The parallel implementation and performance
evaluation are summarized in Sections V and VI, respectively. Finally, the conclusion is presented in Section VII.
II. IMAGE FEATURE EXTRACTION
An important issue in feature-based image matching is
concerned with how to extract image feature points to represent
the original image effectively. This section presents a dynamic
threshold scheme to determine interesting points as feature
points based on the compactness measures of fuzzy sets.
A. Interesting Points, Not Edge Points
Most matching algorithms are based on binary images to
identify the object(s) of interest. Therefore, the original image,
whether greyscale or color, should be converted into a
binary image. The existing matching algorithms [2], [5], [22]
all applied edge detection to convert an original image into
a binary image. However, the redundant information in a large
number of edge pixels limits the possible speedup in subsequent operations since the time consumption of a matching
algorithm is directly related to the number of feature points
used in matching. The more points there are, the more time is
required. Therefore, it is desirable to use an algorithm which
only extracts those feature points that
• are representative and distinctive, such as corners, and
without redundancy;
• are robust to “noise.”
According to Haralick and Shapiro [20], ideally, the interesting points should satisfy the conditions of
1) distinctness;
2) invariance;
3) stability;
4) uniqueness;
5) interpretability.
Fig. 1. Comparison of image feature extraction operators.
A number of interesting point operators have been developed
(see [20] for a survey); these include, e.g., the Plessey operator
[36] and the Moravec operator [34]. The detection of interesting
points is based on a measure of how interesting a point is.
What counts as interesting depends on the application. In order to reduce the number of points
during matching while still preserving the features of the original
image, such an interesting point should be distinguishable from
its immediate neighbors, which excludes points sitting on the same
edge. In general, the operation of the existing interesting point
operators can be summarized as a three-step procedure which is
clearly highlighted in [20].
• Selection of optimal windows. The selection is based on
the average gradient magnitude within a window of prespecified size. Search for local maxima, while suppressing
windows on edges, guarantees (local) distinctness. The
measure used is also invariant with respect to rotation.
• Classification of the image function within the selected
windows. The classification distinguishes between types
of singular points such as corners, rings, spirals, and even
isotropic texture based on a statistical test.
• Estimation of the optimal point within the window as the
classification. The estimation is precise for corners and for
the centers of circular symmetric features or spirals.
Moravec [34] suggested that a point is considered interesting
if it has a local maximum of minimal sums of directional variances, while the principle underlying the Plessey operator is to
evaluate a given ratio to mark the detected pixel based on its
first-difference approximations to the partial derivatives within
a local window. Fig. 1 shows the comparison of edge points and
interesting points to represent the original image. Fig. 1(a) is
a histogram equalized image ranging from
0 to 255 in gray scale, Fig. 1(b) shows edge points detected
by the Prewitt operator, Fig. 1(c) shows interesting points detected
by the Moravec operator, and Fig. 1(d) shows interesting points detected by the Plessey operator. Since the complexity of computation
for matching depends on the number of image feature pixels, it is clear that
interesting points detected by the Plessey operator are more suitable
for matching because they preserve the image features with fewer
points.
Fig. 2. Comparison of feature pixels in a textured image.
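Moravec's criterion, a local maximum of the minimal directional variance, can be sketched as follows. The window size, the four shift directions and the 3x3 non-maximum suppression are assumptions chosen for brevity rather than the settings used to produce Fig. 1.

```python
def moravec_interest(img, half=1):
    """Interest value per pixel: the minimum, over four shift directions, of the
    sum of squared differences between a window and its shifted copy."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    shifts = ((0, 1), (1, 0), (1, 1), (1, -1))
    margin = half + 1  # keeps every shifted window inside the image
    for r in range(margin, rows - margin):
        for c in range(margin, cols - margin):
            sums = []
            for dr, dc in shifts:
                s = 0
                for i in range(-half, half + 1):
                    for j in range(-half, half + 1):
                        d = img[r + i + dr][c + j + dc] - img[r + i][c + j]
                        s += d * d
                sums.append(s)
            out[r][c] = min(sums)
    return out

def local_maxima(interest, threshold):
    """Pixels whose interest exceeds the threshold and is maximal in a 3x3 neighbourhood."""
    rows, cols = len(interest), len(interest[0])
    points = []
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            v = interest[r][c]
            if v > threshold and all(v >= interest[r + i][c + j]
                                     for i in (-1, 0, 1) for j in (-1, 0, 1)):
                points.append((r, c))
    return points
```

Because the minimum over directions is taken, a pixel on a straight edge scores zero (shifting along the edge changes nothing), whereas a corner scores high in every direction. This is why such operators return far fewer points than an edge detector.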
The conventional interesting point detection techniques mentioned above were based on the detection of discontinuities in
pixel properties. Thus they are not suitable for textured images because texture is characterized by its local features over
some neighborhood rather than in pixel gray scale. We further
extended the concept of interesting point for general-purpose
images. Instead of applying interesting point operators to the
original gray-scale image, the detection of interesting points
was performed on the feature image converted from the original image. For the case of textured images, we adopted Laws’
texture energy measurement [27] to convert a textured image to
a texture energy image for the detection of interesting points.
Fig. 2 compares textural image feature pixels represented by
edge points, texture energy, interesting points and texture interesting points respectively. Fig. 2(a) is a template image and
Fig. 2(b) is a two-texture composite image to be processed,
where the background is featured by Brodatz texture raffiad84
[6] and the object in the shape of the coast line of Australia
is filled with Brodatz texture pigskind92 [6]. Fig. 2(c) demonstrates edge points detected by the Prewitt operator, Fig. 2(d) represents the texture energy image using Laws’ R5R5 mask, Fig. 2(e)
shows interesting points detected by the Moravec operator on the
original image shown in Fig. 2(b), and Fig. 2(f) presents texture interesting points detected by the Plessey operator on the texture energy image shown in Fig. 2(d). The texture interesting
points represent the original image best, and with fewer points.
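As an illustration of the texture energy conversion mentioned above, here is a minimal sketch that filters an image with the 5x5 R5R5 mask (the outer product of Laws' ripple vector with itself) and then sums absolute responses over a local window. The window half-width `half` and the zero-filled border are assumptions; the paper does not state the settings it used.

```python
R5 = (1, -4, 6, -4, 1)  # Laws' 1-D "ripple" vector

def texture_energy_r5r5(img, half=2):
    """Texture energy image: |R5R5 response| summed over a (2*half+1)^2 window."""
    rows, cols = len(img), len(img[0])
    mask = [[a * b for b in R5] for a in R5]  # 5x5 mask as an outer product
    response = [[0] * cols for _ in range(rows)]
    # Filter only where the 5x5 mask fits; the border response stays zero.
    for r in range(2, rows - 2):
        for c in range(2, cols - 2):
            response[r][c] = sum(mask[i][j] * img[r + i - 2][c + j - 2]
                                 for i in range(5) for j in range(5))
    energy = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for rr in range(max(0, r - half), min(rows, r + half + 1)):
                for cc in range(max(0, c - half), min(cols, c + half + 1)):
                    energy[r][c] += abs(response[rr][cc])
    return energy
```

The mask coefficients sum to zero, so uniform regions produce no energy while fine, rapidly alternating texture produces a strong response. An interesting point operator can then be applied to the energy image instead of the gray-scale image.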
B. Optimal Thresholding via Fuzzy Compactness
The identification of interesting points is based on the selection of a threshold. When a feature image is regarded as a fuzzy
set of a given size and number of levels of feature measurements, we
extend Pal and Rosenfeld’s fuzzy compactness approach [38] to
determine the optimal threshold value by minimizing the measurement of fuzziness. They introduced the index of fuzziness
to reflect the average amount of ambiguity (fuzziness) present
in an image by measuring the distance between its fuzzy property and the nearest two-level property. A linear index
of fuzziness is defined as follows:
(1)
(2)
(3)
where the nearest two-level version of the fuzzy property is defined piecewise, taking one value if a condition holds and another otherwise.
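To make the index concrete, the sketch below follows the commonly cited Pal and Rosenfeld formulation: a standard S-function membership defined by a cross-over point and a bandwidth, and a linear index of fuzziness computed from a histogram as (2/n) times the sum over levels of min(mu, 1 - mu) weighted by the level count. The function names and the choice of S-function are assumptions, since the symbols of Eqs. (1)-(3) are not legible here. The candidate cross-over point with the smallest index is taken as the threshold.

```python
def s_function(x, a, b, c):
    """Standard S-function with cross-over point b and a = b - bw, c = b + bw."""
    if x <= a:
        return 0.0
    if x <= b:
        return 2.0 * ((x - a) / (c - a)) ** 2
    if x <= c:
        return 1.0 - 2.0 * ((x - c) / (c - a)) ** 2
    return 1.0

def linear_index_of_fuzziness(hist, crossover, bandwidth):
    """(2 / n) * sum over levels of min(mu, 1 - mu) * count, n = total count."""
    a, c = crossover - bandwidth, crossover + bandwidth
    total = sum(hist)
    acc = 0.0
    for level, count in enumerate(hist):
        mu = s_function(level, a, crossover, c)
        acc += min(mu, 1.0 - mu) * count
    return 2.0 * acc / total

def select_threshold(hist, bandwidth):
    """Scan cross-over points between the lowest and highest occupied levels
    and return the one with the smallest linear index of fuzziness."""
    occupied = [level for level, count in enumerate(hist) if count > 0]
    lo, hi = occupied[0], occupied[-1]
    return min(range(lo, hi + 1),
               key=lambda b: linear_index_of_fuzziness(hist, b, bandwidth))
```

For a bimodal histogram the minimum falls in the valley between the modes, because few measurements lie near the cross-over point where the membership is most ambiguous (close to 0.5).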
In contrast to the conventional measurement of fuzziness in a
gray scale image detailed above, we extended this approach by
applying the distance measurement to the feature image rather than
to the gray scale image. Accordingly,
the fuzzy set now refers to the feature set, and its values represent a particular feature
measurement. Instead of processing the gray scale image, we
consider the minimization of fuzziness in the image features to select
an appropriate threshold value for feature point detection. The
algorithm is summarized as follows.
• Step 1: Convert the gray scale image to a feature image by
its interestingness measurement, and record the maximum and minimum values of the feature measurement.
• Step 2: Construct the “feature image” membership according to Eqs. (4)–(8),
with a cross-over point and a bandwidth.
• Step 3: Obtain the linear index of fuzziness of the feature image by computing Eq. (9),
where one factor is given by Eq. (10) and the other denotes the number of occurrences of each level.
• Step 4: Vary the cross-over point from the minimum to the maximum value and choose the value
corresponding to the minimum of the index of fuzziness.
The advantage of using fuzzy sets for optimal thresholding
in image feature detection is demonstrated in Fig. 3 by comparing the histogram distributions of the selected features. For the
given image shown in Fig. 2(b), Fig. 3(a) is the gray scale distribution and Fig. 3(b) is the texture energy distribution. Our test
results confirm that the histograms of feature distribution (e.g.,
textures) provide the basis for optimal thresholding for feature
detection.
Fig. 3. (a) Histogram of gray-scale distribution and (b) histogram of texture
energy distribution.
III. A WAVELET-BASED IMAGE FEATURE HIERARCHY
A key issue in hierarchical image matching is how to create
an image feature pyramid for matching. Unlike most of the existing matching algorithms, which detect edge points and perform matching in a resolution pyramid, we apply a wavelet
transform to decompose an image into a series of subband images and replace edge points with interesting points for an image
feature hierarchy.
A. Discrete Wavelet Transform
Wavelet transforms offer the promise of compact representation and efficient detection of image components that match
the wave-shape of the chosen wavelet [31]. Given a set of orthonormal wavelets, the wavelet series expansion of a
band limited continuous function is given by Eq. (11),
where the coefficients are determined by Eq. (12). Similarly, the discrete transform of the sampled function is
defined by Eq. (13), where the coefficients are given by Eq. (14).
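For reference, a common textbook form of an orthonormal wavelet series expansion and its discrete counterpart is given below. The symbols $f$, $\psi_{j,k}$ and $c_{j,k}$ are assumed notation for real-valued dyadic wavelets and may differ from the typesetting of Eqs. (11)–(14).

```latex
\begin{align*}
  \psi_{j,k}(x) &= 2^{j/2}\,\psi\!\left(2^{j}x - k\right), \qquad j, k \in \mathbb{Z},\\
  f(x) &= \sum_{j}\sum_{k} c_{j,k}\,\psi_{j,k}(x),
  &\quad c_{j,k} &= \int_{-\infty}^{\infty} f(x)\,\psi_{j,k}(x)\,dx,\\
  f(n) &= \sum_{j}\sum_{k} c_{j,k}\,\psi_{j,k}(n),
  &\quad c_{j,k} &= \sum_{n} f(n)\,\psi_{j,k}(n).
\end{align*}
```

In both cases the coefficients are inner products of the signal with the basis functions, which is what makes the expansion invertible for an orthonormal family.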
Fig. 4. DWT image decomposition step.
The concept of the above one-dimensional wavelet representation can be extended to two dimensions easily [8]. Considering
the case of unitary image transforms, we assume that the two-dimensional (2-D) scaling function is separable, that is,
it is the product of a one-dimensional (1-D) scaling function evaluated along each axis, Eq. (15). If
the 1-D scaling function has a companion wavelet expressed by Eqs. (16)–(20),
then the three 2-D basic wavelets
establish the foundation for a 2-D wavelet transform. Fig. 4
shows the DWT image decomposition step and Fig. 5 illustrates
subimages after three-level DWT image decomposition.
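A single separable decomposition step can be sketched with the Haar wavelet, the simplest orthonormal choice. The paper does not state which wavelet it used, so Haar is an assumption here, as are the function names and the subband labels. Rows are filtered and downsampled first, then columns, which yields one approximation subimage and three detail subimages.

```python
def haar_step(values):
    """One level of the orthonormal Haar transform of an even-length sequence."""
    s = 2 ** 0.5
    pairs = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
    approx = [(a + b) / s for a, b in pairs]
    detail = [(a - b) / s for a, b in pairs]
    return approx, detail

def transpose(block):
    return [list(col) for col in zip(*block)]

def haar_2d(img):
    """One decomposition step: filter rows, then columns.

    Returns (ll, lh, hl, hh): ll is the half-size approximation; the other
    three are detail subbands (naming conventions vary between texts).
    """
    low, high = [], []
    for row in img:
        a, d = haar_step(row)
        low.append(a)
        high.append(d)

    def split_columns(block):
        approx_cols, detail_cols = [], []
        for col in transpose(block):
            a, d = haar_step(col)
            approx_cols.append(a)
            detail_cols.append(d)
        return transpose(approx_cols), transpose(detail_cols)

    ll, lh = split_columns(low)
    hl, hh = split_columns(high)
    return ll, lh, hl, hh
```

Applying `haar_2d` again to `ll` gives the next coarser level, so three applications would produce a three-level layout of the kind the text attributes to Fig. 5.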
B. An Image Pyramid via Wavelet Transform
Multiresolution of images is a fundamental concept in
computer vision [25], [26], [41], [43]. By analyzing an image
at different resolutions and increasing the resolution from
coarse to fine, we may understand the content of an image in
a hierarchical manner [43]. This technique has been applied
widely in low-level vision and pattern recognition. An image
pyramid is a structure that stores not only the original image,
but also a collection of related images derived from it.
A resolution pyramid consists of a number of versions of the
original image at lower resolutions, where the resolution and
density are decreased step by step. Each successive
member of the sequence is obtained by a filtering followed
by a sampling operator. There are three types of pyramids
[20]: Gaussian, Laplacian, and morphological. In the Gaussian
pyramid, a Gaussian-like kernel is convolved with a member of
the pyramid to obtain the successive level member, followed by
sampling. In a Laplacian pyramid, each layer is the Laplacian of
the corresponding layer of the Gaussian pyramid, followed by
sampling. In a morphological pyramid, each layer is obtained
by a morphological operation (opening and closing) on the
previous level, followed by sampling. Examples are given in
[25], where the original image is level zero of the pyramid.
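The filter-then-sample construction of a Gaussian pyramid can be sketched as below. The separable [1, 2, 1]/4 kernel, the replicated border and the factor-of-two sampling are assumptions chosen to keep the example short.

```python
def reduce_once(img):
    """Smooth with a separable [1, 2, 1]/4 kernel, then keep every second pixel."""
    rows, cols = len(img), len(img[0])

    def at(r, c):
        # Replicate border pixels outside the image.
        return img[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]

    k = (1, 2, 1)
    smooth = [[sum(k[i] * k[j] * at(r + i - 1, c + j - 1)
                   for i in range(3) for j in range(3)) / 16.0
               for c in range(cols)]
              for r in range(rows)]
    return [row[::2] for row in smooth[::2]]

def gaussian_pyramid(img, levels):
    """Level zero is the original image; each further level is half the size."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(reduce_once(pyramid[-1]))
    return pyramid
```

Laplacian and morphological pyramids follow the same loop with a different per-level operator in place of the smoothing step.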
To reduce the computational complexity of matching, it is essential to perform matching in a hierarchical structure, starting
from the coarse level with fewer feature points and moving to the fine level
with more points for precise matching. Previously we proposed
to create an interesting point pyramid by selecting interesting
points through a dynamic thresholding scheme guided by the
histogram [45]. For a given original image, a so-called interest
image is obtained by applying an interesting point detector, the Plessey
operator, to the image. Each pixel value of this image
represents the interest degree of the corresponding pixel in the
original image. If a certain threshold value is determined, those
pixels whose values are below the selected threshold value will
be considered as interesting points and set to 1 in the image
feature map while those above threshold value will be ruled
out and assigned to 0 in the feature map. Depending on each
particular application, different threshold values for different
levels are determined by the histogram of the interest image
from lower value for fewer interesting points to higher value
for more points. Fig. 6 shows an example of such an interesting
points pyramid.
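The thresholding rule just described can be sketched as follows. As in the text, pixels of the interest image below the threshold are marked 1 and the rest 0, so a lower threshold gives the sparser, coarser level. Treating values equal to the threshold as non-interesting is an assumption, and the function names are illustrative.

```python
def feature_map(interest, threshold):
    """Binary map: 1 where the interest value is below the threshold, else 0."""
    return [[1 if v < threshold else 0 for v in row] for row in interest]

def point_pyramid(interest, thresholds):
    """One feature map per threshold, ordered from fewest points to most."""
    return [feature_map(interest, t) for t in sorted(thresholds)]
```

In practice the thresholds would be read off the histogram of the interest image, one per pyramid level.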
The multiresolution approach to image decomposition using
a wavelet orthonormal basis and a pyramidal algorithm to
compute it was introduced by Mallat [29] who also considered
its applications to data compression, texture discrimination
and fractal analysis. Later, Mallat [30] developed an iterative
reconstruction algorithm from the zero-crossings. A general
multiresolution formulation of wavelet systems is discussed
in [11]. In contrast to the traditional image pyramid for hierarchical matching, the image feature pyramid reported in this
paper is generated based on the detection of interesting points
on each subband image via the wavelet transform. For a given
original image, a wavelet transform is first applied to decompose the image into a series of subband images. The so-called
interest image pyramid is then obtained by applying the Plessey interesting
point detector to the subband images. Fig. 7
shows an example of such an interesting points pyramid.
Fig. 7(a)–(c) represents the original image, the subband images
and the interest image pyramid, respectively. It is also observed
that the combination of interesting points at the same level
of decomposed subband images provides a better feature
representation than direct interesting point detection on the
original images. Such a claim is demonstrated in Fig. 8, where
Fig. 8(a) is the original image, Fig. 8(b) contains interesting
points detected by the Plessey operator on the original image,
and Fig. 8(c) displays the combination of interesting points
detected by the Plessey operator on three subband images after the
first level decomposition.
Fig. 5. Example of three-level image decomposition.
Fig. 6. Example of interesting points pyramid.
Fig. 7. Image feature pyramid.
Fig. 8. Comparison of interesting points: original versus subband combination.
IV. HIERARCHICAL IMAGE MATCHING
The hierarchical guided matching scheme was first introduced by Borgefors [5] in order to reduce the computation
cost required to match two images. We extended it by using
interesting points rather than edge points in a similar fashion,
i.e., an interesting point pyramid is created, the matching
starts from the lowest resolution, and the results of this match
guide the search over the candidate areas at the higher resolutions.
In addition, we applied the wavelet transform to generate
the interesting point pyramid. Furthermore, we extended the
matching process by using the Hausdorff distance as a measure
of similarity. The advantage of the Hausdorff distance is that
it allows us to partition the target image into a number of
subimages and simultaneously match the template image on
these subimages. Following Huttenlocher et al. [22], we are
also able to use the Hausdorff distance algorithm to search for portions
of objects, or for partially hidden objects.
A. Hausdorff Distance
The Hausdorff distance is a nonlinear operator which measures the mismatch of the two sets. In other words, such a distance determines the degree of the mismatch between a model
and an object by measuring the distance of the point of a model
that is farthest from any point of an object and vice versa. Therefore, it can be used for object recognition by comparing two images which are superimposed on one another. The key points
regarding this technique are summarized as follows.
Given two finite point sets $A$ and $B$, the Hausdorff distance
between them is defined as
$$H(A, B) = \max\bigl(h(A, B),\, h(B, A)\bigr) \tag{21}$$
where $h(A, B)$ is the distance from set $A$ to set $B$ expressed as
$$h(A, B) = \max_{a \in A} d(a, B) \tag{22}$$
while $d(a, B)$ is the distance from point $a$ to set $B$ given by
$$d(a, B) = \min_{b \in B} \lVert a - b \rVert \tag{23}$$
Obviously, the Hausdorff distance $H(A, B)$ is the maximum of $h(A, B)$
and $h(B, A)$, which measures the degree of mismatch between two
sets $A$ and $B$. Fig. 9 presents a graphical example of computing
the Hausdorff distance.
Fig. 9. Graphical example of computing Hausdorff distance ($H(A, B) = \max(h(A, B), h(B, A))$).
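The definition above translates directly into a brute-force computation over two point lists. This is a minimal sketch assuming Euclidean distance and (x, y) tuples; the function names are illustrative.

```python
import math

def point_to_set(p, points):
    """Distance from point p to its nearest neighbour in `points`."""
    return min(math.hypot(p[0] - q[0], p[1] - q[1]) for q in points)

def directed(a_points, b_points):
    """Directed distance h(A, B): the point of A farthest from every point of B."""
    return max(point_to_set(a, b_points) for a in a_points)

def hausdorff(a_points, b_points):
    """H(A, B) = max(h(A, B), h(B, A))."""
    return max(directed(a_points, b_points), directed(b_points, a_points))
```

The two directed distances generally differ. With A = {(0, 0), (1, 0)} and B = {(0, 0), (4, 0)}, h(A, B) is 1 while h(B, A) is 3, so H(A, B) is 3. The cost is proportional to the product of the two set sizes, which is one reason fewer feature points speed up matching.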
In general, image data are derived from a raster device and
represented by grid points as pixels. For a feature detected
image, the characteristic functions of the sets $A$ and $B$ can be
represented by binary arrays, where an entry in each array
is nonzero for the corresponding feature pixel in the given
image. Therefore, distance arrays are used
to specify, for each pixel location, the distance to the
nearest nonzero pixel of $A$ or $B$, respectively; each distance array is the
distance transform of the corresponding set. Consequently, the Hausdorff distance
as a function of translation can be determined by computing
the pointwise maximum of all the translated distance arrays,
in the form of Eqs. (24)–(26).
Fig. 10. Structure of hierarchical image matching.
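The role of the distance arrays can be illustrated with a small sketch: once the distance transform of the target is available, the directed distance from a translated template is simply the largest transform value probed by the template points. The brute-force transform, the (row, column) convention and the function names are assumptions, and only one direction is shown; the full measure also needs the reverse direction, computed from the template's own distance transform.

```python
import math

def distance_transform(binary):
    """Euclidean distance from every pixel to the nearest nonzero pixel.

    Brute force for clarity; `binary` must contain at least one nonzero pixel.
    """
    feats = [(r, c) for r, row in enumerate(binary) for c, v in enumerate(row) if v]
    return [[min(math.hypot(r - fr, c - fc) for fr, fc in feats)
             for c in range(len(binary[0]))]
            for r in range(len(binary))]

def directed_at(template_points, target_dt, shift):
    """Directed distance from the template, translated by `shift`, to the target."""
    dr, dc = shift
    return max(target_dt[r + dr][c + dc] for r, c in template_points)

def best_translation(template_points, target_dt):
    """Try every translation that keeps the template inside the target."""
    rows, cols = len(target_dt), len(target_dt[0])
    height = max(r for r, _ in template_points) + 1
    width = max(c for _, c in template_points) + 1
    shifts = [(dr, dc) for dr in range(rows - height + 1)
              for dc in range(cols - width + 1)]
    return min(shifts, key=lambda s: directed_at(template_points, target_dt, s))
```

The transform is computed once per image, so each candidate translation costs only one array lookup per template point.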
B. A Guided Matching Scheme
In order to avoid blind searching for the best fit between the given patterns, a guided search strategy is essential to reduce the computational burden. Our extension of the hierarchical image matching scheme was based on a guided searching algorithm that searches from the low-level, coarse-grained images to the high-level, fine-grained images. To do this, we needed to obtain a Hausdorff distance approximation for each possible window combination of the template and target image at the lowest resolution. Those that returned a Hausdorff distance approximation equal to the lowest Hausdorff distance for those images were investigated at the higher resolution. Fig. 10 shows the structure of the proposed hierarchical matching.
V. SPEEDING UP IMAGE MATCHING USING PVM
In contrast to the conventional parallel solutions, which relied on specialized parallel machines, we adopted a divide-and-conquer policy and implemented the proposed hierarchical image matching using PVM on a network of workstation clusters. The complicated image matching task is divided into a number of subtasks, and those subtasks are later reorganized into clusters according to granularity before being mapped onto computers for simultaneous operation. Fig. 11 shows the SIMD architecture topology adopted in this paper.
Data must be exchanged between cooperating tasks in all parallel processing. The PVM software provides a general and flexible parallel computing environment which enables concurrent execution on loosely coupled networks of processing elements and supports a message-passing model of synchronization. It offers a unified framework within which parallel programs can be developed in an efficient and straightforward way without any specific hardware requirements. It transparently handles all message routing, data conversion, and task scheduling across a network of incompatible computer architectures. Fig. 12 shows a simplified architecture paradigm of the PVM system, and Fig. 13 is a generic model of message passing adopted by PVM.
The success of the parallel implementation using PVM on a cluster of workstations depends on an effective communication pattern between workstations. An image is partitioned into subimages and distributed over nodes on the network. The local processing results, such as two-dimensional matrix operations, feature detection, and transformation for each subimage, are merged over the entire image. In this paper, the master/slave parallel programming paradigm is adopted. The software developed includes a master program and slave programs. The master program controls the data processed by a number of slave programs. It can run on one machine and spawn copies of the slave programs on any number of nodes in the network. The following presents a brief description of the master and slave control programs:
Master Program:
• initialize the system;
• allocate a memory array for the image data;
• read an image into the memory array;
• segment the image into a number of subimages;
• register in the PVM process by obtaining its task identifier (TID);
• spawn copies of the slave programs on the slave machines;
• send control messages to slave machines;
• send subimage data to slave machines;
• wait for the slave machines to report back;
• collect data from slave machines for final output.
Slave Program:
• slave programs are spawned by the master program at runtime;
• initialize the data structure for the processed image segments;
• register in the PVM process and obtain its TID;
• wait for data to be sent from the master;
• receive the data and perform the specified operation;
• inform the master machine when the result data is ready;
• send the data to the master;
• exit from PVM process.
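The division of labor between the master and slave programs can be sketched in a few lines. The sketch below is not PVM code: it uses Python threads from the standard library as a stand-in for slave tasks on remote workstations, and the function names, the row-band partitioning, and the placeholder thresholding operation are all assumptions made for illustration. In an actual PVM program, registration, spawning, and data exchange would go through PVM library calls such as pvm_mytid, pvm_spawn, pvm_send, and pvm_recv.

```python
from concurrent.futures import ThreadPoolExecutor

def segment(image, n_parts):
    """Master: split the image (a list of rows) into row bands."""
    step = -(-len(image) // n_parts)  # ceiling division
    return [image[i:i + step] for i in range(0, len(image), step)]

def slave_task(subimage):
    """Slave: perform the specified operation on one subimage.
    The operation here is a placeholder threshold at 128."""
    return [[1 if v >= 128 else 0 for v in row] for row in subimage]

def master(image, n_slaves=4):
    """Master: distribute subimages, wait for the results, merge them."""
    parts = segment(image, n_slaves)
    with ThreadPoolExecutor(max_workers=n_slaves) as pool:
        results = list(pool.map(slave_task, parts))  # keeps band order
    return [row for part in results for row in part]

# Hypothetical 8 x 6 gray-level image.
image = [[(r * 40 + c * 10) % 256 for c in range(6)] for r in range(8)]
output = master(image)
print(len(output), len(output[0]))  # 8 6
```

Row bands are used because a pointwise operation needs no data from neighboring subimages. Operations with a spatial footprint, such as a wavelet filter or a distance transform, would need overlapping borders or an extra exchange step between neighbors.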
Fig. 11. SIMD architecture topology.
Fig. 12. Paradigm of PVM architecture model.
Fig. 13. Generic model of a message-passing multicomputer.
In this paper, we adopted a divide-and-conquer approach and
used the above master/slave parallel programming paradigm
to implement the proposed hierarchical image matching on a
network of workstation clusters. The main
steps are outlined as follows.
• Step 1) Discrete wavelet transform.
1.1) master node reads in original image;
1.2) master node segments the image data and sends
each subimage data to individual slave node;
1.3) each slave node performs wavelet transform under
the coordination of the master node;
1.4) master node collects the results from the slaves and generates an image pyramid of subband images.
• Step 2) Creation of interest image pyramid.
2.1) master node reads in original image;
2.2) master node assigns each subband image to slave
node;
2.3) each slave node applies the Plessey operator to detect
interesting points of the individual subband image;
2.4) master node collects the data from each slave node
and combines them together to obtain an interest image
pyramid.
• Step 3) Distance transform.
3.1) master node assigns one quarter of the partitioned
binary image to each corresponding slave node;
3.2) each slave node initializes the subregion of the distance image based on the binary image;
3.3) each slave node applies the 3-4 DT mask to perform the
distance transform, creating its subregion of the final distance image;
3.4) master node collects the processed data from each
slave node and combines them to create the distance
image.
• Step 4) Matching procedure.
4.1) master node assigns one quarter of the partitioned
distance image and the interest image of the template to
each corresponding slave node;
4.2) each slave node superimposes the interest image of the
template on the distance image and calculates the root
mean square average for optimization;
4.3) each slave node passes a number of possible areas
to the master node for further comparison.
• Step 5) Master node outputs the final result of matching.
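Steps 3 and 4 above can be illustrated on a single node with the following minimal sketch. It assumes the usual two-pass form of the 3-4 chamfer distance transform (cost 3 for horizontal and vertical steps, cost 4 for diagonal steps) and scores each candidate position by the root mean square average of the distance values under the template's interest points. The function names and the test data are hypothetical, and the partitioning across slave nodes is omitted.

```python
import math

INF = 10**9  # stands in for "no feature pixel reached yet"

def chamfer_34(binary):
    """Two-pass 3-4 chamfer distance transform of a binary feature image."""
    rows, cols = len(binary), len(binary[0])
    d = [[0 if binary[r][c] else INF for c in range(cols)]
         for r in range(rows)]
    # Forward pass: top-left to bottom-right over already visited neighbors.
    fwd = [(-1, -1, 4), (-1, 0, 3), (-1, 1, 4), (0, -1, 3)]
    for r in range(rows):
        for c in range(cols):
            for dr, dc, w in fwd:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    d[r][c] = min(d[r][c], d[rr][cc] + w)
    # Backward pass: bottom-right to top-left with the mirrored mask.
    bwd = [(1, 1, 4), (1, 0, 3), (1, -1, 4), (0, 1, 3)]
    for r in reversed(range(rows)):
        for c in reversed(range(cols)):
            for dr, dc, w in bwd:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    d[r][c] = min(d[r][c], d[rr][cc] + w)
    return d

def rms_score(dist, template_pts, dr, dc):
    """Root mean square average of the distance values hit by the
    template's interest points after shifting by (dr, dc)."""
    vals = [dist[r + dr][c + dc] for r, c in template_pts]
    return math.sqrt(sum(v * v for v in vals) / len(vals))

# Hypothetical 5 x 6 interest image of the target.
target = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
]
template_pts = [(0, 2), (2, 0), (2, 4)]  # interest points of a 3 x 5 template
dist = chamfer_34(target)
scores = {(dr, dc): rms_score(dist, template_pts, dr, dc)
          for dr in range(3) for dc in range(2)}
best = min(scores, key=scores.get)
print(best, scores[best])  # (1, 1) 0.0
```

A score of zero means every interest point of the template falls on an interest point of the target. In the parallel scheme, each slave would evaluate such scores over its own subregion and report its best candidate areas to the master.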
TABLE I
TIME TO PACK AND SEND MESSAGES WITH DIFFERENT SIZES
VI. PERFORMANCE EVALUATION
To evaluate the performance gain, a system was implemented
on a group of networked workstations, where PVM was used
to provide a parallel execution environment. Based on the
developed system, a number of experiments were carried out to
measure the effectiveness of using the hierarchical approach in
matching, and to measure the speedup of using parallel guided
matching against using sequential matching. The impact of
communication overheads in relation to message size was also
investigated. The programs were written in ANSI standard
C and executed on a network of SPARC ELC stations under the
UNIX operating system.
Fig. 14. Average time to pack and send a message versus message size. (The discrete wavelet transforms of 512 × 512 images.)
A. PVM Overheads versus Message Size
In general, inter-process communication overheads have a negative impact on system efficiency. Our concern is how to maximize the gain from parallel processing in terms of the data size, the bandwidth of the communication channel, and the complexity of the algorithms. The time for the PVM process to pack, transmit, and unpack data from the master to any slave is referred to as the PVM overhead, which is determined by the PVM process and various communication factors such as node link length, composition, topology, and so on. It is desirable to determine the optimal data size with minimum overhead effects. Table I lists the time required for data packing and sending for different message sizes for the discrete wavelet transforms of 512 × 512 images. Fig. 14 depicts the average time to pack and send a message versus the message size, and Fig. 15 depicts the average number of message bytes packed and sent per microsecond with respect to different message sizes.
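The kind of measurement behind such a table can be sketched as follows. This is a rough stand-in, not the procedure used in the paper: it times only packing and unpacking with Python's pickle module on one machine, leaves out network transmission entirely, and uses hypothetical message sizes. The two quantities it reports correspond to the average time per message and the number of bytes handled per microsecond.

```python
import pickle
import time

def pack_unpack_time(n_bytes, repeats=20):
    """Average seconds to pack and unpack one message of n_bytes."""
    payload = bytes(n_bytes)
    start = time.perf_counter()
    for _ in range(repeats):
        message = pickle.dumps(payload)   # "pack"
        restored = pickle.loads(message)  # "unpack"
    elapsed = time.perf_counter() - start
    assert restored == payload
    return elapsed / repeats

def bytes_per_microsecond(n_bytes, seconds):
    """Throughput in bytes per microsecond for one message."""
    return n_bytes / (seconds * 1e6)

sizes = [1024, 16 * 1024, 256 * 1024]  # hypothetical message sizes
report = []
for n in sizes:
    t = pack_unpack_time(n)
    rate = bytes_per_microsecond(n, t)
    report.append((n, t, rate))
    print(f"{n:>8} bytes  {t * 1e6:10.2f} us  {rate:10.2f} bytes/us")
```

Plotting the second column against the first gives a curve of the kind shown in Fig. 14, and the third column gives one of the kind shown in Fig. 15. A fixed per-message cost usually makes small messages less efficient per byte than large ones.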
B. Parallel Image Matching
In our PVM-based implementation with a master/slave model, the tasks for the master node include reading and segmenting image data, sending subimage data to each slave, coordinating the relevant results produced by the slaves, and collecting the results from the slaves for the final output. Table II shows the execution time with different numbers of slaves for the wavelet transform of an image based on the pyramid algorithm.
Fig. 15. Average speed for message packing and sending. (The discrete wavelet transforms of 512 × 512 images.)
TABLE II
WAVELET TRANSFORM: THE EXECUTION TIME VERSUS THE NUMBER OF SLAVES
In order to show the advantages and potential of PVM for parallel image processing in a distributed computing environment, the performance of both sequential and parallel implementations of the proposed image matching algorithm, based on detecting interesting points by the Moravec operator and the Plessey operator, is compared. In addition, the parallel performance improvement is measured by the following ratio [23]:
$$S = \frac{T_s}{T_p} \tag{27}$$
where $T_s$ refers to the sequential execution time and $T_p$ refers to the parallel execution time. Table III lists the performance evaluation in terms of the average execution time of two operations for the detection of interesting points on different images of various sizes. The sequential processing is performed on a single SPARC ELC station, while the parallel implementation is on a four-node structure using SPARC ELC stations. These experimental results indicate that, in general, the parallel implementation is faster than the sequential one even for a simple operation on a small image (e.g., the Moravec operation). The advantage of parallelism becomes more obvious when the test image size grows larger and the adopted algorithm gets more complicated.
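A comparison of this kind reduces to simple arithmetic on measured execution times. The sketch below assumes that the performance ratio is the usual speedup, that is, sequential time divided by parallel time. The efficiency function is an additional, commonly used measure that is not part of the evaluation above, and the timings are hypothetical values, not figures from Table III.

```python
def speedup(t_sequential, t_parallel):
    """Ratio of sequential to parallel execution time."""
    if t_parallel <= 0:
        raise ValueError("parallel time must be positive")
    return t_sequential / t_parallel

def efficiency(t_sequential, t_parallel, n_nodes):
    """Speedup per node; 1.0 would mean perfect scaling."""
    return speedup(t_sequential, t_parallel) / n_nodes

# Hypothetical timings in seconds, not taken from Table III.
t_seq, t_par, nodes = 12.0, 4.0, 4
print(speedup(t_seq, t_par))            # 3.0
print(efficiency(t_seq, t_par, nodes))  # 0.75
```

A speedup below the number of nodes, as in this made-up case, is what one would expect when part of the run time goes to packing, sending, and collecting data.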
TABLE III
THE PERFORMANCE COMPARISON: SEQUENTIAL VERSUS PARALLEL
TABLE IV
HIERARCHICAL IMAGE MATCHING SCHEME
TABLE V
MATCHING A ROTATED TEMPLATE IMAGE
Fig. 16. Object recognition via image matching.
Tables IV and V compare the execution time for matching. Fig. 16 shows an example of object identification and localization using our hierarchical matching scheme. Fig. 16(a) is a target image with a certain object to be identified, and Fig. 16(b) shows the matching result of our hierarchical scheme, which returns a match position.
VII. CONCLUSION
Image matching plays an important role in pattern recognition, image analysis, and computer vision. The central problem in image matching is to find an efficient and effective approach to characterize image features and measure the degree of similarity between two images that are superimposed on one another. We proposed a wavelet-based hierarchical scheme to facilitate coarse-to-fine image matching to achieve effectiveness and efficiency. Unlike the traditional matching process, in which edge pixels are considered as essential image feature points in the distance transform, our hierarchical image matching was performed on an interesting-points image pyramid, where a wavelet transform was applied to decompose an image into different levels of subband images and interesting points were detected at each level based on optimal thresholding via fuzzy compactness. In addition, the search for the best match was guided from the coarse level in the pyramid to the fine level throughout the hierarchical structure.
It is noted that parallelism provides more powerful computing ability for image processing systems where simple operations on large sets of data are required. Advances in computer networking have made very high-speed networks available. Consequently, such a distributed computer system has the potential to parallelize some image processing tasks for high speed. The innovation of our approach lies in the parallel implementation of the hierarchical matching scheme on a low-cost heterogeneous PVM network on the basis of a divide-and-conquer policy. The experimental results confirm the effectiveness of the proposed method.
There is still some work which requires further investigation in order to reduce the time consumption to a real-time level. One of the possibilities is to use distributed shared memory (DSM) systems for image matching. DSM systems provide an interface similar to that of a multiprocessor, but with distributed memory. Future work will study the suitability of DSM for image matching. Because image matching has a very low rate of data modification, it should be well suited to a DSM system. In addition, a task scheduling facility is under consideration, which is intended to distribute tasks appropriately among computers and to minimize the execution time.
ACKNOWLEDGMENT
The authors would like to thank an anonymous reviewer for the constructive comments made. The authors would also like to thank M. Hitchings and S. Kundu for their assistance with some of the program codes.
REFERENCES
[1] S. T. Barnard and W. B. Thompson, “Disparity analysis of images,”
IEEE Trans. Pattern. Anal. Machine Intell., vol. PAMI-2, pp. 333–340,
1980.
[2] H. G. Barrow, J. M. Tenenbaum, R. C. Bolles, and H. C. Wolf, “Parametric correspondence and chamfer matching: Two new techniques for
image matching,” in Proc. 5th Int. Joint Conf. Artificial Intelligence,
Cambridge, MA, 1977, pp. 659–663.
[3] T. O. Binford, “Survey of model-based image analysis,” Int. J. Robot.
Res., vol. 1, pp. 18–64, 1982.
[4] G. Borgefors, “Distance transformations in digital images,” Comput. Vis. Graph. Image Process., vol. 34, pp. 344–371, 1986.
[5] G. Borgefors, “Hierarchical chamfer matching: A parametric edge matching
algorithm,” IEEE Trans. Pattern Anal. Machine Intell., vol. 10, pp.
849–865, 1988.
[6] P. Brodatz, Textures: A Photographic Album for Artists and Designers. New York: Dover, 1966.
[7] K. R. Castleman, Digital Image Processing. Englewood Cliffs, NJ:
Prentice-Hall, 1996.
[8] C. K. Chui, An Introduction to Wavelets. New York: Academic, 1992.
[9] J. R. Corbin, The Art of Distributed Applications. Berlin, Germany:
Springer-Verlag, 1991.
[10] P. E. Danielsson, “Euclidean distance mapping,” Comput. Graph. Image
Process., vol. 14, pp. 227–248, 1980.
[11] I. Daubechies, Ten Lectures on Wavelets. Philadelphia, PA: SIAM,
1992.
[12] W. Forstner and A. Pertl, “Photogrammetric standard methods and
digital image matching techniques for high precision surface measurements,” in Pattern Recognition in Practice II, E. S. Gelsema and I. N.
Kanal, Eds. Amsterdam, The Netherlands: Elsevier, 1986, pp. 57–72.
[13] MPI Forum, “MPI: A message-passing interface standard,” Int. J. Supercomput. Applicat., vol. 8, pp. 165–416, 1994.
[14] P. Fua and A. J. Hanson, “Resegmentation using generic shape: Locating general cultural objects,” Pattern Recognit. Lett., vol. 5, no. 3,
pp. 243–252, 1987.
[15] A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Manchek, and V. Sunderam, PVM: Parallel Virtual Machine, A User’s Guide and Tutorial for
Network Computing. Cambridge, MA: MIT Press, 1994.
[16] D. B. Gennery, “A stereo vision system for autonomous vehicle,” in
Proc. 5th Int. Joint Conf. Artificial Intelligence, Cambridge, MA, 1977.
[17] D. Gerogiannis, “Load balancing requirements in parallel implementations of image feature extraction tasks,” IEEE Trans. Parallel Distrib.
Syst., vol. 4, no. 9, pp. 994–1013.
[18] M. Hahn, “Automatic measurement of digital terrain models by means
of image matching techniques,” in Proc. 42nd Photogrammetric Week:
Stuttgart Univ., 1989, vol. 13, pp. 141–151.
[19] U. V. Helava, “Digital correlation in photogrammetric instruments,” Int.
Arch. Photogram. Remote Sensing, vol. 20, no. 2, 1976.
[20] R. Haralick and L. G. Shapiro, Computer and Robot Vision. Reading,
MA: Addison-Wesley, 1993, vol. 2, ch. 16.
[21] “Lehrstuhl fur Photogrammetrie und Fernerkundung,” Techniche
University, Munich, Germany, http://dgrwww.epfl.ch/PHOT/publicat/wks96/Art=3=1.html.
[22] D. P. Huttenlocher, G. A. Klanderman, and W. J. Rucklidge, “Comparing
images using the Hausdorff distance,” IEEE Trans. Pattern Anal. Machine Intell., vol. 15, pp. 850–863, 1993.
[23] K. Hwang, Advanced Computer Architecture—Parallelism, Scalability,
and Programmability. New York: McGraw-Hill, 1993.
[24] A. K. Jain, “A sinusoidal family of unitary transforms,” IEEE Trans.
Pattern Anal. Machine Intell., vol. PAMI-1, pp. 356–365, 1979.
[25] J. Jolion and A. Rosenfeld, A Pyramid Framework for Early Vision: Multiresolutional Computer Vision. Norwell, MA: Kluwer, 1994.
[26] J. Koenderink, “The structure of images,” in Biological Cybernetics. New York: Springer-Verlag, 1984.
[27] K. I. Laws, “Textured image segmentation,” Ph.D dissertation, Univ.
Southern. Calif., Los Angeles, 1980.
[28] J. Magarey and N. Kingsbury, “Motion estimation using a complexvalued wavelet transform,” IEEE Trans. Signal Processing, vol. 46, pp.
1069–1084, Apr. 1998.
[29] S. G. Mallat, “A theory for multiresolution signal decomposition: The
wavelet representation,” IEEE Trans. Pattern Anal. Machine Intell., vol.
11, pp. 676–693, 1989.
[30] S. Mallat, “Zero-crossing of a wavelet transform,” IEEE Trans. Inform.
Theory, vol. 37, no. 4, pp. 1019–1033, 1991.
[31] S. Mallat, A Wavelet Tour of Signal Processing. San Diego, CA: Academic, 1998.
[32] M. Maresca, M. A. Lavin, and H. W. Li, “Parallel architectures for vision,” Proc. IEEE, vol. 76, no. 8, pp. 970–981, 1988.
[33] D. Marr and E. Hildreth, “Theory of edge detection,” Proc. R. Soc. Lond.
B, vol. 207, pp. 187–217, 1980.
[34] H. P. Moravec, “Toward automatic visual obstacle avoidance,” in Proc.
5th Int. Joint Conf. Artificial Intelligence, Cambridge, MA, 1977, p. 584.
[35] R. Nevatia and K. E. Price, “Locating structures in aerial images,” IEEE
Trans. Pattern Anal. Machine Intell., vol. PAMI-4, pp. 476–484, 1982.
[36] J. A. Noble, “Finding corners,” Image Comput., vol. 6, no. 2, pp.
121–128, 1988.
[37] D. W. Paglieroni, “Distance transforms: Properties and machine vision
applications,” Comput. Vis. Graph. Image Process., vol. 54, no. 1, pp.
56–74, 1992.
[38] S. K. Pal and A. Rosenfeld, “Image enhancement and thresholding by
optimization of fuzzy compactness,” Pattern Recognit. Lett., vol. 7, pp.
77–86, 1988.
[39] H. Pan, “Uniform full-information image matching using complex
wavelets,” Germ. J. Photogram. Remote Sensing, vol. 64, pp. 93–100,
1996.
[40] A. Rosenfeld, “The fuzzy geometry of image subsets,” Pattern Recognit.
Lett., vol. 2, pp. 311–317, 1984.
[41] J.-L. Starck, F. Murtagh, and A. Bijaoui, Image Processing and Data
Analysis: The Multiscale Approach. New York: Cambridge Univ.
Press, 1998.
[42] R. Voelpel, Parallel Computing: Fundamentals, Applications and New
Directions. Amsterdam, The Netherlands: Elsevier, 1998.
[43] A. Witkin, D. Terzopoulos, and M. Kass, “Signal matching through scale
space,” Int. J. Comput. Vis., vol. 1, pp. 133–144, 1987.
[44] J. Wu, Distributed System Design. Boca Raton, FL: CRC, 1999.
[45] J. You, E. Pissaloux, J. L. Hellec, and P. Bonnin, “A guided image
matching approach using Hausdorff distance with interesting points
detection,” in Proc. IEEE Int. Conf. Image Processing (ICIP’94),
Austin, TX, 1994, pp. 968–972.
[46] J. You, E. Pissaloux, W. P. Zhu, and H. A. Cohen, “Efficient image
matching: A hierarchical Chamfer matching scheme on a distributed
system,” Real-Time Imag., vol. 1, pp. 245–259, 1995.
[47] J. You, P. Bhattacharya, and S. Hungenahally, “Real-time object recognition: Hierarchical image matching in a parallel virtual machine environment,” in Proc. 14th Int. Conf. Pattern Recognition (ICPR’98), vol.
1, Brisbane, Australia, 1998, pp. 275–277.
Jane You (M’94) received the B. Eng. degree in electronic engineering from Xi’an Jiaotong University,
China, in 1986 and the Ph.D. degree in computer science from La Trobe University, Australia, in 1992.
From 1993 to 1995, she was a Lecturer with the
School of Computer and Information Science, University of South Australia. In 1993, she was awarded
a French Foreign Ministry International Fellowship
and was a Team Member of the Real-Time Object
Recognition Project at Université Paris XI, Paris,
France. She joined the School of Computing and
Information Technology, Griffith University, Queensland, Australia, in 1996.
Currently, she is a Senior Lecturer at Griffith University and an Academic
Staff Member at the Hong Kong Polytechnic University. Her research interests
include visual information retrieval, image processing, pattern recognition,
multimedia systems, biometrics computing, and data mining.
Prabir Bhattacharya (SM’92) received the B.A.
and M.A. degrees from the University of Delhi,
India, and the D.Phil. degree in mathematics in
1979 from the University of Oxford, Oxford, U.K.,
specializing in group theory.
He is a Full Professor with the Department of
Computer Science and Engineering, University of
Nebraska, Lincoln, which he joined in 1986. In
August 1999, he joined (on a leave of absence from
the University of Nebraska) the Panasonic Information and Networking Technologies Laboratory,
Princeton, NJ, as a Principal Scientist and Project Leader. He is also currently
a Visiting Full Professor at the Center for Automation Research, University of
Maryland, College Park. He has authored/co-authored 74 journal papers, 45
conference papers, and also co-edited a book on vision geometry. His research
has been supported by many external agencies, such as AFOSR, NASA,
BMDO, and NIH. He has been on the editorial board of Pattern Recognition
since 1994.
During 1996–1998, he was on the editorial board of the IEEE Computer
Society Press. He is a Fellow of the Institute of Mathematics and Its Applications, U.K. He was a Distinguished Visitor of the IEEE Computer Society
during 1996–1999, and a National Lecturer of the Association for Computing
Machinery (ACM) during 1996–1997. He was the Chairman of the Nebraska
Chapter of the IEEE Computer Society during 1995–1997. He received an award
from the IEEE Nebraska Chapter for the Highest Involvement in 1997. He has
served in the program committees of many conferences, including the IEEE
Image Processing Conferences during 1997–1999, and the IEEE Computer Vision and Pattern Recognition Conference, 1999. He is listed in Who’s Who in
Science and Engineering, and Who’s Who in Multimedia and Communications.