IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 9, NO. 9, SEPTEMBER 2000 1547 A Wavelet-Based Coarse-to-Fine Image Matching Scheme in A Parallel Virtual Machine Environment Jane You, Member, IEEE, and Prabir Bhattacharya, Senior Member, IEEE Abstract—We present a wavelet-based, high performance, hierarchical scheme for image matching which includes 1) dynamic detection of interesting points as feature points at different levels of subband images via wavelet transform, 2) adaptive thresholding selection based on compactness measures of fuzzy sets in image feature space, and 3) a guided searching strategy for the best matching from coarse level to fine level. In contrast to the traditional parallel approaches which rely on specialized parallel machines, we explored the potential of distributed systems for parallelism. The proposed image matching algorithms were implemented on a network of workstation clusters using parallel virtual machine (PVM). The results show that our wavelet-based hierarchical image matching scheme is efficient and effective for object recognition. Index Terms—Feature detection, guided matching, Hausdorff distance, image matching, image pyramid, parallel algorithms, parallel virtual machine (PVM), wavelet transform. I. INTRODUCTION I MAGE matching, which measures the degree of similarity between two image sets that are superimposed on one another, plays an important role in many areas such as pattern recognition, image analysis and computer vision. The images to be matched are required to go through a number of operations before the similarity is determined. These operations include feature extraction, distance transformation, similarity measurement and searching for the best match. Thus, an effective approach to image matching concerns with the following key issues: What kind of features are used for matching? What is the criterion for best matching? How to find the best matching? (see [20] for a survey up to 1993). Based on the level of image feature extraction, the matching algorithms developed in the past can be divided into three categories: pixel-based methods, low-level feature based methods and high-level feature based methods. At the lowest level, image pixel gray scale values are directly used by means of correlation techniques to determine parallaxes or to measure coordinates of sampled points in remote sensing [12], [19]. At the next level, Manuscript received November 12, 1998; revised March 15, 2000. This work was supported by a grant from the U.S. Ballistic Missile Defense Organization (BMDO) through the DEPSCOR program with matching funds from the University of Nebraska-Lincoln (including the CCIS and CSCE) and by the Australian Research Council under Grant ARC Small 9606. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Kannan Ramchandran. J. You is with the School of Computing and Information Technology, Griffith University, Queensland, Australia 4111 and also with the Department of Computing, Hong Kong Polytechnic University, Kowloon, Hong Kong (e-mail: you@cit.gu.edu.au; csyjia@comp.polyu.edu.hk). P. Bhattacharya is with the Department of Computer Science and Engineering, University of Nebraska, Lincoln, NE 68588-0115 USA (email: prabir@cse.unl.edu). Publisher Item Identifier S 1057-7149(00)06978-5. image features such as points, lines and regions are extracted from the image for the matching. Different techniques and systems have been developed for various applications such as object/terrain reconstruction [1], [18], moving objects tracking [16], [34], and automatic map interpretation/revision [14]. At the highest level more information about the interrelationships between the features are also used for matching. In general the topological and geometrical relations between features contain important information to constrain the large space of possible mappings between the features in three-dimensional (3-D) object recognition, location [3], and navigation [35]. Although the pixel-based methods are simple to implement, they are very sensitive to any changes between images and will not be able to identify the same results structures in images from different sensors. On the other hand, the high-level matching methods are very insensitive to these changes, however, in most cases the extraction and representation of the relationship itself is a difficult problem. In the past, a number of matching algorithms had been developed based on the edge detection and distance transform. The typical examples are Chamfer matching [2], Borgefors’ hierarchical Chamfer matching [5] and the Huttenlocher’s Hausdorff distance matching [22]. All of these algorithms involve the following steps in sequence: detecting edge points, converting the original gray-scale images to binary edge images, applying distance transform on the binary images, and finally measuring the similarity between the template image and the target image (including part of the target image). Chamfer matching was first proposed by Barrow et al. [2] for finding the best fit of edge points from two different images based on distance transform and distance minimization. This method focused on describing how images featured by their edge points can be geometrically disordered in relation to one another by transformation in terms of a set of parametric equations. Obviously, the application of such a technique is limited because a good start hypothesis of the transformation is required to map the edges. Nevertheless, this method shows its ability to handle imperfect data and is useful for noisy and disordered images. Consequently, Borgefors [5] extended Chamfer matching by introducing a new matching criterion and a resolution pyramid for edge matching, where the so-called root mean square average was proposed as the measure of correspondence between the objects to be matched. In order to speed up the operation, the matching is performed in a multi-resolution series of images rather than in the original image resolution so that the results at the low resolution will guide the computations at finer levels. Although this algorithm can be applied to different matching tasks such as image to image matching, template to image matching and image to symbolic image matching, 1057–7149/00$10.00 © 2000 IEEE 1548 a drawback is that the object must be represented by a polygonal sketch so as to conduct matching by distance transform. Huttenlocher et al. [22] studied the possibility of comparing images in terms of Hausdorff distance, where the Hausdorff distance is viewed as a function of the translation of a model with respect to an image. Their test results show that the Hausdorff distance transform is powerful for shape comparison, more tolerant of perturbations in the locations of pixel points and applicable to aerial images. However, the direct comparison method based on the Hausdorff distance is time-consuming because it considers every possible translation of the model within the given test image. Even though Huttenlocher et al. [22] have developed some good speed-up techniques for the computation of the Hausdorff distance in matching, further improvement is highly desirable. To our knowledge, there has not been a comprehensive wavelet-based image matching scheme so far despite some work reported in [39]. Comparing a template image to a larger target image usually requires that the matching algorithm assigns a value as to how good the match is for every possible overlay position of the template image over the target image. The position with the best matching measurement such as the lowest distance is then regarded as being the position of the best match. Although this method can be highly accurate, depending on the adopted matching algorithm, it is computationally expensive to search for every possible combination position of the template window within the target image. In our previous work [45], [47], we extended the existing image matching algorithms [5], [22] by introducing a guided searching scheme to quickly lead a matching process to the most likely portion of the target image. This scheme can speed up the matching process, especially when the size ratio between a target image and a template image is big. Apart from the guided hierarchical searching, we replaced edge points with interesting points as image feature points to reduce the redundant information because the matching speed is proportional to the number of points which need to be matched. In this paper, we propose a wavelet-based hierarchical image matching scheme to achieve efficiency and effectiveness, which includes 1) to apply wavelet transform to decompose an image into different levels of subband images; 2) to detect interesting points as feature points at different levels; 3) to introduce a guided searching procedure to search for the best matching from coarse level to fine level. It should be pointed out that thresholding plays an important role in our approach to detect interesting points at different levels of subband images. A number of approaches to thresholding, including global, local and dynamic methods have been proposed in the past. However, when the regions in an image are ill-defined (i.e., fuzzy), some ideas based on fuzzy set have been proposed by assuming the segments to be fuzzy subsets of the image. Fuzzy geometric properties defined by Rosenfeld [40] are useful for thresholding selection. Pal and Rosenfeld [38] proposed a method to extract the fuzzy segmented version of an ill-defined image by minimizing compactness and fuzziness of the image in both the intensity and spatial domain. The advantages of such an algorithm has been demonstrated by consid- IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 9, NO. 9, SEPTEMBER 2000 ering the ambiguity in grey level through the concepts of index of fuzziness, entropy and index of nonfuzziness (crispness). We further extended this fuzzy compactness approach to determine interesting points in image feature domain. The demand for higher performance, lower cost, and sustained productivity has inspired an ever-increasing interests in parallel processing for both high-performance scientific computing and more “general-purpose” applications. The past several years have witnessed two major developments in parallel computing: massively parallel processors (MPP’s) and the wide applications of distributed computing. Although MPP’s provide excellent computational power by combining a few hundred to a few thousand CPUs in a single large cabinet connected to hundreds of gigabytes of memory, their applications are restricted because of the high costs, typically more than 10 million U.S. $ for large MPPs. In addition, specific software development such as new programming paradigms, languages, scheduling and partitioning techniques, and algorithms is required to fully exploit the power of these massively parallel machines. In contrast, distributed computing involves a significantly lower cost factor to solve computationally intensive problems efficiently on a cluster of existing computers than on an individual computer. Distributed computing is a process whereby a set of computers connected via a network are used collectively to complete a single complicated task. As the advanced technology of computer network has made very high-speed network available (gigabit/s), more and more organizations have interconnected many general-purpose workstations through such networks. It is uncommon for distributed-computing users to achieve the raw computational capacity of a large MPP, however, it is feasible to solve grand challenge problems by combining a variety of distributed computing resources, linked by high-speed networks, with flexibility, stability, scalability, and at low costs. In this paper, we explored the potential of parallel vision computing on a network of workstation clusters. We adopted a divide-and-conquer policy to implement the proposed hierarchical image matching in a parallel virtual machine (PVM) computing environment [15], where a complex task is divided into a number of sub-tasks and those sub-tasks are later reorganized into clusters according to granularity before being mapped on computers for simultaneous operation. There are several standards for distributed computing including PVM, P4, Express, MPI, and Linda [13], [15]. A comprehensive summary of the recent development of distributed system design is presented in [44], where PVM is viewed as the existing de facto standard for distributed computing and MPI is being considered as the future message passing standard. PVM is a popular software message package. It allows programmers to exploit distributed computing across a wide variety of computer types, including MPP’s. In other words, PVM makes a heterogeneous network of computers to appear as one single concurrent computational engine—a large virtual machine as stood for by a name. Because of its virtual machine concept and its simple but complete programming interface, the PVM system has been widely accepted in the high-performance scientific computing community. As summarized in [15], the PVM computing model is featured as simple but very general with a straightforward programming interface, which facilitates simple program structures to be im- YOU AND BHATTACHARYA: WAVELET-BASED COARSE-TO-FINE IMAGE MATCHING SCHEME 1549 plemented in an intuitive manner. Each application consists of a collection of cooperating tasks which access PVM resources through a library of standard interface routines. These routines allow the initiation and termination of tasks across the network as well as communication and synchronization between tasks. The content of this paper is organized as follows: Section II describes the detection of interesting points as feature points throughout a dynamic thresholding scheme based on the compactness measures of fuzzy sets. Section III details an image feature hierarchy by means of wavelet transform and Section IV highlights a guided image matching scheme based on the Hausdorff distance. The parallel implementation and performance evaluation is summarized in Sections V and VI respectively. Finally, the conclusion is presented in Section VII. II. IMAGE FEATURE EXTRACTION An important issue in feature-based image matching is concerned with how to extract image feature points to represent the original image effectively. This section presents a dynamic threshold scheme to determine interesting points as feature points based on the compactness measures of fuzzy sets. A. Interesting Points, Not Edge Points Most matching algorithms are based on binary images to identify the interested object(s). Therefore, the original image, either greyscale or color images, should be converted into a binary image. The existing matching algorithms [2], [5], [22] all applied edge detection to convert an original image into a binary image. However, the redundant information in large number of edge pixels limits the possible speedup in subsequent operations since the time consumption of a matching algorithm is directly related to the number of feature points used in matching. The more the points are, the more the time is required. Therefore, it is desirable to use an algorithm which only extracts those feature points that • are representative and distinctive, such as corners, and without redundancy; • are robust to “noise.” According to Haralick and Shapiro [20], ideally, the interesting points should satisfy the conditions of 1) distinctness; 2) invariance; 3) stability; 4) uniqueness; 5) interpretability. A number of interesting point operators have been developed (see [20] for a survey)—these include, e.g., Plessey operator [36] and Moravec operator [34]. The detection of interesting points is based on the measure of how interesting a point is. Interesting here has its own special meanings depending on different applications. In order to reduce the number of points during matching while still reserve the feature of the original image, such an interesting point should be distinguishable from immediate neighbors, which excludes points sitting on the same edge. In general, the operation of the existing interesting point Fig. 1. Comparison of image feature extraction operators. operators can be summarized as a three-step procedure which is clearly highlighted in [20]. • Selection of optimal windows. The selection is based on the average gradient magnitude within a window of prespecified size. Search for local maxima, while suppressing windows on edges, guarantees (local) distinctness. The measure used is also invariant with respect to rotation. • Classification of the image function within the selected windows. The classification distinguishes between types of singular points such as corners, rings, spirals, and even isotropic texture based on a statistical test. • Estimation of the optimal point within the window as the classification. The estimation is precise for corners and for the centers of circular symmetric features or spirals. Moravec [34] suggested that a point is considered interesting if it has local maximum of minimal sums of directional variances, while the principle underlying the Plessey operator is to evaluate a given ratio to mark the detected pixel based on its first-difference approximations to the partial derivatives within a local window. Fig. 1 shows the comparison of edge points and interesting points to represent the original image. Fig. 1(a) is ranging from a histogram equalized image of size 0 to 255 in gray scale, Fig. 1(b) shows edge points detected by Prewitt operator, Fig. 1(c) shows interesting points detected by Moravec operator, and Fig. 1(d) shows interesting points detected by Plessey operator. Since the complexity of computation for matching depends on the image feature pixels, it is clear that 1550 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 9, NO. 9, SEPTEMBER 2000 Fig. 2. Comparison of feature pixels in a textured image. interesting points detected by Plessey operator are more suitable for matching because they reserve the image features with less points. The conventional interesting point detection techniques mentioned above were based on the detection of discontinuities in pixel properties. Thus they are not suitable for textured images because texture is characterized by its local features over some neighborhood rather than in pixel gray scale. We further extended the concept of interesting point for general-purpose images. Instead of applying interesting point operators to the original gray-scale image, the detection of interesting points was performed on the feature image converted from the original image. For the case of textured images, we adopted Laws’ texture energy measurement [27] to convert a textured image to a texture energy image for the detection of interesting points. Fig. 2 compares textural image feature pixels represented by edge points, texture energy, interesting points and texture interesting points respectively. Fig. 2(a) is a template image and Fig. 2(b) is a two-texture composite image to be processed, where the background is featured by Brodatz texture raffiad84 [6] and the object in the shape of the coast line of Australia is filled with Brodatz texture pigskind92 [6]. Fig. 2(c) demonstrates edge points detected by Prewitt operator, Fig. 2(d) represents texture energy image using Laws’ R5R5 mask, Fig. 2(e) shows interesting points detected by Moravec operator on the original image shown in Fig. 2(b) and 2(f) presents texture interesting points detected by Plessey operator on the texture energy image shown in Fig. 2(d). Obviously, the texture interesting points best represent the original image with less points. B. Optimal Thresholding via Fuzzy Compactness The identification of interesting points is based on the selection of threshold. When a feature image is regarded as a fuzzy and levels of feature measurements, we set of size extend Pal and Rosenfeld’s fuzzy compactness approach [38] to determine the optimal threshold value by minimizing the measurement of fuzziness. They introduced the index of fuzziness to reflect the average amount of ambiguity (fuzziness) present in an image by measuring the distance between its fuzzy propand the nearest two-level property . A linear index erty of fuzziness is defined as follows: (1) (2) (3) YOU AND BHATTACHARYA: WAVELET-BASED COARSE-TO-FINE IMAGE MATCHING SCHEME where that denotes the nearest two-level version of 1551 such if otherwise In contrast to the conventional measurement of fuzziness in a gray scale image detailed above, we extended this approach by applying the distance measurement to feature image rather than the measurement of fuzziness image in gray scale. Accordingly, refers to the feature set and represents the certain feature measurement. Instead of processing the gray scale image, we consider the minimization of fuzziness in image feature to select appropriate threshold value for feature point detection. Such an algorithm is summarized as follows. • Step 1: Convert the gray scale image to feature image by its interestingness measurement. The maximum and minand . imum values are , • Step 2: Construct the “feature image” membership where (4) and (5) (6) (7) (8) and bandwidth . • Step 3: Obtain the linear index of fuzziness in image feature image by computing with cross-over point Fig. 3. (a) Histogram of gray-scale distribution and (b) histogram of texture energy distribution. A. Discrete Wavelet Transform (9) where (10) denotes the number of occurrences of the level . and to and choose which • Step 4: Vary from . corresponding to the minimum of The advantage of the use of fuzzy set for optimal thresholding in image feature detection is demonstrated in Fig. 3 by comparing histogram distribution of the selected features. For the given image shown in Fig. 2(b), Fig. 3(a) is the gray scale distribution and Fig. 3(b) is the texture energy distribution. Our test results confirm that the histograms of feature distribution (e.g., textures) provide the base for optimal thresholding for feature detection. Wavelet transforms offer the promise of compact representation and efficient detection of image components that match the wave-shape of the chosen wavelet [31]. Given the set of or, the wavelet series expansion of the thonormal wavelets is band limited continuous function (11) where the coefficients are determined by (12) , and . Simis ilarly, the discrete transform of the sampled function defined as (13) III. A WAVELET-BASED IMAGE FEATURE HIERARCHY A key issue in hierarchical image matching is how to create an image feature pyramid for matching. Unlike most of the existing matching algorithms which detect edge points and perform matching in a resolution pyramid, we apply a wavelet transform to decompose an image into a series of subband images and replace edge points with interesting points for an image feature hierarchy. where the coefficients are (14) for . , 1552 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 9, NO. 9, SEPTEMBER 2000 Fig. 4. DWT image decomposition step The concept of the above one dimensional wavelet representation can be extended to two dimensions easily [8]. Considering the case of unitary image transforms, we assume that the two-dimensional (2-D) scaling function is separable, that is, (15) is a one-dimensional (1-D) scaling function. If where is its companion wavelet expressed by (16) where (17) (18) (19) (20) for wavelets , , then the three 2-D basic establish the foundation for a 2-D wavelet transform. Fig. 4 shows the DWT image decomposition step and Fig. 5 illustrates subimages after three-level DWT image decomposition. B. An Image Pyramid via Wavelet Transform Multiresolution of images is a fundamental concept in computer vision [25], [26], [41], [43]. By analyzing an image at different resolutions and increasing the resolution from coarse to fine, we may understand the content of an image in a hierarchical manner [43]. This technique has been applied widely in low-level vision and pattern recognition. An image pyramid is a structure that stores not only the original image, but also a collection of related images derived from the origin. A resolution pyramid will consist of a number of versions of the original image in lower resolutions, where the resolution and density are decreased in a step-by-step format. Each successive member of the sequence is obtained by a filtering followed by a sampling operator. There are three types of pyramids [20]: Gaussian, Laplacian, and morphological. In the Gaussian pyramid, a Gaussian-like kernel is convolved with a member of the pyramid to obtain the successive level member, followed by sampling. In a Laplacian pyramid, each layer is the Laplacian of the corresponding layer of the Gaussian pyramid, followed by sampling. In a morphological pyramid, each layer is obtained by a morphological operation (opening and closing) on the previous level, followed by sampling. Examples are given in [25], where the original image is level zero of the pyramid. To reduce the computational complexity for matching, it is essential to perform matching in a hierarchical structure, starting from the coarse level with less feature points to the fine level with more points for precise matching. Previously we proposed to create an interesting point pyramid by selecting interesting points through a dynamic thresholding scheme guided by the histogram [45]. For a given original image, a so-called interest image is obtained by applying interesting point detector Plessey operator to the image. Each pixel value of this image actually represents the interest degree of the corresponding pixel in the original image. If a certain threshold value is determined, those pixels whose values are below the selected threshold value will be considered as interesting points and set to 1 in the image feature map while those above threshold value will be ruled out and assigned to 0 in the feature map. Depending on each particular application, different threshold values for different levels are determined by the histogram of the interest image from lower value for fewer interesting points to higher value for more points. Fig. 6 shows an example of such an interesting points pyramid. The multiresolution approach to image decomposition using a wavelet orthonormal basis and a pyramidal algorithm to compute it was introduced by Mallat [29] who also considered its applications to data compression, texture discrimination and fractal analysis. Later, Mallat [30] developed an iterative reconstruction algorithm from the zero-crossings. A general multiresolution formulation of wavelet systems is discussed in [11]. In contrast to the traditional image pyramid for hierarchical matching, the image feature pyramid reported in this paper is generated based on the detection of interesting points on each subband images via wavelet transform. For a given original image, a wavelet transform is first applied to decompose the image into a series of subband images. The so-called interest image pyramid is then obtained by applying interesting point detector Plessey operator to the subband images. Fig. 7 shows an example of such an interesting points pyramid. Fig. 7(a)–(c) represents the original image, the subband images and interest image pyramid respectively. It is also observed that the combination of interesting points at the same level of decomposed subband images provides the better feature representation than the direct interesting point detection on the YOU AND BHATTACHARYA: WAVELET-BASED COARSE-TO-FINE IMAGE MATCHING SCHEME Fig. 5. 1553 Example of three-level image decomposition. Fig. 6. Example of interesting points pyramid. Fig. 7. Image feature pyramid. original images. Such a claim is demonstrated in Fig. 8, where Fig. 8(a) is the original image, Fig. 8(b) contains interesting points detected by Plessey operator on the original image, and Fig. 8(c) displays the combination of interesting points detected by Plessey operator on three subband images after the first level decomposition. 1554 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 9, NO. 9, SEPTEMBER 2000 Fig. 8. Comparison of interesting points: original versus subband combination. IV. HIERARCHICAL IMAGE MATCHING The hierarchical guided matching scheme was first introduced by Borgefors [5] in order to reduce the computation cost required to match two images. We extended it by using interesting points rather than edge points in a similar fashion, i.e., an interesting point pyramid is created and the matching starts from the lowest resolution and the results of this match guides the search on the possible area of the higher resolutions. In addition, we applied the wavelet transform to generate the interesting point pyramid. Furthermore, we extended the matching process by using the Hausdorff distance as a measure of similarity. The advantage of the Hausdorff distance is that it allows us to partition the target image into a number of subimages and simultaneously match the template image on these subimages. According to Huttenlocher et al. [22], we are able to use Hausdorff distance algorithm to search for portions, or partially hidden objects. A. Hausdorff Distance The Hausdorff distance is a nonlinear operator which measures the mismatch of the two sets. In other words, such a distance determines the degree of the mismatch between a model and an object by measuring the distance of the point of a model that is farthest from any point of an object and vice versa. Therefore, it can be used for object recognition by comparing two images which are superimposed on one another. The key points regarding this technique are summarized as follows. and Given two finite point sets , the Hausdorff distance between them is defined as (21) where is the distance from set to set expressed as (22) while is the distance from point to set given by (23) Fig. 9. Graphical example of computing Hausdorff distance (H (A; max(h(A; B) = B ); h(B; A))): Obviously, the Hausdorff distance is the maximum of and which measures the degree of mismatch between two sets and . Fig. 9 presents a graphical example of computing the Hausdorff distance. In general, image data are derived from a raster device and represented by grid points as pixels. For a feature detected and can be image, the characteristic function of the set and , respectively, represented by binary arrays th entry in the array is nonzero for in the array where the is nonzero for the corresponding feature pixel in the given and are used image. Therefore, distance arrays the distance to the to specify for each pixel location YOU AND BHATTACHARYA: WAVELET-BASED COARSE-TO-FINE IMAGE MATCHING SCHEME Fig. 10. Structure of hierarchical image matching. nearest nonzero pixel of or , respectively, where denotes the distance transform of and denotes the distance transform of . Consequently, the Hausdorff distance as a function of translation can be determined by computing and array the pointwise maximum of all the translated in the form of (24) where (25) (26) quer policy and implemented the proposed hierarchical image matching using PVM on a network of workstation clusters. The complicated image matching task is divided into a number of subtasks and those sub-tasks are later reorganized into clusters according to granularity before being mapped on computers for simultaneous operation. Fig. 11 shows the SIMD architecture topology adopted in this paper. Data must be exchanged between cooperating tasks in all parallel processing. The PVM software provides a general and flexible parallel computing environment which enables concurrent executions on loosely coupled networks of processing elements and supports a message-passing model of synchronization. It offers a unified framework within which parallel programs can be developed in an efficient and straightforward way without any specific hardware requirements. It transparently handles all message routing, data conversion, and task scheduling across a network of incompatible computer architectures. Fig. 12 shows a simplified architecture paradigm of the PVM system, and Fig. 13 is a generic model of message-passing adopted by PVM. The success of the parallel implementation using PVM on a cluster of workstations depends on the effective communication pattern between workstations. An image is partitioned into subimages and distributed over nodes on the network. The local processing results such as two-dimensional matrix operation, feature detection and transformation for each subimage are merged over the entire image. In this paper, the master/slave parallel programming paradigm is adopted. The software developed includes a master program and slave programs. The master program controls the data processed by a number of slave programs. It can run on one machine and spawn copies of the slave programs to any number of nodes in the network. The following presents a brief description of the master and slave control programs: Master Program: • • • • • B. A Guided Matching Scheme In order to avoid the blind searching for the best fit between the given patterns, a guided search strategy is essential to reduce computation burden. Our extension of the hierarchical image matching scheme was based on a guided searching algorithm that searches first at the low level, coarse grained images, to the high level, fine grained images. To do this we needed to obtain a Hausdorff distance approximation for each possible window combination of the template and target image at the lowest resolution. Those that returned a Hausdorff distance approximation equal to the lowest Hausdorff distance for those images were investigated at the higher resolution. Fig. 10 shows the structure of the proposed hierarchical matching. V. SPEEDING UP IMAGE MATCHING USING PVM In contrast to the conventional parallel solutions which relied on specialized parallel machine, we adopted a divide-and-con- 1555 • • • • • initialize the system; allocate a memory array for the image data; read an image into the memory array; segment the image into a number of subimages; register in the PVM process by obtaining its task identifier (TID); spawn copies of the slave programs on the slave machines; send control messages to slave machines; send subimage data to slave machines; wait for the slave machines to report back; collect data from slave machines for final output. Slave Program: • slave programs are spawned by the master program at runtime; • initialize the data structure for the processed image segments; • registers in the PVM process and obtains its TID; • wait for data to be sent from the master; • receive the data and perform the specified operation; • inform the master machine when the result data is ready; • send the data to the master; • exit from PVM process. 1556 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 9, NO. 9, SEPTEMBER 2000 Fig. 11. Fig. 12. Fig. 13. SIMD architecture topology. Paradigm of PVM architecture model. Generic model of a message-passing multicomputer. In this paper, we adopted a divide and conquer approach and used the above master/slave parallel programming paradigm to implement the proposed hierarchical image matching on a cluster of workstation clusters.ifferent message size. The main steps are outlined as follows. • Step 1) Discrete wavelet transform. 1.1) master node reads in original image; 1.2) master node segments the image data and sends each subimage data to individual slave node; 1.3) each slave node performs wavelet transform under the coordination of the master node; 1.4) master node collects the result from slaves and generate an image pyramid of subband images. • Step 2) Creation of interest image pyramid. 2.1) master node reads in original image; 2.2) master node assigns each subband image to slave node; 2.3) each slave node performs Plessey operator to detect interesting points of the individual subband image; 2.4) master node collects the data from each slave node and combines them together to obtain an interest image pyramid. • Step 3) Distance transform. 3.1) master node assigns one quarter of the partitioned binary image to each corresponding slave node; 3.2) each slave node initializes the subregion of the distance image based on the binary image; 3.3) each slave node applies 3-4 DT mask to perform distance transform and the subregion of the final distance image is created; 3.4) master node collects the processed data from each slave node and combines them to create the distance image. • Step 4) Matching procedure. 4.1) master node assigns one quarter of the partitioned distance image and the interest image of the template to each corresponding slave node; 4.2) each slave node imposes the interest image of the template over the distance image and calculate the root mean square average for optimization; 4.3) each slave node passes a number of possible areas to the master node for further comparison. • Step 5) Master node outputs the final result of matching. YOU AND BHATTACHARYA: WAVELET-BASED COARSE-TO-FINE IMAGE MATCHING SCHEME 1557 TABLE I TIME TO PACK AND SEND MESSAGES WITH DIFFERENT SIZES VI. PERFORMANCE EVALUATION To evaluate the performance gain, a system was implemented on a group of networked workstations, where PVM was used to provide a parallel execution environment. Based on the developed system, a number of experiments were carried out to measure the effectiveness of using the hierarchical approach in matching, and to measure the speedup of using parallel guided matching against using sequential matching. The impact of communication overheads in relation to message size is also investigated. The program codes were written in ANSI standard C and executed on a network of SPARC ELC stations under the UNIX operating system. Fig. 14. Average time to pack and send a message versus message size. (The discrete wavelet transforms of 512 512 images.) 2 A. PVM Overheads versus Message Size In general, the inter-process communication overheads will have negative impacts on system efficiency. Our concern is how to maximize the gain from parallel processing in terms of the data size, the bandwidth of the communication channel, and the complexity of algorithms. The time for the PVM process to pack, transmit and unpack data from the master to any slave is referred to as the PVM overheads, which is determined by the PVM process and various communication factors such as node link length, composition, topology and so on. It is desirable to determine the optimal data size with minimum overheads effects. Table I lists different time required for data packing and sending in terms of different sizes for the discrete wavelet transimages. Fig. 14 depicts the average time to forms of pack and send a message versus the message size and Fig. 15 depicts the average number of message bytes packed and sent per microsecond with respect to d B. Parallel Image Matching In our PVM-based implementation with a master/slave model, the tasks for the master node include reading and segmenting image data, sending sub-image data to each slaves, coordinating the relevant results produced by slaves, collecting the result from slaves for the final output. Table II shows different execution time with different number of slaves for image based on pyramid the wavelet transform on algorithm. In order to show the advantages and potentials of PVM for parallel image processing in a distributed computing environment, the performance of both sequential and parallel implementation of the proposed image matching algorithm based on detecting interesting points by Moravec operator and Plessey Fig. 15. Average speed for message packing and sending. (The discrete wavelet transforms of 512 512 images.) 2 TABLE II WAVELET TRANSFORM: THE EXECUTION TIME THE NUMBER OF SLAVES VERSUS operator is compared. In addition, the parallel performance improvement is measured by the following ratio [23]: (27) where refers to the sequential execution time and refers to the parallel execution time. Table III lists the performance evaluation in terms of the average execution time of two operations for the detection of interesting points on different images with to . The sequenvarious sizes ranging from tial processing is performed on a single SPARC ELC station while the parallel implementation is on a four-node structure using SPARC ELC stations. These experimental results indicate that in general the parallel implementation is faster than the sequential one even for a simple operation on a small size image image). The advantage (e.g. Moravec operation on a of parallelism becomes more obvious when the test image size grows larger and the adopted algorithm gets more complicated. 1558 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 9, NO. 9, SEPTEMBER 2000 TABLE III THE PERFORMANCE COMPARISON: SEQUENTIAL VERSUS PARALLEL TABLE IV HIERARCHICAL IMAGE MATCHING SCHEME TABLE V MATCHING A ROTATED TEMPLATE IMAGE in image matching is to find an efficient and effective approach to characterize image features and measure the degree of similarity between two images that are superimposed on one another. We proposed a wavelet-based hierarchical scheme to facilitate coarse-to-fine image matching to achieve effectiveness and efficiency. Unlike the traditional matching process in which edge pixels are considered as essential image feature points in distance transform, our hierarchical image matching was performed on an interesting points image pyramid, where a wavelet transform was applied to decompose an image into different levels of subband images and interesting points were detected at each level based on optimal thresholding via fuzzy compactness. In addition, the searching for the best match was guided from the coarse-level in the pyramid to the fine-level throughout the hierarchical structure. It is noted that parallelism provides more powerful computing ability for image processing systems where simple operations on large set of data is required. The advanced technology of computer network has made very high-speed network available. Consequently, such a distributed computer system has the potential to parallelize some image processing tasks for high speed. The innovation of our approach lies in the parallel implementation of the hierarchical matching scheme on a low cost heterogeneous PVM network on the basis of a divide-and-conquer policy. The experimental results confirm the effectiveness of the proposed method. There is still some work which requires further investigation in order to reduce the time consumption to a level of real-time. One of the possibilities is to use distributed shared memory (DSM) systems for image matching. DSM systems provide a similar interface as a multiprocessor, but with distributed memory. This work intends to study the suitability of DSM for image matching. It is because image matching has very low rate of data modification, hence, it should be suitable to a DSM system. In addition, a task scheduling facility is under concern which intends to appropriately distribute tasks among computers and aims to minimize the execution time. ACKNOWLEDGMENT The authors would like to thank an anonymous reviewer for the constructive comments made. The authors would like to thank M. Hitchings and S. Kundu for the assistance of some program codes. Fig. 16. Object recognition via image matching. Tables IV and V compare the execution time for matching. Fig. 16 shows an example of object identification and localization using our hierarchical matching scheme. Fig. 16(a) is target image with certain object to be identified and a Fig. 16(b) shows the matching result of our hierarchical scheme, which returns a match position. VII. CONCLUSION Image matching plays an important role in pattern recognition, image analysis and computer vision. The central problem REFERENCES [1] S. T. Barnard and W. B. Thompson, “Disparity analysis of images,” IEEE Trans. Pattern. Anal. Machine Intell., vol. PAMI-2, pp. 333–340, 1980. [2] H. G. Barrow, J. M. Tenenbaum, R. C. Bolles, and H. C. Wolf, “Parametric correspondence and chamfer matching: Two new techniques for image matching,” in Proc. 5th Int. Joint Conf. Artificial Intelligence, Cambridge, MA, 1977, pp. 659–663. [3] T. O. Binford, “Survey of model-based image analysis,” Int. J. Robot. Res., vol. 1, pp. 18–64, 1982. [4] G. Borgefors, “Distance transforms in digital images,” IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-34, pp. 344–371, 1986. [5] , “Hierarchical chamfer matching: A parametric edge matching algorithm,” IEEE Trans. Pattern Anal. Machine Intell., vol. 10, pp. 849–865, 1988. [6] P. Brodatz, Textures: A Photographic Album for Artists and Designers. New York: Dover, 1966. YOU AND BHATTACHARYA: WAVELET-BASED COARSE-TO-FINE IMAGE MATCHING SCHEME [7] K. R. Castleman, Digital Image Processing. Englewood Cliffs, NJ: Prentice-Hall, 1996. [8] C. K. Chui, An Introduction to Wavelets. New York: Academic, 1992. [9] J. R. Corbib, The Art of Distributed Applications. Berlin, Germany: Springer-Verlag, 1991. [10] P. E. Danielsson, “Euclidean distance mapping,” Comput. Graph. Image Process., vol. 14, pp. 227–248, 1980. [11] I. Daubechies, Ten Lectures on Wavelets. Philadephia, PA: SIAM, 1992. [12] W. Forstner and A. Pertl, “Photogrammetric standard methods and digital image matching techniques for high precision surface measurements,” in Pattern Recognition in Practice II, E. S. Gelsema and I. N. Kanal, Eds. Amsterdam, The Netherlands: Elsevier, 1986, pp. 57–72. [13] MPI Forum, “MPI: A message-passing interface standard,” Int. J. Supercomput. Applicat., vol. 8, pp. 165–416, 1994. [14] P. Fua and A. J. Hanson, “Resegmentation using generic shape: Locating general cultural objects,” Pattern Recognit. Lett., vol. 5, no. 3, pp. 243–252, 1987. [15] A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Manchek, and V. Sunderam, PVM: Parallel Virtual Machine, A User’s Guide and Tutorial for Network Computing. Cambridge, MA: MIT Press, 1994. [16] D. B. Gennery, “A stereo vision system for autonomous vehicle,” in Proc. 5th Int. Joint Conf. Artificial Intelligence, Cambridge, MA, 1977. [17] D. Gerogiannis, “Load balancing requirements in parallel implementations of image feature extraction tasks,” IEEE Trans. Parallel Distrib. Syst., vol. 4, no. 9, pp. 994–1013. [18] M. Hahn, “Automatic measurement of digital terrain models by means of image matching techniques,” in Proc. 42nd Photogrammetric Week: Stuttgart Univ., 1989, vol. 13, pp. 141–151. [19] U. V. Helava, “Digital correlation in photogrammetric instruments,” Int. Arch. Photogram. Remote Sensing, vol. 20, no. 2, 1976. [20] R. Haralick and L. G. Shapiro, Computer and Robot Vision. Reading, MA: Addision-Wesley, 1993, vol. 2, ch. 16. [21] “Lehrstuhl fur Photogrammetrie und Fernerkundung,” Techniche University, Munich, Germany, http://dgrwww.epfl.ch/PHOT/publicat/wks96/Art=3=1.html. [22] D. P. Huttenlocher, G. A. Klanderman, and W. J. Rucklidge, “Comparing images using the Hausdorff distance,” IEEE Trans. Pattern Anal. Machine Intell., vol. 15, pp. 850–863, 1993. [23] K. Hwang, Advanced Computer Architecture—Parallelism, Scalability, and Programmability. New York: McGraw-Hill, 1993. [24] A. K. Jain, “A sinusoidal family of unitary transforms,” IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-1, pp. 356–365, 1979. [25] J. Jolion and A. Rosenfeld, A Pyramid Framework for Early Vision: Multiresolutional Computer Vision. Norwell, MA: Kluwer, 1994. [26] J. Koenderink, “The structure of images,” in Biological Cybernetics. New York: Springer-Verlag, 1984. [27] K. I. Laws, “Textured image segmentation,” Ph.D dissertation, Univ. Southern. Calif., Los Angeles, 1980. [28] J. Magarey and N. Kingsbury, “Motion estimation using a complexvalued wavelet transform,” IEEE Trans. Signal Processing, vol. 46, pp. 1069–1084, Apr. 1998. [29] S. G. Mallat, “A theory for multiresolution signal decomposition: The wavelet representation,” IEEE Trans. Pattern Anal. Machine Intell., vol. 11, pp. 676–693, 1989. [30] S. Mallat, “Zero-crossing of a wavelet transform,” IEEE Trans. Inform. Theory, vol. 37, no. 4, pp. 1019–1033, 1991. , A Wavelet Tour of Signal Processing. New York, CA: Academic, [31] 1998. [32] M. Maresca, M. A. Lavin, and H. W. Li, “Parallel architectures for vision,” Proc. IEEE, vol. 76, no. 8, pp. 970–981, 1988. [33] D. Marr and E. Hildreth, “Theory of edge detection,” Proc. R. Soc. Lond. B, vol. 207, pp. 187–217, 1980. [34] H. P. Moravec, “Toward automatic visual obstacle avoidance,” in Proc. 5th Int. Joint Conf. Artificial Intelligence, Cambridge, MA, 1977, p. 584. [35] R. Nevatia and K. E. Price, “Locating structures in aerial images,” IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-4, pp. 476–484, 1982. [36] J. A. Noble, “Finding corners,” Image Comput., vol. 6, no. 2, pp. 121–128, 1988. [37] D. W. Paglieroni, “Distance transforms: Properties and machine vision applications,” Comput. Vis. Graph. Image Process., vol. 54, no. 1, pp. 56–74, 1992. [38] S. K. Pal and A. Rosenfeld, “Image enhancement and thresholding by optimization of fuzzy compactness,” Pattern Recognit. Lett., vol. 7, pp. 77–86, 1988. 1559 [39] H. Pan, “Uniform full-information image matching using complex wavelets,” Germ. J. Photogram. Remote Sensing, vol. 64, pp. 93–100, 1996. [40] A. Rosenfeld, “The fuzzy geometry of image subsets,” Pattern Recognit. Lett., vol. 2, pp. 311–317, 1984. [41] J.-L. Starck, F. Murtagh, and A. Bijaoui, Image Processing and Data Analysis: The Multiscale Approach. New York: Cambridge Univ. Press, 1998. [42] R. Voelpel, Parallel Computing: Fundamentals, Applications and New Directions. Amsterdam, The Netherlands: Elsevier, 1998. [43] A. Witkin, D. Terzopoulos, and M. Kass, “Signal matching through scale space,” Int. J. Comput. Vis., vol. 1, pp. 133–144, 1987. [44] J. Wu, Distributed System Design. Boca Raton, FL: CRC, 1999. [45] J. You, E. Pissaloux, J. L. Hellec, and P. Bonnin, “A guided image matching approach using Hausdorff distance with interesting points detection,” in Proc. IEEE Int. Conf. Image Processing (ICIP’94), Austin, TX, 1994, pp. 968–972. [46] J. You, E. Pissaloux, W. P. Zhu, and H. A. Cohen, “Efficient image matching: A hierarchical Chamfer matching scheme on a distributed system,” Real-Time Imag., vol. 1, pp. 245–259, 1995. [47] J. You, P. Bhattacharya, and S. Hungenahally, “Real-time object recognition: Hierarchical image matching in a parallel virtual machine environment,” in Proc. 14th Int. Conf. Pattern Recognition (ICPR’98), vol. 1, Brisbane, Australia, 1998, pp. 275–277. Jane You (M’94) received the B. Eng. degree in electronic engineering from Xi’an Jiaotong University, China, in 1986 and the Ph.D. degree in computer science from La Trobe University, Australia, in 1992. From 1993 to 1995, she was a Lecturer with the School of Computer and Information Science, University of South Australia. In 1993, she was awarded a French Foreign Ministry International Fellowship and was a Team Member of the Real-Time Object Recognition Project at Université Paris XI, Paris, France. She joined the School of Computing and Information Technology, Griffith University, Queensland, Australia, in 1996. Currently, she is a Senior Lecturer at Griffith University and an Academic Staff Member at the Hong Kong Polytechnic University. Her research interest include visual information retrieval, image processing, pattern recognition, multimedia systems, biometrics computing, and data mining. Prabir Bhattacharya (SM’92) received the B.A. and M.A. degrees from the University of Delhi, India, and the D.Phil. degree in mathematics in 1979 from the University of Oxford, Oxford, U.K., specializing in group theory. He He is a Full Professor with the Department of Computer Science and Engineering, University of Nebraska, Lincoln, which he joined in 1986. In August 1999, he joined (on a leave of absence from the University of Nebraska) the Panasonic Information and Networking Technologies Laboratory, Princeton, NJ, as a Principal Scientist and Project Leader. He is also currently a Visiting Full Professor at the Center for Automation Research, University of Maryland, College Park. He has authored/co-authored 74 journal papers, 45 conference papers, and also co-edited a book on vision geometry. His research has been supported by many external agencies, such as AFOSR, NASA, BMDO, and NIH. He has been on the editorial board of Pattern Recognition since 1994. During 1996–1998, he was on the editorial board of the IEEE Computer Society Press. He is a Fellow of the Institute of Mathematics and Its Applications, U.K. He was a Distinguished Visitor of the IEEE Computer Society during 1996–1999, and a National Lecturer of the Association for Computing Machinery (ACM) during 1996–1997. He was the Chairman of the Nebraska Chapter of the IEEE Computer Society during 1995–1997. He received an award from the IEEE Nebraska Chapter for the Highest Involvement in 1997. He has served in the program committees of many conferences, including the IEEE Image Processing Conferences during 1997–1999, and the IEEE Computer Vision and Pattern Recognition Conference, 1999. He is listed in Who’s Who in Science and Engineering, and Who’s Who in Multimedia and Communications.