Imaging, Robotics, and Intelligent Systems Laboratory

Dr. Claire L. McCullough*
326 Grote Hall, Dept. 2302
University of Tennessee at Chattanooga
615 McCallie Ave.
Chattanooga, TN 37403-2598
*Formerly with the U.S. Army Space and Missile Defense Command, P.O. Box 1500, Huntsville, AL 35807-3801, for whom this work was performed.
In both military and civilian applications,
increasing interest is being shown in fusing
outputs from disparate sensors for increased
situational awareness. This interest is based on
the observation that by combining information
from different types of sensors, object
identification (or pattern recognition) can be
improved in robustness and reliability, due to the
complementary nature of the sensor data
included. The work discussed here successfully
demonstrates fusion of both data and features of
vision and infrared images. This work can be
useful to such disparate applications as search
and rescue, military operations, firefighting, and
improved night vision for the automotive industry.
In previous work, several fusion methods
have been demonstrated for combining thermal
and vision images into a single image
emphasizing the most salient features of the
surrounding environment.
The methods demonstrated have included chromatic image display, with the relative proportions of the two data signals controlling the color of the fused image and placing it somewhere between red and violet in the light spectrum [1]; fusion based on differences between the two images; and fusion based on pixel information content [2]. By using different
fusion functions and by comparing the results
with a known set of images, one can select a
function that can produce optimal fusion for a
given application. Both fuzzy and crisp fusion
functions have been used, with each proving
useful for different fusion tasks.
This paper continues this work, first in the comparison of additional data-level fusion techniques, including a fuzzy “interest”-based approach. This technique
examines the information content of the two
images by measuring the signal variations over
segments of two images, generating a fuzzy
function based on those measurements, and
selectively combining the segments that contain
the most information, in both crisp and
proportional fashions. In addition, feature level
techniques are compared for efficacy with the
data level techniques, by extracting salient
features (e.g., shapes, lines, and edges) from the
fused images, and comparing this with the same
features extracted from the individual images and
then fused using the previously discussed
techniques. The fuzzy/”interest” technique which
is discussed in the paper has been demonstrated to
successfully combine visual and infrared images,
images with different types of noise applied,
“edge” images, and Synthetic Aperture Radar,
infrared, and visual images (fusing three images
into one). While much work remains to be done,
the new method appears to compare favorably
with existing fusion techniques, and to be
applicable to fusion of up to three pixel registered
intensity images from a variety of disparate
sensor types.
Multi-sensor fusion can be approached in many different ways, which fall into three basic categories. In the first approach, data from all sensors
is fused and then feature extraction and/or
identification is carried out with fused data from all
sensors (data level fusion). In the second, the relevant
features are first abstracted from the data and then
fused, to produce either a fused feature set, which is
then used in classification, or a detection/decision
based on the fused feature set (feature level fusion). In
the third category, decisions/detections are based on
the outputs of individual sensors, and the decisions are
then fused (decision level fusion). The fusion methods
discussed here are data level fusion, although these
techniques can also be used with the second approach
if shapes and edges are extracted prior to fusion [2].
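Schematically, the three categories differ only in where the fusion step sits in the processing chain. The sketch below is purely illustrative (the feature extractor, classifier, and fusion operators are trivial placeholders, not anything from the paper):

```python
import numpy as np

# Illustrative placeholders only: a trivial feature extractor, classifier,
# and fusion operators, used to show where fusion sits in each chain.
def extract_features(img):
    return np.array([img.mean(), img.std()])

def classify(features):
    return features.mean() > 0.5

def data_level(img1, img2):
    # Fuse raw pixel data, then extract features and decide.
    return classify(extract_features(0.5 * (img1 + img2)))

def feature_level(img1, img2):
    # Extract features per sensor, fuse the feature sets, then decide.
    return classify(0.5 * (extract_features(img1) + extract_features(img2)))

def decision_level(img1, img2):
    # Decide per sensor, then fuse the decisions (here, a simple OR).
    return classify(extract_features(img1)) or classify(extract_features(img2))
```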
In fusing two images from disparate sensors, it is obviously desirable to include, or more heavily weight, pixels which are more “interesting” on some level than those which are not. “Interest” is defined in this context as relevance or importance to the scene in the images to be fused. While the definition of “interest” could vary widely for different applications, two measures which have been hypothesized for “interest” are:
- how different the pixel is from the pixels around it, and
- the relative intensity of the pixel’s color, or gray level, value.
Both of these are measured for every pixel of each image to be fused, and the results are incorporated into a rule-based fusion system. Fusion is performed in both a “crisp” fashion, in which the more “interesting” of the pixels in the same positions of the images to be fused is inserted into the fused image, and a “proportional” fashion, in which the values of all the pixels in the same positions are combined, with the weighting of each being proportional to its interest measures.

For the first measure of “interest,” an absolute intensity difference is calculated between a pixel and each of the eight pixels which surround it, as shown in Figure 1, and those differences are summed to give a single value $in_i$ for each pixel. As the pixel of concern is in the center of the square, in order to deal with the edges of the image while producing a fused image of the same size as the originals, the edge rows and columns are duplicated. That is, the “interest” $in_i$ is calculated for each original pixel on a matrix two rows and two columns larger than the original, in which the edge pixels are duplicates of those in the edge rows and columns of the original matrix, for each sensor type. This gives a matrix of values for $in_i$ with the same dimensions as the original images.

Figure 1. The pixel of concern is in the center of the square, with absolute intensity differences calculated to each of the surrounding eight pixels
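As an illustrative sketch of this neighbor-difference measure (not the paper's MATLAB implementation; NumPy and the function name are assumptions), the edge-replicated computation might look like:

```python
import numpy as np

def interest_measure(image):
    """Sum of absolute differences between each pixel and its eight
    neighbors; edge rows and columns are duplicated so the result has
    the same dimensions as the input image."""
    img = image.astype(float)
    padded = np.pad(img, 1, mode="edge")   # duplicate edge rows/columns
    rows, cols = img.shape
    interest = np.zeros((rows, cols))
    for dr in (-1, 0, 1):                  # offsets to the eight neighbors
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            neighbor = padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
            interest += np.abs(img - neighbor)
    return interest
```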
The equation for the fuzzy function of “interest” is defined as

$$
fuzin_i = \exp\!\left[-\left(\frac{\left|in_i - mean_i\right|}{spread_i}\right)^{power_i}\right]
$$

where $in_i$ is the “interest” of a given pixel relative to sensor i, $mean_i$ is the center of the fuzzy function for sensor i, $spread_i$ is a parameter governing the width of the function, and $power_i$ is the power to which the value is raised. Choosing
this function for the fuzzy interest value gives a
great deal of flexibility to the user, as by changing
values for power and spread, new families of
functions may be generated with very little effort
by the user. This is illustrated by the graph in
Figure 2, which shows the different shapes which
may be generated by changing the power in the
function. Upon execution of the program, the
user is allowed to either accept default values for
$mean_i$, $spread_i$, and $power_i$, or to enter his own for
each image to be fused.
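A minimal sketch of this membership function, assuming the reconstructed form above (the default parameter values shown are illustrative only, not the program's actual defaults, and the interest values are assumed to be normalized):

```python
import numpy as np

def fuzzy_interest(interest, mean=0.5, spread=0.25, power=2.0):
    """Fuzzy 'interest' membership: peaks where interest equals mean;
    spread controls the width and power controls the shoulder shape."""
    return np.exp(-(np.abs(interest - mean) / spread) ** power)
```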
The crisp version of combination is given by the rule set

$$
fuz\_crsp(a,b) =
\begin{cases}
\dfrac{value_1(a,b) + value_2(a,b)}{2}, & \text{if } fuzin_1(a,b) = fuzin_2(a,b) \text{ and } ntens_1(a,b) = ntens_2(a,b)\\
value_1(a,b), & \text{else, if } ntens_1(a,b) + fuzin_1(a,b) > ntens_2(a,b) + fuzin_2(a,b)\\
value_2(a,b), & \text{otherwise.}
\end{cases}
$$
Figure 2. An illustration of the fuzzy function
with differing values for Power
The “intensity” measure for each pixel is

$$
ntens_i = 2\,\bigl|\,pixelvalue - 0.5\,\bigr|
$$
which gives a value of “1” for pixels which are
either pure white or pure black, and a value
between 0 and 1 for shades of gray.
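In other words (a one-line sketch, assuming pixel values scaled to the range 0 to 1):

```python
import numpy as np

def intensity_measure(pixelvalue):
    """ntens: 1 for pure black (0) or pure white (1), approaching 0 for mid-gray."""
    return 2.0 * np.abs(pixelvalue - 0.5)
```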
Other crisp rules of combination were also
evaluated for image combination, but this was
one of the more visually appealing versions.
Once these measures are calculated, the proportional version of the fused image is generated for each pixel using the rule

$$
fuz\_prop(a,b) = \frac{\sum_{n=1}^{2}\left[fuzin_n(a,b)\,value_n(a,b) + ntens_n(a,b)\,value_n(a,b)\right]}{\sum_{m=1}^{2}\left[fuzin_m(a,b) + ntens_m(a,b)\right]}
$$

where $value_i$ is the intensity of pixel (a, b) in image i.
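As a concrete illustration of the two combination rules (a minimal sketch based on the reconstructed equations above, not the original MATLAB implementation; the inputs are assumed to be pixel-registered images scaled to [0, 1], with the fuzzy interest and intensity arrays computed as sketched earlier):

```python
import numpy as np

def fuse_proportional(value1, value2, fuzin1, fuzin2, ntens1, ntens2):
    """Proportional rule: each pixel value is weighted by its fuzzy
    interest plus its intensity measure, then normalized."""
    num = (fuzin1 + ntens1) * value1 + (fuzin2 + ntens2) * value2
    den = (fuzin1 + ntens1) + (fuzin2 + ntens2)
    return num / np.maximum(den, 1e-12)   # guard against zero weights

def fuse_crisp(value1, value2, fuzin1, fuzin2, ntens1, ntens2):
    """Crisp rule: copy the pixel with the larger combined weight into
    the fused image; average the two values when both measures tie."""
    fused = np.where(ntens1 + fuzin1 > ntens2 + fuzin2, value1, value2)
    tie = np.isclose(fuzin1, fuzin2) & np.isclose(ntens1, ntens2)
    return np.where(tie, 0.5 * (value1 + value2), fused)
```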
All work described here was done on a 300 MHz Pentium II with 98 MB of RAM, equipped with MATLAB. The software accepts input images in the .gif format, although it can be easily adapted to accept other image types as well. The images on which the algorithms are illustrated are found in Figure 3, a vision image, and Figure 4, an infrared image.
Figure 3. Vision image
Figure 4. Infrared image
Figure 5. Image fused using the proportional method
Figure 6. Image generated using crisp fusion

The image formed by fusion using the proportional technique is found in Figure 5, and the crisp image in Figure 6. Note that both the truck, which is clearly visible in the vision image, and the helicopter, which is visible primarily in the infrared image, are visible in the fused images.
In many defense applications, either
before or after fusion with other information,
features are extracted from an image, which can
then be used in pattern recognition, automatic
target classification, etc. For a scene such as
those illustrated in this paper, shapes, lines, and
edges would be among the features ordinarily
extracted. As part of the work described here,
edges were extracted from both IR and vision
images, with the “edge” images then fused. The
resulting image was compared to the “edge”
image obtained from first fusing the images using
the information-based fusion methods described
in Section 2, and then extracting edges from the
fused image. The reasons for this were twofold:
- to see if the “interest” fusion approach was successful as a feature-level fuser, as well as a data-level fusion algorithm; and
- to see which approach, in this example, seemed to retain more of the relevant shape information from the thermal and vision images.

The “edge” images from the thermal and vision images are shown in Figures 7 and 8. All edges shown here were
extracted using the Adobe Photoshop image
processing package. The images generated by fusing these edges are found in Figure 9 for the proportional fusion and in Figure 10 for the crisp fusion. The crisp image has greater contrast, but the proportional method seems to retain more detail in the image.

Figure 7. Edges extracted from the vision image
Figure 8. Edges extracted from the infrared image
Figure 9. Edges fused using the proportional method
Figure 10. Edges fused using the crisp method
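To make the comparison between the two processing orders concrete, the sketch below shows both routes. The edge detector here is a crude gradient-magnitude stand-in (the paper's edges were extracted with Photoshop), and the fuse argument is any two-image fusion function, e.g., a wrapper around the proportional rule sketched earlier:

```python
import numpy as np

def edge_map(img, threshold=0.2):
    """Crude gradient-magnitude edge detector, standing in for the
    Photoshop edge extraction used in the paper."""
    gy, gx = np.gradient(img.astype(float))
    return (np.hypot(gx, gy) > threshold).astype(float)

def edges_then_fuse(vis, ir, fuse):
    """Feature-level route: extract edges per image, then fuse the edge images."""
    return fuse(edge_map(vis), edge_map(ir))

def fuse_then_edges(vis, ir, fuse):
    """Data-level route: fuse the raw images first, then extract edges."""
    return edge_map(fuse(vis, ir))
```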
The feature images generated by
extracting edges from the fused images (i.e., the
images in Figures 5 and 6) are shown in Figures
11 and 12. For the examples used here, it appears
that in the images in which edges were extracted
from previously fused images, additional detail is
retained, as compared to the image in which
edges were extracted and then fused. However, this image also retains additional artifacts which might be mistakenly categorized as objects of interest. While no general conclusions can be drawn from this, it would suggest that in some cases, fusing before shapes are extracted may be beneficial. Obviously, many more images need to be examined in order to develop a general fusion/extraction strategy.

Figure 11. Edges of the image fused with the proportional method
Figure 12. Edges of the crisp fused image
An “interest”-based fusion method for combining thermal and vision images into a single image has been described. Fusion is performed by
evaluating fuzzy functions of differences in
intensity among adjoining pixels, as well as
absolute pixel intensity.
The algorithms
described here were demonstrated to be
efficacious for fusion of both raw imagery, and
extracted shapes and edges. Future work will
include examination of additional images, as well
as additional fusion approaches.
The unprocessed vision and infrared
images used in this paper were provided by the
TNO Physics and Electronics Lab, and were
produced by Defence Research Establishment
Valcartier. The help of both organizations and
permission to include the images is gratefully acknowledged.
[1] M. E. Ulug and C. L. McCullough, "Fusion of
Thermal and Vision Images," presented at the
1998 Conference on Artificial Neural Networks
in Engineering (ANNIE98), St. Louis, MO,
November 1998. Also appeared in Intelligent
Engineering Systems Through Artificial Neural
Networks, Vol. 8, ASME Press, New York, 1998.
[2] M. E. Ulug and C. L. McCullough, “Feature
and Data Level Fusion of Infrared and Visual
Images,” presented at the SPIE Aerosense
Conference, Orlando, FL, April 1999.