Imaging, Robotics, and Intelligent Systems Laboratory

Dr. Claire L. McCullough*
326 Grote Hall, Dept. 2302
University of Tennessee at Chattanooga
615 McCallie Ave.
Chattanooga, TN 37403-2598
*Formerly with the U.S. Army Space and Missile Defense Command, P.O. Box 1500, Huntsville, AL 35807-3801, for whom this work was performed.
In both military and civilian applications,
increasing interest is being shown in fusing
outputs from disparate sensors for increased
situational awareness. This interest is based on
the observation that by combining information
from different types of sensors, object
identification (or pattern recognition) can be
improved in robustness and reliability, due to the
complementary nature of the sensor data
included. The work discussed here successfully
demonstrates fusion of both data and features of
vision and infrared images. This work can be
useful to such disparate applications as search
and rescue, military operations, firefighting, and
improved night vision for the automotive industry.
In previous work, several fusion methods
have been demonstrated for combining thermal
and vision images into a single image
emphasizing the most salient features of the
surrounding environment.
The methods demonstrated have included chromatic image display, with the relative proportions of the two data signals controlling the color of the fused image and placing it somewhere between red and violet in the light spectrum [1]; fusion based on differences between the two images; and fusion based on pixel information content [2]. By using different
fusion functions and by comparing the results
with a known set of images, one can select a
function that can produce optimal fusion for a
given application. Both fuzzy and crisp fusion
functions have been used, with each proving
useful for different fusion tasks.
This paper continues this work, first in the comparison of additional data-level fusion techniques, including a fuzzy “interest”-based approach. This technique
examines the information content of the two
images by measuring the signal variations over
segments of two images, generating a fuzzy
function based on those measurements, and
selectively combining the segments that contain
the most information, in both crisp and
proportional fashions. In addition, feature level
techniques are compared for efficacy with the
data level techniques, by extracting salient
features (e.g., shapes, lines, and edges) from the
fused images, and comparing this with the same
features extracted from the individual images and
then fused using the previously discussed
techniques. The fuzzy/”interest” technique which
is discussed in the paper has been demonstrated to
successfully combine visual and infrared images,
images with different types of noise applied,
“edge” images, and Synthetic Aperture Radar,
infrared, and visual images (fusing three images
into one). While much work remains to be done,
the new method appears to compare favorably
with existing fusion techniques, and to be
applicable to fusion of up to three pixel registered
intensity images from a variety of disparate
sensor types.
Multi-sensor fusion can be approached in many different ways, which fall into three basic categories. In the first approach, data from all sensors
is fused and then feature extraction and/or
identification is carried out with fused data from all
sensors (data level fusion). In the second, the relevant
features are first abstracted from the data and then
fused, to produce either a fused feature set, which is
then used in classification, or a detection/decision
based on the fused feature set (feature level fusion). In
the third category, decisions/detections are based on
the outputs of individual sensors, and the decisions are
then fused (decision level fusion). The fusion methods
discussed here are data level fusion, although these
techniques can also be used with the second approach
if shapes and edges are extracted prior to fusion [2].
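Schematically, the three categories differ only in where the fusion step sits in the processing chain. The sketch below is purely illustrative (the feature extractor, classifier, and fusion operators are trivial placeholders, not anything from the paper):

```python
import numpy as np

# Illustrative placeholders only: a trivial feature extractor, classifier,
# and fusion operators, used to show where fusion sits in each chain.
def extract_features(img):
    return np.array([img.mean(), img.std()])

def classify(features):
    return features.mean() > 0.5

def data_level(img1, img2):
    # Fuse raw pixel data, then extract features and decide.
    return classify(extract_features(0.5 * (img1 + img2)))

def feature_level(img1, img2):
    # Extract features per sensor, fuse the feature sets, then decide.
    return classify(0.5 * (extract_features(img1) + extract_features(img2)))

def decision_level(img1, img2):
    # Decide per sensor, then fuse the decisions (here, a simple OR).
    return classify(extract_features(img1)) or classify(extract_features(img2))
```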
In fusing two images from disparate sensors, it is obviously desirable to include, or more heavily weight, pixels which are more “interesting” on some level than those which are not. “Interest” is defined in this context as relevance or importance to the scene in the images to be fused. While the definition of “interest” could vary widely for different applications, two measures which have been hypothesized for “interest” are:
- how different the pixel is from the pixels around it, and
- the relative intensity of the pixel’s color, or gray level, value.
Both of these are measured for every pixel of each image to be fused, and the results are incorporated into a rule-based fusion system. Fusion is performed in both a “crisp” fashion, in which the more “interesting” of the pixels in the same positions of the images to be fused is inserted into the fused image, and a “proportional” fashion, in which the values of all the pixels in the same positions are combined, with the weighting of each being proportional to its interest measures.

For the first measure of “interest,” an absolute intensity difference is calculated between a pixel and each of the eight pixels which surround it, as shown in Figure 1, and those differences are summed to give a single value $in_i$ for each pixel. As the pixel of concern is in the center of the square, in order to deal with the edges of the image while producing a fused image of the same size as the originals, the edge rows and columns are duplicated. That is, the “interest” $in_i$ is calculated for each original pixel on a matrix two rows and two columns larger than the original, in which the edge pixels are duplicates of those in the edge rows and columns of the original matrix, for each sensor type. This gives a matrix of values for $in_i$ with the same dimensions as the original images.

Figure 1. The pixel of concern is in the center of the square, with absolute intensity differences calculated to each of the surrounding eight pixels
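As an illustrative sketch of this neighbor-difference measure (not the paper's MATLAB implementation; NumPy and the function name are assumptions), the edge-replicated computation might look like:

```python
import numpy as np

def interest_measure(image):
    """Sum of absolute differences between each pixel and its eight
    neighbors; edge rows and columns are duplicated so the result has
    the same dimensions as the input image."""
    img = image.astype(float)
    padded = np.pad(img, 1, mode="edge")   # duplicate edge rows/columns
    rows, cols = img.shape
    interest = np.zeros((rows, cols))
    for dr in (-1, 0, 1):                  # offsets to the eight neighbors
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            neighbor = padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
            interest += np.abs(img - neighbor)
    return interest
```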
The equation for the fuzzy function of “interest” is defined as

$$
fuzin_i = \exp\!\left[-\left(\frac{\left|in_i - mean_i\right|}{spread_i}\right)^{power_i}\right]
$$

where $in_i$ is the “interest” of a given pixel relative to sensor i, $mean_i$ is the center of the fuzzy function for sensor i, $spread_i$ is a parameter governing the width of the function, and $power_i$ is the power to which the value is raised. Choosing
this function for the fuzzy interest value gives a
great deal of flexibility to the user, as by changing
values for power and spread, new families of
functions may be generated with very little effort
by the user. This is illustrated by the graph in
Figure 2, which shows the different shapes which
may be generated by changing the power in the
function. Upon execution of the program, the
user is allowed to either accept default values for
$mean_i$, $spread_i$, and $power_i$, or to enter his own for
each image to be fused.
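A minimal sketch of this membership function, assuming the reconstructed form above (the default parameter values shown are illustrative only, not the program's actual defaults, and the interest values are assumed to be normalized):

```python
import numpy as np

def fuzzy_interest(interest, mean=0.5, spread=0.25, power=2.0):
    """Fuzzy 'interest' membership: peaks where interest equals mean;
    spread controls the width and power controls the shoulder shape."""
    return np.exp(-(np.abs(interest - mean) / spread) ** power)
```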
The crisp version of combination is given by the rule set

$$
fuz\_crsp(a,b) =
\begin{cases}
\dfrac{value_1(a,b) + value_2(a,b)}{2}, & \text{if } fuzin_1(a,b) = fuzin_2(a,b) \text{ and } ntens_1(a,b) = ntens_2(a,b)\\
value_1(a,b), & \text{else, if } ntens_1(a,b) + fuzin_1(a,b) > ntens_2(a,b) + fuzin_2(a,b)\\
value_2(a,b), & \text{otherwise.}
\end{cases}
$$
Figure 2. An illustration of the fuzzy function
with differing values for Power
The “intensity” measure for each pixel is

$$
ntens_i = 2\,\bigl|\,pixelvalue - 0.5\,\bigr|
$$
which gives a value of “1” for pixels which are
either pure white or pure black, and a value
between 0 and 1 for shades of gray.
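In other words (a one-line sketch, assuming pixel values scaled to the range 0 to 1):

```python
import numpy as np

def intensity_measure(pixelvalue):
    """ntens: 1 for pure black (0) or pure white (1), approaching 0 for mid-gray."""
    return 2.0 * np.abs(pixelvalue - 0.5)
```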
Other crisp rules of combination were also
evaluated for image combination, but this was
one of the more visually appealing versions.
Once these measures are calculated, the proportional version of the fused image is generated for each pixel using the rule

$$
fuz\_prop(a,b) = \frac{\sum_{n=1}^{2}\left[fuzin_n(a,b)\,value_n(a,b) + ntens_n(a,b)\,value_n(a,b)\right]}{\sum_{m=1}^{2}\left[fuzin_m(a,b) + ntens_m(a,b)\right]}
$$

where $value_i$ is the intensity of pixel (a, b) in image i.
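As a concrete illustration of the two combination rules (a minimal sketch based on the reconstructed equations above, not the original MATLAB implementation; the inputs are assumed to be pixel-registered images scaled to [0, 1], with the fuzzy interest and intensity arrays computed as sketched earlier):

```python
import numpy as np

def fuse_proportional(value1, value2, fuzin1, fuzin2, ntens1, ntens2):
    """Proportional rule: each pixel value is weighted by its fuzzy
    interest plus its intensity measure, then normalized."""
    num = (fuzin1 + ntens1) * value1 + (fuzin2 + ntens2) * value2
    den = (fuzin1 + ntens1) + (fuzin2 + ntens2)
    return num / np.maximum(den, 1e-12)   # guard against zero weights

def fuse_crisp(value1, value2, fuzin1, fuzin2, ntens1, ntens2):
    """Crisp rule: copy the pixel with the larger combined weight into
    the fused image; average the two values when both measures tie."""
    fused = np.where(ntens1 + fuzin1 > ntens2 + fuzin2, value1, value2)
    tie = np.isclose(fuzin1, fuzin2) & np.isclose(ntens1, ntens2)
    return np.where(tie, 0.5 * (value1 + value2), fused)
```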
All work described here was done on a 300 MHz Pentium II with 98 MB of RAM, equipped with MATLAB. The software accepts input images in the .gif format, although it can be easily adapted to accept other image types as well. The images on which the algorithms are illustrated are found in Figure 3, a vision image, and Figure 4, an infrared image.
Figure 3. Vision image
Figure 4. Infrared image
Figure 5. Image fused using the proportional method
Figure 6. Image generated using crisp fusion

The image formed by fusion using the proportional technique is found in Figure 5, and the crisp image in Figure 6. Note that both the truck, which is clearly visible in the vision image, and the helicopter, which is visible primarily in the infrared image, are visible in the fused images.
In many defense applications, either
before or after fusion with other information,
features are extracted from an image, which can
then be used in pattern recognition, automatic
target classification, etc. For a scene such as
those illustrated in this paper, shapes, lines, and
edges would be among the features ordinarily
extracted. As part of the work described here,
edges were extracted from both IR and vision
images, with the “edge” images then fused. The
resulting image was compared to the “edge”
image obtained from first fusing the images using
the information-based fusion methods described
in Section 2, and then extracting edges from the
fused image. The reasons for this were twofold:
- to see if the “interest” fusion approach was successful as a feature-level fuser, as well as a data-level fusion algorithm; and
- to see which approach, in this example, seemed to retain more of the relevant shape information from the thermal and vision images.

The “edge” images from the thermal and vision images are shown in Figures 7 and 8. All edges shown here were
extracted using the Adobe Photoshop image
processing package. The images generated by fusing these edges are found in Figure 9 for the proportional fusion and in Figure 10 for the crisp fusion. The crisp image has greater contrast, but the proportional method seems to retain more detail in the image.

Figure 7. Edges extracted from the vision image
Figure 8. Edges extracted from the infrared image
Figure 9. Edges fused using the proportional method
Figure 10. Edges fused using the crisp method
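To make the comparison between the two processing orders concrete, the sketch below shows both routes. The edge detector here is a crude gradient-magnitude stand-in (the paper's edges were extracted with Photoshop), and the fuse argument is any two-image fusion function, e.g., a wrapper around the proportional rule sketched earlier:

```python
import numpy as np

def edge_map(img, threshold=0.2):
    """Crude gradient-magnitude edge detector, standing in for the
    Photoshop edge extraction used in the paper."""
    gy, gx = np.gradient(img.astype(float))
    return (np.hypot(gx, gy) > threshold).astype(float)

def edges_then_fuse(vis, ir, fuse):
    """Feature-level route: extract edges per image, then fuse the edge images."""
    return fuse(edge_map(vis), edge_map(ir))

def fuse_then_edges(vis, ir, fuse):
    """Data-level route: fuse the raw images first, then extract edges."""
    return edge_map(fuse(vis, ir))
```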
The feature images generated by
extracting edges from the fused images (i.e., the
images in Figures 5 and 6) are shown in Figures
11 and 12. For the examples used here, it appears
that in the images in which edges were extracted
from previously fused images, additional detail is
retained, as compared to the image in which
edges were extracted and then fused. However, this image also retains additional artifacts which might be mistakenly categorized as objects of interest. While no general conclusions can be drawn from this, it would suggest that in some cases, fusing before shapes are extracted may be beneficial. Obviously, many more images need to be examined in order to develop a general fusion/extraction strategy.

Figure 11. Edges of the image fused with the proportional method
Figure 12. Edges of the crisp fused image
An “interest”-based fusion method for combining thermal and vision images into a single image has been described. Fusion is performed by
evaluating fuzzy functions of differences in
intensity among adjoining pixels, as well as
absolute pixel intensity.
The algorithms
described here were demonstrated to be
efficacious for fusion of both raw imagery, and
extracted shapes and edges. Future work will
include examination of additional images, as well
as additional fusion approaches.
The unprocessed vision and infrared
images used in this paper were provided by the
TNO Physics and Electronics Lab, and were
produced by Defence Research Establishment
Valcartier. The help of both organizations and
permission to include the images is gratefully acknowledged.
[1] M. E. Ulug and C. L. McCullough, "Fusion of
Thermal and Vision Images," presented at the
1998 Conference on Artificial Neural Networks
in Engineering (ANNIE98), St. Louis, MO,
November 1998. Also appeared in Intelligent
Engineering Systems Through Artificial Neural
Networks, Vol. 8, ASME Press, New York, 1998.
[2] M. E. Ulug and C. L. McCullough, “Feature
and Data Level Fusion of Infrared and Visual
Images,” presented at the SPIE Aerosense
Conference, Orlando, FL, April 1999.