Mine Detection Techniques Using Multiple Sensors

ECE 501 the Project in lieu of Thesis
Mine Detection Techniques Using Multiple Sensors
Cheolha Pedro Lee
Electrical and Computer Engineering
The University of Tennessee at Knoxville
lee@iristown.engr.utk.edu
Acknowledgement
I would like to thank Dr. Mongi Abidi for his support for this work and Dr. Joonki Paik for his advice.
I also would like to thank my committee members, Dr. Hairong Qi and Dr. Michael Roberts for evaluating this
work.
I would like to thank Mr. Sunho Lee and Mr. Jeongho Shin for their previous research about this topic.
I appreciate Dr. Luc van Kempen and Dr. Bart Scheers for providing their papers individually.
I also appreciate Miss Lydia Fanning for her help to edit this report.
Especially, I want to thank my wife Soomi for her support and patience during the last few months.
Abstract
This work has four main objectives: to survey previous research on mine detection, to collect signature data for
future research, to experiment with the surveyed methods and collected data, and to propose an idea for an
advanced method.
This report mainly covers the surveyed image processing methods and their implementation. A method
for finding mine targets from a set of candidates is also proposed.
Sensor technology is introduced before the image processing sections to help the reader understand the character of the
data signal.
The solution is intended to analyze a set of infrared data sequences, an approach called dynamic
thermography. The necessary image processing methods are introduced under four topics: filtering, feature
extraction, gray-scale morphology, and segmentation.
In the detailed description, the Karhunen-Loeve transformation is introduced as a feature extraction method.
Several gray-scale morphology operators are introduced for filtering, gradient computation, and
segmentation. The alternating sequential filter is used for filtering in particular. Finally, the marker-based
watershed algorithm is introduced as the segmentation method. These methods are tested with actual
mine data.
Table of Contents
Chapter 1. Introduction
1.1 Mine Facts and Motivation
1.2 Structure of This Project Report
Chapter 2. Classification of Mines
Chapter 3. Sensor Technology
3.1 The Ground Penetrating Radar (GPR)
3.1.1 A, B, and C-Scan
3.1.2 Preprocessing
3.2 The Infrared Sensor (IR)
3.2.1 Detectors
3.2.2 Dynamic Thermography
3.3 The Ultra Sound Sensor (US)
3.4 Summary and Remarks
Chapter 4. Signal and Image Processing
4.1 Filtering
4.1.1 The Wiener Filter
4.2 Feature Extraction
4.2.1 The Karhunen and Loeve Transformation (KLT)
4.2.2 The Kittler and Young Transformation (KYT)
4.3 Gray Scale Morphology Application
4.3.1 Operators
4.3.2 Morphological Gradient
4.3.3 Smoothing and Noise Reduction Using the Alternating Sequential Filter
4.4 Contrast Enhancement
4.4.1 Morphological Contrast Enhancement
4.4.2 Histogram Equalization
4.5 Segmentation Using Watershed
4.5.1 Basic Concept
4.5.2 Geodesic Functions
4.5.3 Reconstruction
4.5.4 General Watershed vs. Marker-Based Watershed
Chapter 5. Proposed Method – The Selection Rule
5.1 Candidates
5.2 Selection Rule
Chapter 6. Experiments
6.1 Procedure
6.2 Case Study 1
6.3 Case Study 2
6.4 Case Study 3
6.5 Case Study 4
Chapter 7. Conclusions
Chapter 8. References
Chapter 1. Introduction
The objective and structure of this report are introduced in this chapter. Section 1.1 describes the current mine
problem and the aim of this report. This section is intended to explain the motivation for mine detection research.
Section 1.2 explains the structure of this report and gives a brief description of the content of each
chapter.
1.1 Mine Facts and Motivation
There are more than 100 million antipersonnel or antitank mines in more than 70 countries at this moment.
Table 1-1 shows how many landmines have been distributed in the world.
              Mines (million)
Country       UN²    USSD³   Cleared Mines   Mined Area (km²)   Cleared Area (km²)   Casualties¹
Afghanistan   10     7       158,000         550~780            202                  300~360/month
Angola        15     15      10,000          Unknown            2.4                  120~200/month
Bosnia        3      1       49,010          300                84                   50/month
Cambodia      6      6       83,000          3,000              73.3                 38,786 or 100/month
Croatia       3      0.4     8,000           11,910             30                   677
Egypt         23     22.5    11,000,000      3,910              924                  8,301
Eritrea       1      1       Unknown         Unknown            2.48                 2,000
Iran          16     16      200,000         40,000             0                    6,000
Iraq          20     10      37,000          Unknown            1.25                 6,715
Laos          NA     NA      251             43,098             Unknown              10,649 or 16~18/month
Mozambique    3      1       58,000          Unknown            28                   1,759
Somalia       1      1       32,511          Unknown            127                  4,500
Sudan         1      1       Unknown         800,000            0                    700,000
Vietnam       3.5    3.5     58,747          Unknown            65                   180/month
Table 1-1 Worldwide landmine distribution and clearance performance [1], [2]
¹ Casualty reporting varies drastically among countries; estimates are provided by the UN or the host government.
² UN Landmine Database 1997 [1]
³ US State Department Report “Hidden Killer 1998: The Global Landmine Crisis” [2]
It has been estimated that more than 26,000 people are killed or maimed by mines every year, which is one
victim every 20 minutes. Some countries have severe mine casualties, and most of the victims are children. In
Cambodia, one out of every 236 people is a landmine amputee. The casualty ratio rises to one out of every 140
people in Angola, where there are more mines than people. Manual demining is also extremely
dangerous work: in addition to the civilian victims, one deminer has been killed for every 2,000 mines removed [1].
The cost to buy and lay a typical antipersonnel mine is between $3 and $30, while the cost to remove a mine is
between $300 and $1,000. The European Commission and the United States have invested 138 million dollars
in mine action over the last two years [1], but this is just the tip of the iceberg considering the present clearance
rate. In 1994, approximately 0.2 million mines were removed, while 2 million new mines were planted. Many
experts believe that it would take more than 10 centuries to remove every mine in the world at the current
clearance rate, even if no additional mines are planted [3].
Besides the human victims and the money and time invested in demining, mines make large areas of fertile farmland
and waterways inaccessible. In Cambodia, between 25 and 40% of the rice fields have been mined and
abandoned. Considering that many mine-afflicted countries are still in poverty, this is a serious problem.
Since current mines contain less metallic material than older types, they are difficult to detect with the
technology currently employed, the metal detector (MD). Also, MD has been reported to raise too many false
alarms in former battlefields because of small fragments of munitions. Manual detection, called probing, works
well for all kinds of mines, but its labor cost and slow speed are encouraging the development of other
techniques.
Although some military demining equipment was developed and used by the US
Army during the Gulf War, civilian demining, called humanitarian demining, is quite different from military work. The
objective of humanitarian demining is to find and remove abandoned landmines without any hazard to the
environment. These landmines were intended for military use when they were planted, but their purpose has
expired. Also, humanitarian demining equipment is required to be very accurate, while military equipment can
afford some casualty risk.¹
Many kinds of technology in the areas of sensor physics, signal processing, and robotics have been studied for
mine detection during the last decade. This report introduces some sensor technologies and signal processing methods,
specifically image processing methods. The methods and their purpose are limited to the scope of
humanitarian demining rather than military demining. The objective of this report is to survey, document, and implement
those techniques.
¹ The UN requires a detection probability of 99.96%, the ability to find an object of 4cm radius at 10cm depth, and
localization to within a radius of 0.5m [1].
1.2 Structure of This Project Report
Chapter 2 will explain the target of this work, mines. The mines will be categorized by their structure and
purpose. Chapter 3 will introduce sensor technology. The content will be divided into three parts, the ground
penetrating radar, the infrared sensor, and the ultrasound sensor. Some physical explanation of how the sensors
work and some hardware related, low-level, signal processing will be introduced. Chapter 4 will introduce some
high-level signal processing, especially image processing techniques to visualize and recognize mine targets.
The content is divided into five parts: filtering, feature extraction, gray-scale morphology application, contrast
enhancement, and segmentation. A method to find mine targets from a set of candidates will be proposed in
Chapter 5. Chapter 6 will present experiments with the techniques introduced in this report.
Chapter 2. Classification of Mines
This chapter explains the targets of this work, abandoned mines.
We are looking for abandoned explosives, whose purpose has expired. According to their original purpose,
they can be categorized into three types, antipersonnel mines (APM), antitank mines (ATM), and unexploded
ordnance (UXO) as shown in Table 2-1.
Type¹                   APM                  ATM                  UXO
Weight                  Light (100g ~ 4kg)   Heavy (6kg ~ 11kg)   Various
Size (diameter)         6 ~ 15cm             13 ~ 40cm            Various
Target                  Human                Vehicle              None intentional, but can be anything
Case Material           Plastic              Plastic, metal       Mostly metal
Detonatable Pressure²   500g                 120kg                Unpredictable
Table 2-1 Classification of mines [4]
Generally, UXO means misfired shells or bombs that should have exploded already but remain for
some reason. UXO are usually found in former battlefields. MD can easily detect them because they were
not intentionally hidden.
Many ATM include metallic material, and they are bigger than APM, as shown in Table 2-1.
Since they are designed to destroy only vehicles, they are detonated by high pressure or by the presence of a
large metallic object. Thus, demining work for ATM is relatively safer than for APM. Also, detection is
relatively easier because of their size and material.
¹ Some references define UXO as the more general category including all kinds of mines.
² The numbers give the minimum pressure needed to detonate the most sensitive mine in each category.
Figure 2-1 Examples of ATM; (a) TM-62M, (b) TMA-2 [4]
Figure 2-1 shows some examples of ATM. The TM-62M is an example of a large mine with a metallic case. Its diameter is
31cm [4]. MD can easily detect this type of mine. Its detonator is insensitive enough that deminers can approach
it closely. The TMA-2 is an example of a plastic case ATM. Table 2-2 shows the specifications of the mines in Figure
2-1.
Type                  TM-62M                         TMA-2
Dimension             Height 112mm, Diameter 316mm   Height 140mm, Width 260×200mm
Weight                8.47kg                         7.5kg
Case                  Steel                          Plastic
Sensitivity           200kg                          120kg
Manufactured Nation   Former Soviet Union            Former Yugoslavia
Table 2-2 Specifications of the ATM in Figure 2-1 [4]
APM are the most difficult mines to find and remove, and they have caused the most civilian casualties. Most APM are made
of nonmetallic material, and they are much smaller than ATM. The detonator is very sensitive to
pressure. APM can be divided into three types: blast, bounding fragmentation, and directional fragmentation [4].
The blast type mines are the most common targets of humanitarian demining work. They are relatively
smaller and lighter than the other types. Sometimes they move by floating on rivers. They are usually buried
underground, but some models can be scattered from an airplane. Therefore, they can be found in
any place: underground, on the surface, or at a riverside. Since their mechanism is simple and their material is
cheap, small military groups have been able to manufacture this type of mine. This fact has resulted in serious
mine problems for some poor countries that cannot afford the investment for demining work.
The bounding fragmentation type mines are relatively bigger than the blast type, but they can destroy a larger
area, while the blast type mines damage only the target within a limited distance. They are buried underground
or fixed on the ground. Direct pressure or a trip wire activates their detonator. Once the trigger is activated, they
bound up to some altitude and explode, spreading their fragments over an area of up to 30m in radius, which is
very lethal.
Most directional fragmentation type mines are set up on the ground, and they spread their fragments in a
specific direction. The lethal range of some models is more than 200m. Since they can be detonated by manual operation
as well as by a trip wire, this type of mine is sometimes considered an active weapon.
Figure 2-2 Examples of APM; (a) PRB-M35, (b) PMN, (c) VALMARA-69, (d) MON-100 [4]
Type           PRB-M35        PMN                   VALMARA-69                               MON-100
Dimension¹     H58mm, D64mm   H56mm, D112mm         H105mm, D130mm                           H82mm, D236mm
Weight         158g           600g                  3.3kg                                    5.0kg
Case           Plastic        Rubber                Plastic                                  Steel
Sensitivity    8kg            8kg                   10.8kg directly, 6kg through trip wire   Depend on fuse
Lethal Range   -              -                     Radius 27m                               100m by 9.5m arc
Nation         Belgium        Former Soviet Union   Italy                                    Former Soviet Union
Table 2-3 Specifications of the APM in Figure 2-2 [4]
¹ H means height, and D means diameter.
Figure 2-2 shows some examples of APM [4], and Table 2-3 lists their specifications. Both
the PRB-M35 and the PMN are blast type mines, which can be detonated by 8kg of pressure. The PRB-M35 is
an example of the smallest mines. Its diameter is 6cm, as small as the diameter of a Coke can. If such mines are
buried or scattered on ground covered with vegetation, they are very difficult to find. Also, some lighter
mines float on water, so their distribution is unpredictable after heavy rains or flooding. The PMN is
an example of a nonmetallic mine. Its cover is a rubber plate. It is also a good example of a low cost mine.
The Valmara-69 is an example of a bounding fragmentation type mine. Once it is detonated, it is propelled upward and
explodes, spreading 2,000 fragments over an area of up to 27m in radius. The MON-100 is an example of a directional
fragmentation type mine. Its lethal range is up to 100m, covering a 9.5m arc.
Most mines are now made of nonmetallic material. Also, the usual targets of the humanitarian demining work
are too small to be identified by human eyes. These facts encourage development of a low cost mine detection
technique useful for small nonmetallic mines.
Chapter 3. Sensor Technology
Since military parties used MD to find mines in World War II, many kinds of sensors have been tried for mine
detection applications. Three of them are introduced in this chapter: the ground penetrating radar, the infrared
sensor, and the ultrasound sensor. The intention of this chapter is not to go through the complicated physical
principles of how the sensors work but to give some brief information about the three sensors and some low-level
signal processing techniques related to hardware. High-level signal processing techniques, above the
sensor level, will be introduced in Chapter 4.
3.1 The Ground Penetrating Radar (GPR)
While most sensors are passive devices, GPR actively emits electromagnetic (EM) waves through a wideband
antenna and collects the signal reflected from its surroundings. The principle of GPR is somewhat similar to
seismic wave measurement except for the type of signal. The frequency band of the EM waves is between 100 MHz and
100 GHz [5]. This is a fairly high frequency band.
A reflection occurs when the emitted signal encounters a surface between two electrically different materials.
The intensity and direction of the reflection depend on two factors, the roughness of the surface and the electric
property of the medium material [5]. A rough surface reflects the incoming radiation in a diffused manner, while
a smooth surface tends to reflect at the same angle as the incoming radiation with respect to the surface normal.
The electric property of the medium determines the refraction and absorption level of the EM waves and
subsequently affects the direction and intensity of the reflection.
The penetration depth of the signal into soil usually depends on two factors, the humidity of the soil and the
wavelength of the signal [5]. Both a high water content in the soil and a shorter signal wavelength seriously limit the
penetration depth. Because of these reflection and penetration properties, the desired conditions for
mine detection using GPR are dry sand and a low frequency signal, but unfortunately a low
frequency signal tends to produce low resolution images. Since the EM waves cannot penetrate water, GPR cannot
detect mines located underwater, which is a common case in some countries [6].
Two kinds of information are provided by GPR. The first is the presence of an object, detected by an
interruption of the signal. The second is the position of the object, measured by the time delay ∆t between
emitting a signal and receiving the reflection as
$$R = \frac{v}{2}\,\Delta t, \qquad (3.1)$$
where v is the velocity of the EM waves in the medium, and R is the detected position of the object [5].
Since many parameters of the EM waves, as well as the velocity, are varied by the soil content, the soil
parameter estimation should be done prior to the measurement [7].
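As a rough illustration of Equation (3.1), the sketch below converts a two-way time delay into a range. The relation v = c/sqrt(eps_r) is the usual low-loss approximation and is an assumption here, not something stated in this report; the function names and the permittivity value are likewise hypothetical.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum (m/s)

def wave_velocity(rel_permittivity):
    """Approximate EM wave velocity in a low-loss medium.

    Assumption (not from the report): v = c / sqrt(eps_r).
    """
    if rel_permittivity < 1.0:
        raise ValueError("relative permittivity must be >= 1")
    return C0 / math.sqrt(rel_permittivity)

def range_from_delay(delay_s, velocity):
    """Equation (3.1): R = (v / 2) * dt, where dt is the two-way travel time."""
    return 0.5 * velocity * delay_s

if __name__ == "__main__":
    # eps_r = 4 is a hypothetical value chosen only for illustration.
    v = wave_velocity(4.0)
    print(f"range = {range_from_delay(1.0e-9, v) * 100:.1f} cm")
```

The factor 1/2 accounts for the signal traveling to the reflector and back, which is why an error in the estimated velocity translates directly into an error in the reported range.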
3.1.1 A, B, and C-Scan
This section explains the format of GPR data.
GPR data are classified as A, B, and C-scan according to the dimension.
An A-scan is obtained by a stationary measurement, emission and collection of a signal after placing the
antenna above the position of interest. The collected signal is presented as signal strength vs. time delay.
Figure 3-1 An example of an A-scan (1×500) [28]
Figure 3-1 is an example of an A-scan acquired with an ultra wide band (UWB) GPR in laboratory conditions.
A PMN APM, 112mm in diameter and 56mm in height, is buried at 5cm depth in a sandbox of 50cm × 50cm.
2500 measurements were done at intervals of 1cm in both directions. 500 data points were sampled at intervals
of 10 psec per measurement [28].
There are two peaks between data points 50 and 200. They indicate interruptions along the downward path.
The positions of these peaks correspond to the distance between the antenna and the reflecting surface. The first
peak is the air-ground reflection, and the second is the mine target.
An A-scan can be mathematically represented as
$$f(z) = A(x_i, y_j, z_k), \qquad (3.2)$$
where i and j are constant, and k varies from 1 to N, the number of data samples [6].
A B-scan is obtained as an ensemble of A-scans, measured by a line scan. The collected signal is presented as
intensity on the plane of scanned width vs. time delay.
Figure 3-2 An example of a B-scan (500×50) [28]
Figure 3-2 is the B-scan of the same object shown in Figure 3-1. One B-scan consists of 50 A-scans. The
vertical axis corresponds to the horizontal axis of Figure 3-1, and the horizontal axis represents the scanned
width, which is the number of A-scans. The intensity or color of each pixel indicates the signal strength,
corresponding to the vertical axis of Figure 3-1. A hyperbola shape is observed at data point 150, and a
horizontal line at data point 100. The horizontal line is the air-ground surface, and the hyperbola shaped object is
the mine target. The A-scan could detect only the existence of the two objects in Figure 3-1, but a B-scan can
distinguish a mine-like target from the air-ground surface and can give more information about the position of
the object as shown in Figure 3-2.
A B-scan can be mathematically represented as
$$f(x, z) = A(x_i, y_j, z_k), \qquad (3.3)$$
where j is constant, i varies from 1 to L, the distance of antenna movement, and k varies from 1 to N, the number
of data samples [6]. A B-scan can be presented as a two-dimensional image, while an A-scan is a one-dimensional graph.
A C-scan is obtained as an ensemble of B-scans, measured by repeated line scans along a plane, which is
parallel to the surface. The collected signal is presented as intensity in a box of scanned region vs. time delay.
A C-scan can be mathematically represented as
$$f(x, y) = A(x_i, y_j, z_k), \qquad (3.4)$$
where i and j vary from 1 to L and M, respectively, and k varies from 1 to N [6]. L and M indicate the size of the scanned area,
and N indicates the number of data samples. A C-scan is a three-dimensional data structure, while a B-scan is a
two-dimensional image. Since it is difficult to visualize a three-dimensional structure, a C-scan is usually
represented as a horizontal slice for a specific data point, which indicates the depth level. Both horizontal and
vertical axes correspond to the horizontal axis of a B-scan, and the depth level corresponds to the vertical axis of
the B-scan.
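The relation between the three scan types can be sketched as array slicing. The block below assumes the measurements are stored in a NumPy array indexed as A[i, j, k], with the 50 × 50 × 500 dimensions quoted for Figures 3-1 to 3-3; the array layout, the function names, and the random stand-in data are assumptions for illustration, and indices are 0-based.

```python
import numpy as np

# Hypothetical volume A[i, j, k]: 50 x 50 antenna positions, 500 time samples
# per position. Random numbers stand in for real measurements.
rng = np.random.default_rng(0)
volume = rng.standard_normal((50, 50, 500))

def a_scan(vol, i, j):
    """Equation (3.2): i and j fixed, k varies. Returns a 1-D trace of length N."""
    return vol[i, j, :]

def b_scan(vol, j):
    """Equation (3.3): j fixed, i and k vary.

    Returns an (N, L) image with time delay on the vertical axis.
    """
    return vol[:, j, :].T

def c_scan_slice(vol, k):
    """Horizontal slice of the C-scan at time sample k. Returns an (L, M) image."""
    return vol[:, :, k]

def sample_to_delay_ns(k, dt_ps=10.0):
    """Convert a sample index to a time delay in nsec (10 psec sampling interval)."""
    return k * dt_ps / 1000.0
```

With this layout, column i of a B-scan is the A-scan taken at position (i, j), and sample 133 corresponds to a delay of 1.33 nsec at the 10 psec sampling interval.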
Figure 3-3 An example of a C-scan (50×50) [28]; horizontal slices (a) to (l) at data points 133 to 166 at intervals of 3
depth points
Figure 3-3 is the C-scan of the same object as in Figure 3-1 and Figure 3-2. Figure 3-1 was acquired at the point
(25,25) heading down. Figure 3-2 was acquired along the line x = 25, also heading down. Figure (a) is acquired at 1.33 nsec
after the signal emission, and the subsequent images are acquired at intervals of 0.03 nsec. Since these images
show horizontal cross sections of the test area, we cannot clearly see the air-ground surface. However, we can
distinguish the mine target from its background, and even see the shape of the target. Figures (c), (d), and (e)
show the upper level of a PMN APM as two parts, a small detonating cap and a large cylindrical case. Figures (h)
and (i) are assumed to be the bottom part of the target. Considering the actual size of the target and the scanned
area, the visualized shape looks clearly bigger than its real size.¹ This magnifying distortion can be reduced by
the migration of a B-scan [7].
¹ The diameter of a PMN is 112mm, and the size of the scanned area is 50cm × 50cm.
3.1.2 Preprocessing
This section explains some low level signal processing techniques of GPR as preprocessing methods.
As Figure 3-2 indicates, targets tend to show a hyperbolic shape in the B-scan. This is due to the fact that the
EM wave propagates in an omni-directional way.
Figure 3-4 A schematic diagram of how a B-scan is obtained [7]
Figure 3-4 is a schematic diagram of how a B-scan is obtained. An antenna moves along a line parallel to the
surface, which is the horizontal direction, and acquires the reflected signal from the object at regular intervals.
Each vertical line indicates one A-scan, and the black dots represent the position of the impulse, which indicates the
existence of the object. Since the object is closest to position A, the time delay measured there is the shortest, d, while the
delay measured at the more distant positions B and C is longer, d + ∆t. Therefore, the time delay ∆t carries information
about the position of the object. It also varies with the local soil condition, because the electric property of the medium affects the
velocity of the EM wave.
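The hyperbolic shape follows from simple geometry. A minimal sketch, assuming a point reflector at horizontal position x0 and depth d, an antenna moving along the surface, and a constant velocity v in the medium: the two-way delay at antenna position x is t(x) = 2 * sqrt(d^2 + (x - x0)^2) / v. The function name and the numerical values below are hypothetical.

```python
import numpy as np

def hyperbola_delay(x, x0, depth, velocity):
    """Two-way travel time from antenna position x to a point reflector at (x0, depth).

    Assumes a constant wave velocity and an antenna located on the surface.
    """
    x = np.asarray(x, dtype=float)
    return 2.0 * np.sqrt(depth ** 2 + (x - x0) ** 2) / velocity

# 50 antenna positions at 1 cm spacing, reflector 5 cm deep below position 25.
positions = 0.01 * np.arange(50)
delays = hyperbola_delay(positions, x0=positions[25], depth=0.05, velocity=1.5e8)
```

The delay is smallest directly above the reflector and grows symmetrically on both sides. The curvature depends on the velocity, which is why the hyperbola also carries information about the soil condition.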
The hyperbola gives information about the existence and position of the object and the soil condition.
Hyperbolas can be detected by the Hough transformation. In order to detect hyperbolas, each hyperbola must be
separated from the background clutter prior to the detection process [7].
Although the Hough transformation may detect the hyperbolas, it does not improve the magnified distortion of
a C-scan. Migration is a technique used to provide an exact physical location and shape of the reflectors in the
subsurface. The idea of migration is to recombine scattered measured points into one position [7], which means
recombining the black dots in Figure 3-4 around position A.
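One classical way to recombine the scattered points is diffraction summation, sketched below. This is an illustrative choice and not necessarily the migration algorithm used in [7]. It assumes a constant velocity, an antenna on the surface, and nearest-sample rounding without amplitude weighting; the function name and parameters are hypothetical.

```python
import numpy as np

def diffraction_sum_migration(bscan, dx, dt, velocity):
    """Migrate a B-scan of shape (n_samples, n_traces) by summing along hyperbolas.

    For every image point (x0, z), the samples that a point reflector at that
    location would have produced in each trace are added together.
    dx is the trace spacing (m) and dt the sampling interval (s).
    """
    bscan = np.asarray(bscan, dtype=float)
    n_samples, n_traces = bscan.shape
    x = np.arange(n_traces) * dx                    # antenna positions (m)
    z = np.arange(n_samples) * dt * velocity / 2.0  # depth of each time sample (m)
    traces = np.arange(n_traces)
    image = np.zeros_like(bscan)
    for ix0 in range(n_traces):
        # Distance from every antenna position to each depth below x[ix0].
        dist = np.sqrt(z[:, None] ** 2 + (x[None, :] - x[ix0]) ** 2)
        # Two-way delay expressed as a sample index, rounded to the nearest sample.
        k = np.rint(2.0 * dist / (velocity * dt)).astype(int)
        valid = k < n_samples
        k = np.where(valid, k, 0)
        picked = bscan[k, traces[None, :]]          # shape (n_samples, n_traces)
        image[:, ix0] = np.sum(picked * valid, axis=1)
    return image
```

Energy spread along a hyperbola in the input adds up coherently only at the image point corresponding to its apex, so a point reflector is focused back toward a single location.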
Figure 3-5 Clutter removal; (a) original GPR image, (b) image with background clutter removed [8]
One of the major problems in GPR data processing is to remove the air-ground reflection. This is the dominant
horizontal line in Figure 3-5 (a). Also, the background clutter, caused by inhomogeneous soil content, is another
undesirable property. An estimation of clutter is necessary to remove clutter from data, and sensor calibration
can be used for clutter estimation [9].
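A very simple clutter estimate, shown here only as an illustration, is the mean trace of a B-scan. Subtracting it cancels any reflection that arrives at the same time delay in every A-scan, such as a flat air-ground interface. This is a common baseline and is an assumption here; it is not the calibration-based estimation referred to in [9], and the function name is hypothetical.

```python
import numpy as np

def remove_background(bscan):
    """Subtract the mean trace (average over antenna positions) from every A-scan.

    bscan has shape (n_samples, n_traces). A reflection at the same time delay
    in every trace, such as a flat air-ground interface, is cancelled.
    """
    bscan = np.asarray(bscan, dtype=float)
    mean_trace = bscan.mean(axis=1, keepdims=True)
    return bscan - mean_trace
```

The estimate works best when targets occupy only a small fraction of the traces. A target that spans most of the scan would be partly removed together with the clutter.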
3.2 The Infrared Sensor (IR)
IR radiation is the portion of the EM spectrum lying between the visible and microwave regions, with
wavelengths between 0.75µm and 1mm [5]. Although all EM radiation produces heat, IR radiation is the part most
readily detected through its heating effect. Heated materials provide good sources of infrared radiation. Therefore, IR
radiation is referred to as thermal radiation.
Since IR data are easier to visualize than data from other sensors, IR has been widely used for mine detection. Another advantage
of IR is that it does not need as much preprocessing as GPR. However, the performance of IR relies
highly on the environment at the moment of measurement.
Uniquely, IR can work either actively or passively. It can work by accepting only the natural radiation
from the object, or it can provide an extra heat source and receive the artificial radiation created by that heat
source [10].
3.2.1 Detectors
The detector is a transducer converting the energy of EM radiation into an electrical signal. There are two
kinds of IR detectors, the photon detector and the thermal detector [11]. A basic distinction between the two
detectors exists in the manner of how they respond to radiation.
The photon detector, or photon counter, essentially measures the rate at which quanta are absorbed, whereas
the thermal detector measures the rate at which energy is absorbed [11]. Therefore, the photon detector is a
selective detector of infrared, responding only to photons of sufficiently short wavelength. Its response
at any wavelength is proportional to the rate at which photons of that wavelength are absorbed.
Thermal detectors respond to only the intensity of absorbed radiant power disregarding the spectral content
[11]. Thus, they respond equally well to the radiant energy of all wavelengths.
3.2.2 Dynamic Thermography
The general concept of using infrared thermography for mine detection is based on the fact that mines may
have different thermal properties from the surrounding material.
If a whole scene is subjected to an energy flux that varies with time, buried objects will follow a temperature
curve that does not coincide with that of the soil. When this contrast is due to the alteration of the heat flow by the
presence of the buried mine, it is called the volume effect [12]. When the contrast results from the disturbed soil
layer created by the burying operation, it is called the surface effect [12]. The surface effect is detectable for
only some time after burial, but during that time the thermal contrast is quite distinctive.
Once a sequence of images has been acquired, some processing techniques can be applied to enhance the
contrast between the possible targets and the background. This is called dynamic thermography [13].
Figure 3-6 Thermal effect; (a) volume effect, (b) surface effect
The following IR sequence image is an example of dynamic thermography. These data were collected with the
E-OIR Amber Galileo LWR sensor, which is available at the 3 ~ 5µm band, at the Fort Belvoir test minefield in
Virginia. The sensor was located on a tripod in a remote place. Data were captured at intervals of 15 minutes
starting at 3:00 pm [29]. Figure 3-7 shows the first image.
Figure 3-7 An IR image taken at 3:00 PM1 (256×256) [29]
Name      Condition   Position (256,256)
M15       Buried      67,50
M19       Buried      139,56
PGMDM     Surface     191,42
RAAM      Surface     222,42
FFV028    Buried      75,109
TN62      Buried      131,109
VS16      Buried      204,112
Table 3-1 Objects of Figure 3-7 [29]
Seven types of mines, as shown in Table 3-1, were placed in the 8’×8’ test area.
Since it is very difficult to get sufficient information from this single image, the change over time within the
black box in Figure 3-7 is shown in Figure 3-8, with images selected at intervals of 8 hours from 3:00 PM.
1 The contrast is enhanced.
Figure 3-8. The time-varied images of the area of interest (222×140) of Figure 3-7 selected at intervals of 8 hours
from 3:00 PM
The contrast in each image is limited, and noise signals are dominant through all image sequences in Figure
3-8. Some post-processing for contrast enhancement and noise removal is required. These methods will be
introduced in Chapter 4.
3.3 The Ultra Sound Sensor (US)
The frequency range of sound that the average person can hear is between 20 and 20,000 Hz. Ultra sound
waves are sound waves in the frequency band above this audible range. The principle of US is very similar
to that of GPR except for the signal. Both sensors emit an active signal and collect reflections from the surroundings.
However, sound propagates as a mechanical disturbance of molecules in the form of waves [6], while GPR
signals make no physical disturbance in the medium. When a sound wave propagates through a medium, the
molecules of the medium oscillate around their equilibrium positions; there is no
propagation of material, only a transmission of the disturbance and of the sound energy.
The speed of sound is dependent on the physical properties of its medium, namely density and elasticity. The basic
definition of the speed of sound is
\[ c = f \cdot \lambda \quad [\mathrm{m/s}], \tag{3.5} \]
where c is the speed, λ is the wavelength, and f is the frequency. Sometimes c is called a material constant
because it is constant for a certain material [6].
In a uniform homogeneous medium, the US wave propagates along a straight line and is reflected and refracted
when the wave encounters a boundary between two different media. Then, two factors affect the behavior of the
US wave at the boundary, the speed of the wave and the density of the media. In mine detection, the frequency
of US decides the penetration depth as it does for GPR. The lower frequency wave tends to penetrate better than
the high frequency one [6].
The US wave propagates well in humid or underwater conditions, but greatly attenuates in the air, while the
EM wave of GPR behaves oppositely in the same conditions [5]. Therefore, US is very good and almost the
only sensor used for underwater mine detection.
Table 3-2 profiles examples of the speed of sound according to the media. Usually, harder material tends to
transmit waves faster.
Material       Speed of Sound [m/s]
Steel          5000
Lead           1300
Water          1460
Soft tissues   1500
Bones          2500 to 4900
Table 3-2 Some examples of the speed of sound in different materials [6]
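As a small worked illustration of equation (3.5), the sketch below computes the wavelength of a US wave in some of the media of Table 3-2. The speeds are copied from the table; the 50 kHz frequency and the function name `wavelength_m` are assumptions chosen only for the example and do not come from this report.

```python
def wavelength_m(speed_m_per_s, frequency_hz):
    """Wavelength in metres from equation (3.5), c = f * lambda."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return speed_m_per_s / frequency_hz


# Speeds of sound copied from Table 3-2 [m/s].
SPEED_OF_SOUND = {"Steel": 5000.0, "Lead": 1300.0, "Water": 1460.0}

# Assumed example frequency; it is not a value taken from this report.
FREQUENCY_HZ = 50_000.0

if __name__ == "__main__":
    for material, speed in SPEED_OF_SOUND.items():
        millimetres = wavelength_m(speed, FREQUENCY_HZ) * 1000.0
        print(f"{material}: {millimetres:.1f} mm")
```

For a fixed frequency, the wavelength scales with the speed of sound, so the same emitter produces a longer wavelength in a faster medium.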
The visualization method for US is almost the same as for GPR. The data is visualized from A, B, and C-scans.
3.4 Summary and Remarks
MD has been the most frequently used sensor in mine detection. With the current technology, MD can detect
an extremely small metal pin in a plastic APM. It can track 0.1g of metal at a depth of 10 cm. Although MD
makes too many false alarms caused by small debris, its high detection ability has helped it to become standard
equipment. In practical mine detection cases, deminers scan the suspected area with MD once, and then use
other sensors to verify the alarms [12].
The most studied method for mine detection in the last decade is GPR. GPR has been used most frequently
after MD in actual minefields. Some recent research has reported that GPR can recognize targets buried at about
30 cm in clay [14]. However, the high frequency, up to several GHz, limits its penetration depth and increases
clutter. The high price makes actual demining groups hesitate to use it.
IR may be the easiest method to visualize data, since the collected data has a two-dimensional image format.
Despite the ease of implementation, very sensitive cameras must be employed for sufficient spatial resolution,
which makes the system costly to build. Maximum burial depth has been estimated at about 10-15 cm
[15, 16]; however, results obtained by passive infrared images heavily depend on the environmental conditions,
and there are crossover periods during morning and evening when the thermal contrast is negligible and mines
are undetectable [10].
US may be used in situations where other sensors are less suitable. Since US waves can propagate well in humid or
underwater conditions [6], it would be the best method in flooded terrain like rice fields or swamps, which are
common in some mine-afflicted countries.
Table 3-3 presents the approximated performance of four major sensors used in mine detection applications.
This is based on a literature survey, but the absolute performance is hard to define.
              GPR       MD        IR        US
Sand          good      good      good      bad
Clay          not bad   not bad   bad       bad
Water         bad       bad       bad       good
Deep Earth    good      good      bad       bad
APM           good      not bad   not bad   good
ATM           good      good      good      good
Price         bad       good      not bad   good
Table 3-3 Sensor performance in various situations, rated as good, not bad, or bad
Some nuclear technology has been developed to detect the explosive in mines. This process works by exposing
a target to thermal neutrons and analyzing the gamma rays resulting from the neutrons absorbed by the target
[17]. Since this process involves radioactive material, its use outdoors is difficult. However, if mines
are developed to evade all kinds of detectors, in the near future they may not be found with any of the
other sensors. The eventual solution will be chemical or nuclear sensors that look for the explosive in mines,
especially if these sensors can be made safe, cheap, and reliable.
Chapter 4. Signal and Image Processing
This chapter may be the most important part of this report. It explains signal and image processing techniques that can
be used in mine detection applications. This chapter covers mostly theoretical background, and
Chapter 6 will present the experiments.
4.1 Filtering
Regardless of the type of sensor, noise is always present. This report introduces two filtering
methods: the Wiener filter and the alternating sequential filter. The Wiener filter is known in the image
processing area as the least mean square method [18]. The alternating sequential filter is an application of gray-scale
morphology [11]. The Wiener filter is introduced in the next section, and the alternating sequential
filter follows in Section 4.3.3 as a part of gray-scale morphology.
4.1.1 The Wiener Filter
The Wiener filter is an adaptive digital filter based on mean square estimation. The idea is to restore the
original signal by minimizing the mean square error between the estimated signal and the original one. The
constraint of this method is that statistical information about the noise is required, and the noise is
assumed to follow a Gaussian distribution [18]. The algorithm follows [7].
The posed problem can be assumed as
\[ g(t) = f(t) + n(t), \tag{4.1} \]
where f(t) is the original signal, g(t) is the acquired signal, and n(t) is noise.
The estimation of f(t) as $\hat{f}(t)$ can be written as
\[ \hat{f}(t) = \sum_k h(k)\, g(t-k), \tag{4.2} \]
where h(k) is independent from t. The problem is how to find h(k) at this moment. This can be calculated by
minimizing the approximation error,
\[ J = E\left[e^2(t)\right] = E\left[\left(f(t) - \hat{f}(t)\right)^2\right] = E\left[\left(f(t) - \sum_k h(k)\, g(t-k)\right)^2\right]. \tag{4.3} \]
The minimum J can be achieved by
\[ \frac{\partial J}{\partial h(i)} = E\left[2\left(f(t) - \sum_k h(k)\, g(t-k)\right)\frac{de(t)}{dh(i)}\right] = 0, \tag{4.4} \]
where
\[ \frac{de(t)}{dh(i)} = -g(t-i). \tag{4.5} \]
Equation (4.4) can be expressed by a correlation as
\[ R_{fg}(i) = \sum_k h(k)\, R_{gg}(i-k). \tag{4.6} \]
Since the spectrum of the signal corresponds to the Fourier transformed correlation, equation (4.6) can be
rewritten as
\[ S_{fg}(f) = H(f)\, S_{gg}(f). \tag{4.7} \]
If we can estimate the noise spectrum $S_{nn}(f)$ and assume that the noise and the original signal are uncorrelated,
we can estimate $S_{fg}(f)$ by
\[ S_{fg}(f) = S_{ff}(f) = S_{gg}(f) - S_{nn}(f). \tag{4.8} \]
Then, h(k) can be solved by
\[ H(f) = \frac{S_{fg}(f)}{S_{gg}(f)}. \tag{4.9} \]
The Wiener filter has been noted to perform well for the GPR signal, though the problem of estimating the
noise spectrum Snn(f) remains unsolved. This problem may be approached by a statistical diagnosis of the overall
signal.
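The steps of equations (4.7) to (4.9) can be sketched for a one-dimensional signal as follows. This is only a minimal illustration under stated assumptions, not the implementation used in this report: the power spectrum of the acquired signal is estimated by its periodogram, the noise is assumed to be white with a known power `noise_power`, and negative spectrum estimates from equation (4.8) are clipped to zero. The function names and the plain O(n^2) Fourier transform are choices made for the example.

```python
import cmath
import math
import random


def dft(x):
    """Direct O(n^2) discrete Fourier transform; adequate for short signals."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def wiener_filter(g, noise_power):
    """Restore f from g = f + n, assuming white noise of known power."""
    n = len(g)
    restored = []
    for G_k in dft(g):
        s_gg = abs(G_k) ** 2 / n                  # periodogram estimate of S_gg
        s_fg = max(s_gg - noise_power, 0.0)       # equation (4.8), clipped at zero
        h = s_fg / s_gg if s_gg > 0.0 else 0.0    # equation (4.9)
        restored.append(h * G_k)                  # apply H(f) in the frequency domain
    return [value.real for value in idft(restored)]


if __name__ == "__main__":
    random.seed(0)
    n = 64
    clean = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
    noisy = [c + random.gauss(0.0, 0.5) for c in clean]
    restored = wiener_filter(noisy, noise_power=0.25)
    mse = lambda a, b: sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)
    print("MSE noisy:   ", mse(noisy, clean))
    print("MSE restored:", mse(restored, clean))
```

In practice the noise power would itself have to be estimated, which is the open problem noted above.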
4.2 Feature Extraction
The dynamic thermography and the GPR C-scan normally have a large set of image sequences. This section
provides information on how to extract features from multiple data.
4.2.1 The Karhunen and Loeve Transformation (KLT)
KLT is also called the Hotelling transformation or principal component analysis (PCA) [18]. When we
have a large set of image sequences, the number of images needs to be reduced for global processing,
unless images are simply selected at random. KLT is a method to reduce the number of images while minimizing the
representation error.
Assume a gray-scale image sequence $y_{nm}$, where n = 1, 2, …, N, m = 1, 2, …, M, N is the number of images in the sequence,
and M is the number of pixels in an image. A vector corresponding to one pixel position along an image
sequence is called a dixel or a dynamic pixel. A dixel represents the dynamic thermal evolution of a point in an
N-dimensional space [13].
The idea of KLT is that dixels originating from the same object tend to form a cluster. The main axes of the
transformed space are the directions that maximize the distinction between clusters [19].
Figure 4-1 A diagram of an image sequence with m pixels and n images
The position of a dixel is given as a vector $y_m = \{y_{1m}, y_{2m}, \ldots, y_{Nm}\}$ in an N-dimensional Euclidean space.
The index m indicates the mth pixel in each image. Figure 4-1 shows an example of an image sequence. This sequence
has M pixels in each image and N images in the sequence. The arrow going through the circled pixels indicates one
dixel. The algorithm follows [13].
Since the gray value of pixels can be distributed along different ranges in each image, it is necessary to
normalize each image by subtracting its average value.
\[ x_{nm} = y_{nm} - \mu_n, \tag{4.10} \]
where
\[ \mu_n = E[y_{nm}] = \frac{1}{M} \sum_{m=1}^{M} y_{nm}. \tag{4.11} \]
This process makes $x_{nm}$ have zero mean as
\[ \hat{\mu}_n = E[x_{nm}] = \frac{1}{M} \sum_{m=1}^{M} x_{nm} = 0. \tag{4.12} \]
A unit vector u can convert the image vector $x_m$ into a parameter $r_m$ as
\[ r_m = u^T x_m, \tag{4.13} \]
where u has a binding condition,
\[ \|u\|^2 = u^T u = 1 \quad \text{or} \quad g(u) = \|u\|^2 - 1 = 0. \tag{4.14} \]
The mean value of $r_m$ should be equal to zero by the normalization of the dixel cloud.
\[ E[r_m] = 0 \tag{4.15} \]
The goal of this method is to find the optimal u that maximizes the variance of $r_m$.
\[ f(u) = \mathrm{var}(r_m) = E\left[r_m^2\right] \tag{4.16} \]
The main axis is r1, and the unit vector is u1 in Figure 4-2.
Figure 4-2 A dixel space [11]
The solution can be found by the Lagrange multipliers as
δf − λδg = 0 .
(4.17)
The variance of $r_m$ can be rewritten as a function of u as
\[ f(u) = E\left[u^T x_m x_m^T u\right] = u^T E\left[x_m x_m^T\right] u = u^T C u, \tag{4.18} \]
where C denotes the covariance matrix as
\[ C = E\left[x_m x_m^T\right]. \tag{4.19} \]
The Lagrange multiplier equation (4.17) can be interpreted as two parts, δf and δg.
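δf is interpreted as equation (4.20) from equation (4.18),
\[ \delta f(u) = \sum_{i=1}^{N} \left(\delta u_i \frac{\partial}{\partial u_i} f\right), \tag{4.20} \]
\[ \frac{\partial}{\partial u_i} f = \left(\frac{\partial}{\partial u_i} u^T\right) C u + u^T C \left(\frac{\partial}{\partial u_i} u\right) = e_i^T C u + u^T C e_i, \tag{4.21} \]
where $e_i$ is the ith unit vector.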
Since the covariance matrix C is a symmetric matrix, equation (4.21) can be reduced as
\[ \frac{\partial}{\partial u_i} f = 2 e_i^T C u. \tag{4.22} \]
δg is interpreted as equation (4.23) from equation (4.14),
\[ \frac{\partial}{\partial u_i} g = \left(\frac{\partial}{\partial u_i} u^T\right) u + u^T \left(\frac{\partial}{\partial u_i} u\right) = e_i^T u + u^T e_i = 2 e_i^T u. \tag{4.23} \]
Finally, equation (4.17) is rewritten as,
\[ \delta f - \lambda \delta g = 2 \sum_{i=1}^{N} \left[\delta u_i\, e_i^T (Cu - \lambda u)\right] = 2\,[\delta u_1 \ \delta u_2 \ \cdots \ \delta u_N]\,[Cu - \lambda u] = 0. \tag{4.24} \]
Equation (4.24) can be solved for arbitrary variations only if $Cu = \lambda u$. This is an eigenvector problem, where λ is the eigenvalue and
u is the corresponding eigenvector. $u_i$ represents the N possible solutions for u, which result in the same number of
eigenvalues $\lambda_i$. From $C u_i = \lambda_i u_i$, equation (4.18) is concluded as
\[ f(u_i) = u_i^T C u_i = u_i^T \lambda_i u_i = \lambda_i. \tag{4.25} \]
Equation (4.25) indicates that the eigenvector $u_i$, corresponding to the largest eigenvalue $\lambda_i$, represents the
direction for which the quadratic moment is at its maximum [13]. The parameter for the new orthogonal set of axes
$r_i$, where i = 1, 2, …, N, can be produced from
\[ r_i = u_i^T x_m. \tag{4.26} \]
Going back to Figure 4-2, KLT extracts two dixel axes r1 and r2 from a two-dimensional dixel space, ( x1 , x 2 ) .
Normally the dimension of the data of an actual mine detection case is greater. The GPR C-scan of Figure 3-3
has 500 images, and the dynamic IR of Figure 3-7 has 94 images.
By projecting dixels on the r1 axis, the feature extraction of an image sequence can be performed. Usually the
first orthonormal (orthogonal and normal) axis is considered as the feature direction, but sometimes multiple
directions can be considered for the optimal choice.
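The procedure of equations (4.10) to (4.26) can be sketched for the first axis as follows. This is a minimal illustration under stated assumptions: each image is given as a flat list of M pixel values, and the eigenvector of the largest eigenvalue is found by power iteration, which is a substitute chosen here to keep the example free of external libraries rather than the method of the cited references. The function name `klt_first_axis` is hypothetical.

```python
import math


def klt_first_axis(sequence, iterations=200):
    """Return (u1, lambda1, r1) for a sequence of N images of M pixels each.

    sequence[n][m] is the gray value y_nm of pixel m in image n.
    """
    n_img, n_pix = len(sequence), len(sequence[0])

    # Equation (4.10): subtract the average value of each image.
    x = []
    for image in sequence:
        mu = sum(image) / n_pix
        x.append([value - mu for value in image])

    # Equation (4.19): N x N covariance matrix C = E[x_m x_m^T].
    c = [[sum(x[a][m] * x[b][m] for m in range(n_pix)) / n_pix
          for b in range(n_img)]
         for a in range(n_img)]

    # Power iteration (an assumed solver) for the dominant eigenvector of C.
    u = [1.0 / math.sqrt(n_img)] * n_img
    for _ in range(iterations):
        v = [sum(c[a][b] * u[b] for b in range(n_img)) for a in range(n_img)]
        norm = math.sqrt(sum(t * t for t in v))
        if norm == 0.0:
            break
        u = [t / norm for t in v]  # keeps the constraint of equation (4.14)

    # Equation (4.25): the variance along u equals the eigenvalue.
    lam = sum(u[a] * c[a][b] * u[b] for a in range(n_img) for b in range(n_img))

    # Equation (4.26): project every dixel onto the first axis.
    r = [sum(u[n] * x[n][m] for n in range(n_img)) for m in range(n_pix)]
    return u, lam, r
```

The returned list `r` has one value per pixel and can be reshaped to the image size to give the first transformed image. Power iteration can converge slowly when the two largest eigenvalues are close, so a full eigendecomposition would be the safer choice for real data.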
Figure 4-3 An IR image sequence of a minefield [30]; images taken at (a) noon, (b) afternoon, and (c) evening
Figure 4-3 shows an example of an image sequence. These images were captured at different times with an
infrared camera, AGEMA, available within a 3~5µm band, while keeping the same position and angle. (a) was
captured at about noon, (b) at 5 PM, and (c) at 10 PM. The entire set of images was captured 49 times during a
24 hour period.
Since the data transformed by KLT represent not the pixel values of a gray image but the relative
differences between pixels, contrast enhancement is required in post-processing. Contrast enhancement will
be introduced in Section 4.4.
Figure 4-4 Transformed image from Figure 4-3 by KLT; (a) 1st transformed image, (b) 2nd transformed image, (c)
8th transformed image 1
Figure 4-4 is the result of KLT from Figure 4-3. (a) is the first transformed image, (b) is the second, and (c) is
the eighth. Since the first transformed image is expected to have the most discriminative features, (a) shows the
most distinctive contrast. Unless some prior information is given, the first transformed image is used for the
feature data.
4.2.2 The Kittler and Young Transformation (KYT)
Because KLT treats all classes as a single scattergram, KLT chooses the main axes by considering the minimal
representation error rather than the maximum discrimination ability. If the noise component is prominent in the
entire sequence, noise can be considered as an important factor to select the main axes. KYT compensates for
the weak discrimination ability of KLT by normalizing the variance within the classes [13].
In this case, the total covariance matrix C can be split into
\[ C = \sigma + \mu, \tag{4.27} \]
where σ is the covariance matrix within the classes and µ is the covariance matrix between the class averages.
The solution can be achieved by solving an eigenvalue and eigenvector problem,
\[ \mu u = \lambda \sigma u, \tag{4.28} \]
where the eigenvector ui, corresponding to the largest eigenvalue λi, provides the direction for which the
distinction between the classes is at its maximum.
1 A linear stretching method is used to enhance the contrast.
Figure 4-5. Process of KYT [13]
Figure 4-5 is the process of KYT [13]. When two dixel classes are given as shown in (a), KYT rotates the
original dixel classes as shown in (b). Then, it normalizes the variance within the classes as shown in (c), and
applies KLT to find the direction of main axes as shown in (d). Finally, it transforms the classes into the original
scattergram as shown in (e). The arrows KY1 and KY2 indicate the first and second transformed images by KYT
and KLT.
Although some concepts of KYT are similar to KLT in the sense of an eigenvalue problem, some additional
information should be determined to perform this transformation, for example, the class average, variance, and
relative weight. These classes are determined by delimiting the dixel clouds manually.
4.3 Gray Scale Morphology Application
The term morphology has been mostly used for the binary image case in image processing. The basic functions,
dilation and erosion, are performed by structuring elements with various shapes and sizes. Combining the
basic functions gives the second level functions, opening and closing. By mixing these operations of the
first and second level, region-based processing (for example, boundary extraction, region filling, and thinning)
can be done by expanding or shrinking the regions of the image.
The gray-scale application is somewhat different from the binary image case. Smoothing, extracting gradient,
edge detection, noise removal, contrast enhancement, and finally a region-based segmentation, called the
watershed algorithm, can be done using dilation and erosion.
The structuring element (SE) should be explained prior to the actual morphology operators. SE is a simple
matrix or an image with a relatively smaller size than the object image. SE has two functions. It defines the
neighborhood around the origin, and it adds an offset value through the corresponding region of SE [20].
Figure 4-6. Examples of SE; (a) 5×5 octagonal neighborhood, (b) 5×5 diamond neighborhood, (c) 5×5 pyramid shape
offset
Figure 4-6 (a) is an example of an octagonal neighborhood. This SE is frequently used in mine detection
applications, because the octagonal shape is somewhat similar to the round shape of mines. The origin is circled.
The origin has a 5×5 neighborhood1 except for the four corners. When an object image meets these four corners,
the correspondent points will not be considered as neighborhood pixels. (b) is a diamond shaped neighborhood.
It has relatively fewer neighborhood pixels than the octagonal case. (c) is an offset distribution of a pyramid
shape. The origin has the highest offset, and the boundary pixels have relatively lower offsets. If every SE pixel
has the same offset value, it is called a flattop filter.
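The three kinds of SE described above can be sketched as small arrays. The octagonal mask follows the description in the text (a 5×5 neighborhood without its four corners); the exact pixels of the diamond and the offset values of the pyramid are assumptions made for illustration, since only their general shape is stated here. The function names are hypothetical.

```python
def octagon_5x5():
    """5x5 neighborhood mask with the four corner pixels excluded."""
    corners = {(0, 0), (0, 4), (4, 0), (4, 4)}
    return [[(r, c) not in corners for c in range(5)] for r in range(5)]


def diamond(radius):
    """Diamond-shaped neighborhood mask of size (2*radius + 1)."""
    size = 2 * radius + 1
    return [[abs(r - radius) + abs(c - radius) <= radius for c in range(size)]
            for r in range(size)]


def pyramid_offsets(radius):
    """Offset values that are highest at the origin and lowest at the boundary."""
    size = 2 * radius + 1
    return [[radius - max(abs(r - radius), abs(c - radius)) for c in range(size)]
            for r in range(size)]


def show(mask):
    """Render a boolean mask as text, '#' for neighborhood pixels."""
    return "\n".join("".join("#" if v else "." for v in row) for row in mask)


if __name__ == "__main__":
    print(show(octagon_5x5()))
    print()
    print(show(diamond(2)))
    print()
    for row in pyramid_offsets(2):
        print(row)
```

Counting the neighborhood pixels confirms the remark in the text that the diamond has fewer of them than the octagon at the same 5×5 size.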
The basic operators, dilation, erosion, opening, and closing, will be introduced in the following section.
1 Sometimes n×n is represented as the size of radius r, from n = 2×r + 1. For example, 5×5 corresponds to r = 2.
4.3.1 Operators
Dilation and erosion are similar to the discrete two-dimensional convolution [18],
\[ f(x,y) * b(x,y) = \frac{1}{MN} \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} f(m,n)\, b(x-m, y-n), \tag{4.29} \]
where x = 0, 1, 2, …, M-1 and y = 0, 1, 2, …, N-1.
Dilation of an image function f by SE b can be derived as
δ ( f ) = ( f ⊕ b )(s, t ) = max{f (s − x, t − y ) + b(x, y ) | (s − x ), (t − y ) ∈ D f ; (x, y ) ∈ Db },
(4.30)
where Df and Db are the domains of f and b [18].
The displacement parameter condition (s-x),(t-y)∈Df means that SE has to be completely contained by the set
being dilated. This corresponds to the 2D convolution in equation (4.29) with the max operation replacing the
sums of convolution and the addition replacing the products of convolution. f(-x,-y) is the flipped f(x,y) with
respect to the origin. As in convolution, the function f(s-x,t-y) means that the flipped f(x,y) is shifted by positive
s and t.
Since dilation is based on choosing the maximum value of f + b in a neighborhood defined by the shape of SE,
the general effect of dilation on a gray-scale image has two parts. Firstly, if all offset values of SE are positive,
the output image tends to be brighter than the input. Secondly, the dark details of the input image are either
reduced or eliminated if the size of the dark area is smaller than SE.
Erosion of an image function f by SE b can be derived as
ε ( f ) = ( f ⊗ b )(s, t ) = min{f (s + x, t + y ) − b(x, y ) | (s + x ), (t + y ) ∈ D f ; (x, y ) ∈ Db }1,
(4.31)
where Df and Db are the domains of f and b [18].
In equations (4.30) and (4.31), the functions of dilation and erosion are dual, while the condition is the same. The
function f(s+x,t+y) means that f(x,y) is shifted by negative s and t.
Since erosion is based on choosing the minimum value of f − b in a neighborhood defined by the shape of SE,
the general effect of performing erosion on a gray-scale image is the opposite of dilation. Firstly, if all the
offset values of SE are positive, the output image tends to be darker than the input. Secondly, the bright details
of the input image are either reduced or eliminated if the size of the bright area is smaller than SE.
The opening of an image function f by SE b can be described as
\[ \gamma(f) = f \circ b = (f \otimes b) \oplus b, \tag{4.32} \]
which means the erosion of f by b, followed by a dilation by b [18].
The purpose of opening is to remove small bright details, while leaving the overall gray levels and larger bright
features relatively undisturbed. The initial erosion removes small bright details, but also darkens the image. The
subsequent dilation increases the brightness of the image without reintroducing the bright details removed by the
previous erosion.
1 The sign ‘−’ inside of ‘Ο’ may be more common for erosion, but it is presented as ‘⊗’ in this report.
The closing of an image function f by SE b can be described as
ϕ ( f ) = f • b = ( f ⊕ b) ⊗ b ,
(4.33)
which means the dilation of f by b, followed by an erosion by b [18].
Closing is the dual function of opening, so the opposite result from opening is expected, just as for dilation and erosion.
Closing is generally used to remove small dark details, while leaving the overall gray levels and larger dark
features relatively undisturbed. The initial dilation removes small dark details but also brightens the image. The
subsequent erosion decreases the brightness of the image without reintroducing the dark details removed by the
previous dilation.
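The four operators of equations (4.30) to (4.33) can be sketched directly from their definitions. This is a minimal illustration, not an optimized implementation: the image is a list of rows, the SE is represented as a dictionary that maps each displacement (x, y) from the origin to its offset value b(x, y), and samples that fall outside the image are simply skipped, which is an assumed boundary rule. The SE is assumed to contain its origin.

```python
def flat_square(radius):
    """Flat SE: offset 0 on a (2*radius + 1) square neighborhood."""
    span = range(-radius, radius + 1)
    return {(x, y): 0 for x in span for y in span}


def dilate(img, se):
    """Equation (4.30): max of f(s - x, t - y) + b(x, y) over in-domain samples."""
    rows, cols = len(img), len(img[0])
    return [[max(img[s - x][t - y] + b
                 for (x, y), b in se.items()
                 if 0 <= s - x < rows and 0 <= t - y < cols)
             for t in range(cols)]
            for s in range(rows)]


def erode(img, se):
    """Equation (4.31): min of f(s + x, t + y) - b(x, y) over in-domain samples."""
    rows, cols = len(img), len(img[0])
    return [[min(img[s + x][t + y] - b
                 for (x, y), b in se.items()
                 if 0 <= s + x < rows and 0 <= t + y < cols)
             for t in range(cols)]
            for s in range(rows)]


def opening(img, se):
    """Equation (4.32): erosion followed by dilation with the same SE."""
    return dilate(erode(img, se), se)


def closing(img, se):
    """Equation (4.33): dilation followed by erosion with the same SE."""
    return erode(dilate(img, se), se)


if __name__ == "__main__":
    image = [[10] * 5 for _ in range(5)]
    image[2][2] = 50                      # one small bright detail
    for row in opening(image, flat_square(1)):
        print(row)                        # the bright detail is removed
```

With a flat 3×3 SE, a single bright pixel is smaller than the SE and is therefore removed by the opening, while the closing leaves it in place.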
Figure 4-7 Examples of morphology functions; (a) original image, (b) dilated image, (c) eroded image, (d) opened
image, (e) closed image
Figure 4-7 shows examples of morphology functions. Table 4-1 profiles the average, minimum, and maximum
value of each image. (a) is the original image. (b) is dilated by a 5×5 flat octagonal shaped SE from (a). (c) is
eroded, (d) is opened, and (e) is closed by the same SE as (b).
Bright details are enhanced, and dark areas are shrunk in (b) and (e) by removing the dark pixels. This effect
considerably increases the average value of (b), but not that of (e). Dark details are enhanced, and bright areas are
shrunk in (c) and (d) by removing the bright pixels. This effect considerably drops the average value of (c), but
not that of (d).
Image      (a)     (b)     (c)     (d)     (e)
Average    98.7    120.7   78.3    92.2    105.3
Minimum    3       9       3       3       9
Maximum    238     238     213     213     238
Table 4-1 Average, minimum, and maximum value of Figure 4-7 1
1 With an 8-bit color map, the maximum gray value of a pixel is 255 and the minimum is 0.
4.3.2 Morphological Gradient
The main goal of the morphological gradient transformation is to highlight gray level contours.
When an image function f is continuously differentiable, the morphological gradient is equal to the modulus of the gradient of
f,
\[ g(f) = \sqrt{\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}. \tag{4.34} \]
The simplest way to approximate this modulus is to calculate the difference between the highest and the lowest
pixels within a window, centered at each point x [21]. In other words, it is the difference between the dilated
function δ ( f ) and the eroded function ε ( f ) as
g ( f ) = δ ( f ) − ε ( f ) .1
(4.35)
Figure 4-8 Morphological gradient; (a) original image, (b) gradient image
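Equation (4.35) can be sketched as the difference between the highest and lowest pixel within a window centered at each point. The sketch assumes a flat square window of the given radius and truncates the window at the image border; both choices are assumptions made for the example.

```python
def morphological_gradient(img, radius=1):
    """Equation (4.35) with a flat square SE: local maximum minus local minimum."""
    rows, cols = len(img), len(img[0])
    out = []
    for s in range(rows):
        row = []
        for t in range(cols):
            # Window of the flat SE, truncated at the image border.
            window = [img[i][j]
                      for i in range(max(0, s - radius), min(rows, s + radius + 1))
                      for j in range(max(0, t - radius), min(cols, t + radius + 1))]
            row.append(max(window) - min(window))
        out.append(row)
    return out


if __name__ == "__main__":
    # A vertical step edge between gray levels 0 and 100.
    step = [[0, 0, 0, 100, 100, 100] for _ in range(4)]
    for row in morphological_gradient(step):
        print(row)
```

The output is nonzero only in the two columns adjacent to the step, which is the contour highlighting effect described above.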
4.3.3 Smoothing and Noise Reduction Using the Alternating Sequential Filter
Opening and closing are introduced in Section 4.3.1. The combination of these operators can remove noise and
can smooth the texture in an image. This is called the alternating sequential filter (ASF).
Usually, ASF performs well as a repeated operation rather than a single operation.
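Two kinds of ASF are defined here, white ASF and black ASF. The white ASF is defined as
\[ \Phi_n(f) = \varphi_1 \gamma_1 \varphi_2 \gamma_2 \varphi_3 \gamma_3 \cdots \varphi_n \gamma_n, \tag{4.36} \]
where ϕ denotes the opening of the previous result, γ denotes the closing of the previous result, and the subscript
is the corresponding size of SE [11]. Note that ϕ and γ are used here in the opposite sense to equations (4.32) and (4.33).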
Equation (4.36) can be rewritten as
\[ \Phi_n(f) = \gamma_n(\varphi_n(\cdots\gamma_2(\varphi_2(\gamma_1(\varphi_1(f))))\cdots)). \tag{4.37} \]
1 Some references define the morphological gradient as g(f) = [δ(f) − ε(f)]/2.
The white ASF opens the object image with the smallest SE, and closes the previous result with the same SE. It
then opens again the previous result with the larger size of SE, and closes again with the same SE, etc.
The black ASF is the dual operation of the white ASF. Every step is the same as the white ASF except the
black ASF begins with a closing operation instead of an opening [11].
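The black ASF is defined as
\[ \Psi_n(f) = \gamma_1 \varphi_1 \gamma_2 \varphi_2 \gamma_3 \varphi_3 \cdots \gamma_n \varphi_n, \tag{4.38} \]
which is the same as
\[ \Psi_n(f) = \varphi_n(\gamma_n(\cdots\varphi_2(\gamma_2(\varphi_1(\gamma_1(f))))\cdots)). \tag{4.39} \]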
The goal of ASF is to remove noise or to smooth an image without disrupting the major components of the
image. The result relies highly on the maximum size of SE. If a precise result is desired, only small SE will be
applied. Otherwise, filtering steps will be repeated until the desired result is obtained.
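The white and black ASF described above can be sketched as a loop over growing SE sizes. This is a minimal illustration under stated assumptions: the SE is a flat square of radius 1, 2, …, `max_radius` (sizes 3×3, 5×5, and so on) instead of the octagonal SE used in the experiments, and the window is truncated at the image border. All function names are hypothetical.

```python
def _flat(img, radius, pick):
    """Apply pick (min for erosion, max for dilation) over a flat square window."""
    rows, cols = len(img), len(img[0])
    return [[pick(img[i][j]
                  for i in range(max(0, s - radius), min(rows, s + radius + 1))
                  for j in range(max(0, t - radius), min(cols, t + radius + 1)))
             for t in range(cols)]
            for s in range(rows)]


def opening(img, radius):
    """Erosion followed by dilation with the same flat SE."""
    return _flat(_flat(img, radius, min), radius, max)


def closing(img, radius):
    """Dilation followed by erosion with the same flat SE."""
    return _flat(_flat(img, radius, max), radius, min)


def white_asf(img, max_radius):
    """Open, then close, with SE sizes growing from the smallest to the largest."""
    for radius in range(1, max_radius + 1):
        img = closing(opening(img, radius), radius)
    return img


def black_asf(img, max_radius):
    """Close, then open, with SE sizes growing from the smallest to the largest."""
    for radius in range(1, max_radius + 1):
        img = opening(closing(img, radius), radius)
    return img


if __name__ == "__main__":
    image = [[100] * 7 for _ in range(7)]
    image[2][2] = 255   # small bright speck
    image[4][4] = 0     # small dark speck
    for row in white_asf(image, 1):
        print(row)      # both specks are smoothed away
```

Objects larger than the largest SE survive the filter, which is how the maximum SE size selects the scale of the details that are removed.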
Figure 4-9 An example of white ASF; (a) original image [30], (b) 7×7 size filtering, (c) 15×15 size filtering, (d) 23×23
size filtering
Figure 4-9 shows the result of the white ASF from Figure 4-4 (a), which is the result of KLT from an IR image
sequence of a test minefield [30]. (b) is obtained by equation (4.37) when f is (a) and n is 7. (c) is obtained when
n is 15, and (d) when n = 23. Only the odd numbered SE have been applied, for example, 3×3, 5×5, 7×7, and
9×9. The large white circle located in the lower left corner of each image is suspected to be a mine, but the other
black and white dots or small circles are negligible. ASF is applied to remove those dots or circles. At the
segmentation step, (a) will cause over-segmentation, but (d) can be a reasonable condition for segmentation. The
segmented result will be presented in Chapter 6.
Figure 4-10 Intensity value on the black line in Figure 4-9; (a) ~ (d) corresponding to (a) ~ (d) of Figure 4-9
Image      (a)     (b)     (c)     (d)
Average    144.3   143.8   143.0   141.5
Minimum    0       20      57      68
Maximum    255     252     245     239
Table 4-2 Average, minimum, and maximum value of Figure 4-9
Figure 4-10 is the intensity value on the black line in Figure 4-9. The graphs (a), (b), (c), and (d) correspond to
(a), (b), (c), and (d) in Figure 4-9. Table 4-2 profiles the average, minimum, and maximum values of Figure 4-9.
The original image has many small peaks and valleys in Figure 4-10 (a). In each step, ASF smooths an object
if it is smaller than SE. In (b), there are still some narrow valleys at x = 0~30, 90~110, 120~130 and peaks at x =
110~120, 130~140, 180~200, which are circled. These valleys and peaks disappear in (c), but the large white
circle remains even in (d).
In Table 4-2, the average gray value of the filtered image in Figure 4-9 has not significantly changed, even
though the maximum and minimum value have become closer to each other.
ASF has some advantages that other filters do not have. Firstly, the size of the object to be identified is
selectable by choosing the maximum size of SE. Secondly, ASF does not affect the overall property of the
image.
4.4 Contrast Enhancement
Since the contrast between the background and the mine target is usually not large enough, the raw sensor
image rarely has enough information. The purpose of contrast enhancement is to enhance the difference between
the mine target and the background to distinguish them. Two methods are introduced in this section;
morphological contrast enhancement and histogram equalization.
4.4.1 Morphological Contrast Enhancement
The first step in morphological contrast enhancement is to find peaks and valleys from the original image.
Peaks are light shades of gray tone image, while valleys are dark. Peaks are obtained by subtracting the opening
from the original image, and valleys by subtracting the original image from the closing as
p( f ) = f − γ ( f ) ,
(4.40)
v( f ) = ϕ ( f ) − f ,
(4.41)
where p(f) denotes the peaks, v(f) denotes the valleys, γ(f) denotes the opening, and ϕ(f) denotes the closing of
an image function f.
Preprocessing is necessary to improve the contrast [11]. This is done by multiplying the peaks and
valleys by constants as
\[ p'(f) = p(f) \times c_1, \tag{4.42} \]
where¹
\[ c_1 = \frac{\max(I) - \max(f)}{\max[p(f)]}, \tag{4.43} \]
and
\[ v'(f) = v(f) \times c_2, \tag{4.44} \]
where
\[ c_2 = \frac{\min(f) - \min(I)}{\max[v(f)]}. \tag{4.45} \]
The contrast-enhanced image is obtained by the summation of the original image, the peaks, and the negative
valleys [11].
f ′ = f + p ′( f ) − v ′( f )
(4.46)
An example of the morphological contrast enhancement is shown in Figure 4-11. The histogram equalization
may show a better result, which will be explained in the next section.
1 I indicates the entire gray level. The 8-bit gray level varies within [0 255]. max (I) is 255, and min (I) is 0.
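The steps of Section 4.4.1 can be sketched as follows. This is a minimal illustration under stated assumptions: the opening and closing use a flat square SE with a border-truncated window, the scaling constants are chosen so that the strongest peak is pushed toward the top of the gray range and the deepest valley toward the bottom (an assumed reading of equations (4.43) and (4.45)), and the result is clipped to the gray range. The function names are hypothetical.

```python
def _flat(img, radius, pick):
    """Apply pick (min or max) over a flat square window truncated at the border."""
    rows, cols = len(img), len(img[0])
    return [[pick(img[i][j]
                  for i in range(max(0, s - radius), min(rows, s + radius + 1))
                  for j in range(max(0, t - radius), min(cols, t + radius + 1)))
             for t in range(cols)]
            for s in range(rows)]


def opening(img, radius):
    return _flat(_flat(img, radius, min), radius, max)


def closing(img, radius):
    return _flat(_flat(img, radius, max), radius, min)


def enhance_contrast(img, radius=1, gray_min=0, gray_max=255):
    """Add scaled peaks and subtract scaled valleys, as in equation (4.46)."""
    rows, cols = len(img), len(img[0])
    opened, closed = opening(img, radius), closing(img, radius)
    # Equations (4.40) and (4.41): peaks and valleys.
    peaks = [[img[s][t] - opened[s][t] for t in range(cols)] for s in range(rows)]
    valleys = [[closed[s][t] - img[s][t] for t in range(cols)] for s in range(rows)]

    f_max = max(max(row) for row in img)
    f_min = min(min(row) for row in img)
    p_max = max(max(row) for row in peaks)
    v_max = max(max(row) for row in valleys)
    # Assumed scaling: headroom above max(f) and below min(f), per unit of peak/valley.
    c1 = (gray_max - f_max) / p_max if p_max > 0 else 0.0
    c2 = (f_min - gray_min) / v_max if v_max > 0 else 0.0

    return [[min(gray_max, max(gray_min,
                               img[s][t] + c1 * peaks[s][t] - c2 * valleys[s][t]))
             for t in range(cols)]
            for s in range(rows)]
```

Only details smaller than the SE contribute to the peaks and valleys, so larger structures keep their original gray level, which may explain why the gain in overall dynamic range can be modest.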
4.4.2 Histogram Equalization
The probability of the kth gray level in an image f can be described as
\[ p_f(f_k) = \frac{n_k}{n}, \tag{4.47} \]
where k ∈ [0 L-1], L is the number of gray levels in an image, nk is the number of times the kth level appears in
the image, and n is the total number of pixels in the image.
A plot of pf(fk) versus k is called a histogram, and the goal of histogram equalization is to obtain an image with
a uniform histogram.
The uniform histogram can be achieved by
\[ g_k = T(f_k) = \sum_{j=0}^{k} \frac{n_j}{n} = \sum_{j=0}^{k} p_f(f_j), \tag{4.48} \]
keeping two conditions [18],
(a) T(fk) is single valued and monotonically increasing in the range k ∈ [0 L-1].
(b) Also, T(fk) should be T(fk)∈ [0 L-1] for k ∈ [0 L-1].
The transformed image g has a uniform gray level probability.
\[ p_g(g_k) = \frac{n_k}{n} = c, \tag{4.49} \]
where ideally c is a constant through the entire gray level k ∈ [0 L-1].
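Equation (4.48) can be sketched as a lookup through the cumulative histogram. The cumulative sum lies in [0, 1], so the sketch multiplies it by L − 1 and rounds to map the result back onto the gray range [0, L − 1] required by condition (b); that scaling step and the function name `equalize` are assumptions made for the example.

```python
def equalize(img, levels=256):
    """Histogram equalization of an image with integer gray values in [0, levels - 1]."""
    flat = [value for row in img for value in row]
    n = len(flat)

    # Histogram n_k: number of times each gray level appears.
    hist = [0] * levels
    for value in flat:
        hist[value] += 1

    # Equation (4.48): cumulative sum of the gray level probabilities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running / n)

    # Assumed scaling of T(f_k) from [0, 1] onto the gray range [0, levels - 1].
    return [[round(cdf[value] * (levels - 1)) for value in row] for row in img]


if __name__ == "__main__":
    # Gray levels crowded into 150..200, as in the original image of Figure 4-11 (a).
    image = [[150, 150, 160], [170, 180, 200]]
    for row in equalize(image):
        print(row)
```

Because the cumulative sum never decreases, the mapping keeps the order of the gray levels, which satisfies condition (a) in the non-strict sense.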
Figure 4-11 presents examples of various contrast enhancement methods. The images on the left are contrast-enhanced
images, the middle graphs are the gray level on the black line in the left hand images, and the right
hand graphs are the overall histogram of the left hand images. The original image in (a) is the same image as
Figure 4-3 (a), which was captured with an IR camera from a test minefield. There is a possible mine target in
the lower left corner of the image.
The gray levels of the original image (a) are confined to the range of 150 to 200, so little useful information can be extracted from it directly. The linearly stretched image (b) is better than (a), but most pixels are still concentrated in the upper half of the gray-level range. Image (c) was enhanced by equation (4.46) using a 7×7 octagonal SE, but the gray-level range was not widened enough. The histogram-equalized image (d) shows the best result: its histogram is almost uniform except at the extreme high and low levels. ASF can then easily remove the small peaks and valleys, so that only the suspected white circle remains.
Figure 4-11 Examples of contrast enhancement; (a) original image [30], (b) linear stretched case, (c) morphological octagonal enhanced case, (d) histogram equalized case; (left) transformed images, (middle) gray level on the black line in the left hand images, (right) histogram of the left hand images
4.5 Segmentation Using Watershed
The watershed algorithm is a region-based segmentation technique. Usually, two properties of an image are
considered to segment it, edge and region. Watershed is used when the edge information is not good enough to
segment the image. The concept of the watershed algorithm originated from geology. It is also introduced in the
context of mathematical morphology.
Image data can be interpreted as a topographic surface where the gray levels represent altitudes. A catchment basin is defined as a region from which all points flow downhill to a common point. The high gradient regions correspond to the watershed lines, and the low gradient regions correspond to the catchment basins. The picture of a local region where all rainwater flows to a single location might not seem applicable to intensity-based images, but it makes sense when the surface is a gradient magnitude image.
4.5.1 Basic Concept
There are two basic approaches to the watershed image segmentation.
The first starts with finding a downstream path from each pixel of the image to a regional minimum. A regional minimum is defined as a point that has no descending path in its neighborhood. Given two points s1 and s2 of a surface S, a descending path between them can be defined as a sequence {si} of points of S such that
$$\forall\, s_i(x_i, f(x_i)),\ s_j(x_j, f(x_j)): \quad i \ge j \Leftrightarrow f(x_i) \le f(x_j). \tag{4.50}$$
In other words, a point s ∈ S belongs to a minimum if no downstream path starts from s.
A catchment basin is defined as a set of pixels for which their respective downstream paths all end up at the
same altitude minimum. Catchment basins of a topographic surface are homogeneous in the sense that all pixels
in the same catchment basin are connected with the minimum altitude by a simple path of pixels that have
monotonically decreasing altitude. Such catchment basins represent the regions of the segmented image.
However, no rules exist to define the downstream paths uniquely for digital surfaces, while the downstream
paths are easy to determine for continuous altitude surfaces by calculating the local gradients.
The second approach is dual to the first. Instead of identifying the downstream paths, the catchment basins are filled from the bottom [21, 22]. A hole is pierced in each local minimum, and the topographic surface is immersed in water step by step. Wherever two catchment basins would merge as a result of further immersion, a dam is built all the way up to the highest surface altitude. When the flooding reaches the highest level, only the dams remain, and they represent the watershed lines.
From this point in this report, the watershed algorithm is assumed to be this flooding process. More detail about the flooding process will be introduced in Section 4.5.3 and Section 4.5.4.
4.5.2 Geodesic Functions
This section introduces some important mathematical operators for the watershed algorithm.
In the framework of digital pictures, a gray tone image can be represented by a function $f : Z^2 \to Z$. The points of the space $Z^2$ may be the vertices of a square or a hexagonal grid, and f(x) is the gray value of the image at the point x. Every space is assumed to be $Z^2$ from this point on, unless other dimensions are mentioned.
A section of f at level i is defined as
$$X_i(f) = \{x : f(x) \ge i\}, \tag{4.51}$$
and
$$Z_i(f) = \{x : f(x) \le i\}. \tag{4.52}$$
They have a complementary relation as
$$X_i(f) = Z_{i-1}^{c}(f). \tag{4.53}$$
The distance function of every point y of Y to the nearest point of $Y^c$ is
$$\forall y \in Y, \quad d(y) = \mathrm{dist}(y, Y^c), \tag{4.54}$$
where $Y^c$ is the complementary set of Y.
A section of d at level i is given by
$$X_i(d) = \{y : d(y) \ge i\} = Y \otimes B_i, \tag{4.55}$$
where $B_i$ is a disk of radius i, and ⊗ means an erosion [22, 23].
Figure 4-12 An example of a distance function; (a) a binary image, (b) the distance function of (a)
Figure 4-12 is an example of the distance function. A set of points Y and its complementary set Yc are given as the white and black areas in (a). The distance from every point of Y to Yc is presented in Figure 4-12 (b). The brightest area indicates the pixels that have the maximum distance to the complementary set.
Geodesic distance is the distance between two points within the set where the two points belong. The geodesic
distance function, dX(x,y), is defined as the length of the shortest path between x and y, where both points exist in
the set X.
Figure 4-13 An example of geodesic distance function; (a) a set of points X and a point x, (b) geodesic distance function from x within X
Figure 4-13 is an example of the geodesic distance function. The black dot in (a) represents a pixel x, and the white H shape represents the set X. The geodesic distance from the point x to an arbitrary point y in the set X is represented as a gray level in (b), where a brighter value represents a longer distance. The dotted line marks points at the same Euclidean distance from x. Since the paths toward the right part of the H have to take a detour, the geodesic distance to the upper left part of the H is shorter than that to the right part, although the Euclidean distance is the same.
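The detour effect described for Figure 4-13 can be reproduced with a breadth-first search that is only allowed to step through points of X. The following is a minimal sketch, assuming a square grid with 4-connectivity and counting path length in steps; the function name and the small U-shaped set are hypothetical.

```python
from collections import deque


def geodesic_distance(region, start):
    """Return {point: number of 4-connected steps from start}, staying inside region.

    Points of the region that cannot be reached are absent from the result,
    which corresponds to an infinite geodesic distance.
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for neighbour in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if neighbour in region and neighbour not in distance:
                distance[neighbour] = distance[(r, c)] + 1
                queue.append(neighbour)
    return distance


rows = ["#.#",
        "#.#",
        "###"]               # a U-shaped set X; '#' marks points of X
X = {(r, c) for r, line in enumerate(rows) for c, ch in enumerate(line) if ch == "#"}
d = geodesic_distance(X, (0, 0))
print(d[(0, 2)])
```

The two upper corners of the U are only two columns apart, but the path between them has to go around the gap, so the geodesic distance is six steps.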
4.5.3 Reconstruction
Let Y be any set included in X. The set of all points of X at a finite geodesic distance from Y can be computed as
$$R_X(Y) = \{x \in X : \exists y \in Y,\ d_X(x, y) \ne \infty\}. \tag{4.56}$$
RX(Y) is called the X-reconstructed set by the marker set Y [23, 24]. It consists of all the connected components of X that are marked by Y, that is, that contain at least one point of Y.
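Equation (4.56) amounts to a flood fill of X that starts from the marker set. The sketch below illustrates this on a small binary set, assuming 4-connectivity on a square grid; the function name and the example sets are hypothetical.

```python
from collections import deque


def reconstruct(region, marker):
    """Return the points of region at finite geodesic distance from marker."""
    reached = set(marker) & set(region)
    queue = deque(reached)
    while queue:
        r, c = queue.popleft()
        for neighbour in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if neighbour in region and neighbour not in reached:
                reached.add(neighbour)
                queue.append(neighbour)
    return reached


rows = ["##..#",
        "##..#",
        "....#"]             # X has two connected components
X = {(r, c) for r, line in enumerate(rows) for c, ch in enumerate(line) if ch == "#"}
Y = {(0, 0)}                 # the marker touches only the left component
R = reconstruct(X, Y)
print(sorted(R))
```

Only the left component is returned, because the right component contains no point of the marker.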
Gray-scale reconstruction is defined in the same way for two image functions f and g under the condition f ≤ g. The corresponding sections of these two functions at level i are Xi(g) and Xi(f). Since f ≤ g, Xi(f) is included in Xi(g). For every level i, a new set can be obtained by reconstructing Xi(g) using Xi(f) as a marker. The new sets, $R_{X_i(g)}(X_i(f))$, define a stack of nested sections of a new function, which is called the reconstruction of g by f and is denoted as Rg(f). The dual reconstruction of g by f, under the condition f ≥ g, is denoted as R*g(f) [23, 24]. It is obtained by reconstructing the sections Zi(g) using Zi(f) as a marker. The sections Xi and Zi are related by the complement, as in equation (4.53).
Reconstruction and dual reconstruction can be used to extract the regional maxima and minima.
Figure 4-14 Finding regional maxima and minima by reconstruction; (a) function f and f - 1, (b) reconstruction Rf(f-1) and regional maxima kM(f), (c) function f, f + 1, and regional minima km(f), (d) reconstruction R*f(f+1)
In order to find the regional maxima, the functions f and f − 1 are overlapped. Figure 4-14 (a) is a vertical slice of the overlapped functions. Then, the reconstruction of f using f − 1 as a marker is obtained as Rf(f-1). This is the light gray area in Figure 4-14 (b). Since Figure 4-14 is a slice of a two-dimensional gray level image, the actual shape of Rf(f-1) has a volume. The set of regional maxima M(f) can be found from the difference between the function f and Rf(f-1) [23] as
$$M(f) = f - R_f(f - 1). \tag{4.57}$$
M(f) is presented as a set of binary data and as the dark gray area in Figure 4-14 (b):
$$k_{M(f)}(x) = 1, \quad \text{if } x \in M(f), \tag{4.58}$$
$$k_{M(f)}(x) = 0, \quad \text{if } x \notin M(f). \tag{4.59}$$
For the regional minima, the functions f and f + 1 are overlapped as in Figure 4-14 (c). The dual reconstruction of f using f + 1 as a marker is obtained as R*f(f+1). This is the light gray area in Figure 4-14 (d). The set of regional minima m(f) can be found from the difference between R*f(f+1) and f [23] as
$$m(f) = R^{*}_f(f + 1) - f. \tag{4.60}$$
m(f) is presented as a set of binary data in the same way as M(f), and is shown in Figure 4-14 (c):
$$k_{m(f)}(x) = 1, \quad \text{if } x \in m(f), \tag{4.61}$$
$$k_{m(f)}(x) = 0, \quad \text{if } x \notin m(f). \tag{4.62}$$
These sets of regional maxima and minima will be used for markers for the marker-based watershed algorithm.
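The extraction of regional maxima by equation (4.57) can be illustrated on a one-dimensional integer profile, which plays the role of the vertical slice in Figure 4-14. The sketch below computes the reconstruction by repeating a geodesic dilation until nothing changes. This iterative scheme, the three-sample window, and the function names are assumptions of the example, not the implementation used in this report.

```python
def reconstruct_by_dilation(mask, marker):
    """Gray-scale reconstruction of mask from marker (marker <= mask), 1-D case.

    The marker is dilated with a 3-sample flat window and clipped under the
    mask; the two steps are repeated until the result is stable.
    """
    current = list(marker)
    while True:
        dilated = [max(current[max(0, i - 1):i + 2]) for i in range(len(current))]
        clipped = [min(d, m) for d, m in zip(dilated, mask)]
        if clipped == current:
            return current
        current = clipped


def regional_maxima(f):
    """Return the indicator k_M(f): f - R_f(f - 1), for an integer-valued profile."""
    reconstruction = reconstruct_by_dilation(f, [value - 1 for value in f])
    return [value - r for value, r in zip(f, reconstruction)]


f = [1, 3, 3, 1, 2, 5, 2, 1]   # a plateau of height 3 and a single peak of height 5
k_M = regional_maxima(f)
print(k_M)
```

For integer gray levels the difference is 1 on the regional maxima and 0 elsewhere, so it can be used directly as the binary marker set. The regional minima follow in the same way from the dual reconstruction, with f + 1 as the marker and the roles of minimum and maximum exchanged.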
Let Y be composed of n connected components Yi. Then, the geodesic zone of influence of Yi is the set of
points of X that are at a finite geodesic distance from Yi and are closer to Yi than to any other Yj. The geodesic
zone of influence of Yi is denoted as zX(Yi) [22].
$$z_X(Y_i) = \{x \in X : d_X(x, Y_i) \ne \infty,\ \forall j \ne i,\ d_X(x, Y_i) < d_X(x, Y_j)\} \tag{4.63}$$
The union of the zones of influence of Y in X, IZX(Y), is defined as
$$IZ_X(Y) = \bigcup_i z_X(Y_i). \tag{4.64}$$
The geodesic skeleton by zones of influence of Y in X is obtained as the boundaries of zX(Yi) in the set X, and it is denoted as SKIZX(Y) [22]. It is defined as
$$SKIZ_X(Y) = X \setminus IZ_X(Y), \tag{4.65}$$
where \ means the set difference.
Figure 4-15 Geodesic SKIZ of a set Y included in X
In Figure 4-15, the light gray regions are the zones of influence zX(Yi) of Y in X. The narrow region that lies in the upper component of X but in neither zX(Y1) nor zX(Y2) is the SKIZ for the upper area, and the region that lies in the lower component of X but in neither zX(Y3) nor zX(Y4) is the SKIZ for the lower area.
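Equations (4.63) to (4.65) can be followed literally on a small grid. The sketch below computes the geodesic distance from every marker component separately and then compares the distances point by point. It assumes 4-connectivity and step-count distances, and all names are hypothetical.

```python
from collections import deque


def geodesic_distance_from_set(region, sources):
    """Return {point: steps to the nearest source}, moving only inside region."""
    distance = {p: 0 for p in sources if p in region}
    queue = deque(distance)
    while queue:
        r, c = queue.popleft()
        for neighbour in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if neighbour in region and neighbour not in distance:
                distance[neighbour] = distance[(r, c)] + 1
                queue.append(neighbour)
    return distance


def zones_of_influence(region, markers):
    """Return ([z_X(Y_i) for each marker], SKIZ_X(Y)) following (4.63)-(4.65)."""
    infinity = float("inf")
    distances = [geodesic_distance_from_set(region, m) for m in markers]
    zones = [set() for _ in markers]
    for x in region:
        for i, d_i in enumerate(distances):
            # finite distance to Y_i and strictly closer to Y_i than to any other Y_j
            if x in d_i and all(d_i[x] < d_j.get(x, infinity)
                                for j, d_j in enumerate(distances) if j != i):
                zones[i].add(x)
    skiz = set(region) - set().union(*zones)
    return zones, skiz


X = {(0, c) for c in range(5)}            # a single row of five points
Y = [{(0, 0)}, {(0, 4)}]                  # two marker components at the ends
zones, skiz = zones_of_influence(X, Y)
print(zones, skiz)
```

The middle point is equidistant from both markers, so it belongs to neither zone and forms the SKIZ.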
The watershed transformation by flooding may be directly transposed into the method using the sections of the
function f. Figure 4-16 is the topological interpretation of Figure 4-15.
Figure 4-16 Watershed construction using a geodesic SKIZ¹
In Figure 4-16 (a), Zi(f) is the section of f at the level i, and the flood has reached the level i. In the next step, Zi+1(f) is flooded within the zones of influence of the connected components of Zi(f). The SKIZ, which consists of the points of Zi+1(f) that do not belong to any of these zones of influence, remains as a result of the flooding, as shown in (b). Connected components of Zi+1(f) that have not been reached by the flood are, by definition, minima at the level i + 1. This is the white area in (a). These minima are added to the flooded area in (c).
The section at the level i + 1 of the catchment basins of f is obtained by
$$W_{i+1}(f) = \left[ IZ_{Z_{i+1}(f)}(Z_i(f)) \right] \cup m_{i+1}(f), \tag{4.66}$$
where mi(f) denotes the minima of the function at the level i [22].
In Figure 4-16, $IZ_{Z_{i+1}(f)}(Z_i(f))$ is the gray area in (b) excluding the SKIZ.
The minima at level i + 1 are given by
$$m_{i+1}(f) = Z_{i+1}(f) \setminus R_{Z_{i+1}(f)}(Z_i(f)), \tag{4.67}$$
where $R_{Z_{i+1}(f)}(Z_i(f))$ is the reconstruction of $Z_{i+1}(f)$ using $Z_i(f)$ as a marker.
In Figure 4-16, Wi+1(f) is the gray area in (c) excluding the boundary and the SKIZ.
This iterative algorithm is initiated with $W_{-1}(f) = \emptyset$. At the end of the process, the watershed line DL(f) is equal to the complement of the highest section of the catchment basins [22], and is defined as
$$DL(f) = W_N^{c}(f), \tag{4.68}$$
where N = max(f).
The watershed line for Figure 4-16 is the boundary line of (c) including the SKIZ.
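The level-by-level construction of equations (4.66) to (4.68) can be sketched as follows. This is an illustration under several assumptions: the image is a dictionary of integer levels on a 4-connected grid, the zones of influence at each level are grown from the basins labelled so far (one way to iterate the two-level description over all levels), and points that are equidistant from two basins are left unlabelled as SKIZ. All names are hypothetical.

```python
from collections import deque

NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def distances_within(region, sources):
    """Breadth-first geodesic distance (in steps) from sources inside region."""
    distance = {p: 0 for p in sources}
    queue = deque(sources)
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBOURS:
            neighbour = (r + dr, c + dc)
            if neighbour in region and neighbour not in distance:
                distance[neighbour] = distance[(r, c)] + 1
                queue.append(neighbour)
    return distance


def watershed_by_flooding(f):
    """f: {(row, col): integer level}. Returns (list of basins, watershed line)."""
    infinity = float("inf")
    basins = []                                   # W starts empty
    for level in range(min(f.values()), max(f.values()) + 1):
        section = {p for p, value in f.items() if value <= level}   # Z_level(f)
        distances = [distances_within(section, basin) for basin in basins]
        grown = [set() for _ in basins]
        unreached = set()
        for p in section:
            d = [dist.get(p, infinity) for dist in distances]
            best = min(d, default=infinity)
            if best == infinity:
                unreached.add(p)                  # belongs to a new minimum
            elif d.count(best) == 1:
                grown[d.index(best)].add(p)       # strictly closest basin wins
            # otherwise the point is equidistant: it stays in the SKIZ
        basins = grown
        while unreached:                          # one new basin per component
            seed = next(iter(unreached))
            component = set(distances_within(unreached, [seed]))
            basins.append(component)
            unreached -= component
    flooded = set().union(*basins)
    return basins, set(f) - flooded               # DL(f) = complement of W_N


f = {(0, c): value for c, value in enumerate([0, 1, 2, 1, 0])}
basins, line = watershed_by_flooding(f)
print(basins, line)
```

On this profile with two minima separated by a ridge, the two basins grow toward each other and the ridge point remains as the watershed line. The sketch recomputes all distances at every level, which is far too slow for real images but keeps the correspondence with the equations visible.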
4.5.4 General Watershed vs. Marker-Based Watershed
This section applies the general watershed algorithm and the marker-based algorithm to an image and
compares the results.
¹ We perform flooding only on two levels, from i to i + 1, for convenience.
The gradient image is often used for the flooding object in the watershed transformation, because the main
criterion for segmentation is the homogeneity of gray value.
Figure 4-17 An example of watershed segmentation; (a) original image, (b) morphological gradient, (c) segmented result
Figure 4-17 (a) is a synthesized image consisting of a few bright circles. The inside background has some gray
values, but the outside background has almost zero gray value. The goal is to separate these circles from the
background. The morphological gradient of (a) by equation (4.35) using the octagonal shaped SE sized 7×7 is
obtained as (b). The gradient image has high gray values on the edges between the circles and the background.
These edges will be the watershed lines during the flooding process.
Figure 4-18 Topological view of Figure 4-17; (a) original surface, (b) gradient surface
Figure 4-18 shows a topological view of the previous example. The bright circles in Figure 4-17 (a) can be
interpreted as hills in a topological manner shown in Figure 4-18 (a). The edge lines of Figure 4-17 (a) are
located in a high level in Figure 4-18 (b), the topological view of the gradient image. The plain area in Figure
4-17 (a) is located in sinks in Figure 4-18 (b). So, the edge and plain area of the original image can be
interpreted as local maxima and minima in the gradient image.
The flooding process begins from the local minima in Figure 4-18 (b). Each minimum mi(f) of the topographic surface is pierced, and the whole surface is plunged into a lake at a constant vertical speed. Wherever floods from different sources are about to merge, a dam is built to keep them apart. At the end of the flooding, only the dams remain. These dams define the watershed of the function f and separate the catchment basins, each of which contains exactly one minimum. The flooding is applied not to the original image but to the gradient image. In Figure 4-17 (c), this process produced too many watershed lines. This is over-segmentation, an undesired result.
Over-segmentation is a serious drawback of watershed segmentation. It occurs because every local minimum is treated as the source of a catchment basin, although many of these minima are produced by small variations or noise and are not significant. The marker-based watershed algorithm is a method to overcome this over-segmentation problem [22, 23, 24].
The main difference between the general flooding algorithm and the marker-based algorithm is the flooding source. The marker-based watershed flooding grows only from the markers, while the general watershed flooding grows from every local minimum. If enough information about the object image is available, optimal markers can be selected manually. Otherwise, the regional maxima and the SKIZ obtained by flooding from the regional maxima are usually used as markers. Figure 4-19 shows the process of the marker-based watershed algorithm.
Figure 4-19 Process of the marker-based watershed algorithm; (a) markers with the original image, (b) markers
with the gradient image, (c) segmented result
The first step for this work is to find the inside markers by finding the regional maxima of the original image.
The second step is to find the outside markers by finding the SKIZ of each catchment basin. The SKIZ will
correspond to the borders, originating in the regional maxima. This process is performed by flooding the
inverted original image from the inside markers. Figure 4-19 (a) shows both the inside and outside markers in
the original image. This is the result of the first flooding. The final step is to flood the gradient image from both
the inside and outside markers. (b) shows markers in the gradient image, and (c) shows the segmented result.
The following images are the topological view.
Figure 4-20 Topological view of the marker-based watershed algorithm; (a) markers on the original surface, (b) the final watershed lines created on the gradient surface
Figure 4-20 (a) is the topological surface of Figure 4-19 (a). The peaks on each hill are the inside markers, and the SKIZ (the borders of each hill region) become the outside markers. Although the outside markers may seem messy in the center area, the minor SKIZ branches have little effect on the flooding result, as shown in (b).
The marker problem, namely how to find markers and how to decide which markers are significant, turns out to be the most important issue. The size of a marker does not affect its performance; only its existence matters. Some of the inside markers in Figure 4-19 (a), (b) and Figure 4-20 (a) are extremely small, but they play the same role for their catchment basins as the other inside markers.
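The effect of restricting the flooding sources to the markers can be illustrated with a priority-queue flooding, which is a common way to implement marker-based watershed and is not necessarily the implementation used for Figure 4-19. In this sketch the pixel with the lowest gradient value is always flooded next and passes its label on to its unlabelled neighbours. The result is a full partition without explicit watershed-line pixels. The function name and the one-row gradient are hypothetical.

```python
import heapq


def marker_watershed(gradient, markers):
    """Flood gradient ({point: value}) from markers ({point: label}).

    Returns {point: label} for every point reachable from a marker.
    """
    labels = dict(markers)
    heap = []
    counter = 0                      # tie-breaker: earlier pushes are flooded first
    for point in markers:
        heapq.heappush(heap, (gradient[point], counter, point))
        counter += 1
    while heap:
        _, _, (r, c) = heapq.heappop(heap)
        for neighbour in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if neighbour in gradient and neighbour not in labels:
                labels[neighbour] = labels[(r, c)]
                heapq.heappush(heap, (gradient[neighbour], counter, neighbour))
                counter += 1
    return labels


# One row of a gradient image: a high ridge at column 2 and a small bump at column 5.
gradient = {(0, c): value for c, value in enumerate([0, 2, 9, 2, 0, 1, 0])}
markers = {(0, 0): 1, (0, 4): 2}     # two markers; the minimum at column 6 has none
labels = marker_watershed(gradient, markers)
print([labels[(0, c)] for c in range(7)])
```

The unmarked minimum at column 6 does not create a region of its own and is absorbed by label 2, which is how markers suppress over-segmentation. Even a single-pixel marker is enough to claim a basin, in line with the remark that only the existence of a marker matters.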
Chapter 5. Proposed Method – The Selection Rule
This chapter proposes a method of finding actual targets among a set of candidates obtained from the segmented features. The method is empirical rather than theoretical, and it is intended as a tool for IR data processing.
This chapter explains how to obtain candidates and how to make a selection from the candidates. Two
examples with synthetic data are presented in this chapter, and the experiments with actual mine data will be
presented in Chapter 6.
5.1 Candidates
Before any decision can be made, a set of candidates has to be built. The candidates are obtained from the segmented feature data. Following the concept of dynamic thermography, the feature data can be extracted by KLT from an IR image sequence, and the watershed processing can perform the segmentation. Dynamic thermography was introduced in Section 3.2.2, and KLT in Section 4.2.1.
There are several ways to apply KLT to an IR image sequence.
A simple way is to apply KLT to the entire image sequence as one packet and to extract only the first transformed image. The first transformed image becomes the feature data, but a single image rarely provides enough information.
The preferred way is to apply KLT in the same way but to extract the second transformed image as well as the first. In most cases, these two images provide enough information to find one or two targets.
If the first two images cannot provide enough information, more transformed images have to be extracted. In this case, a statistical method to distinguish the mine target from false alarms is required. It should also be noted that the higher order transformed images carry less information than the lower order images.
After feature extraction, segmentation is performed. The marker-based watershed algorithm will be used for
the experiments in Chapter 6. After segmentation, a set of candidates is selected according to their size or shape
from the segmented image set.
5.2 Selection Rule
The term ‘selection’ is used instead of ‘detection’ because the target is actually selected from a set of candidates.
The goal of this selection process is to distinguish mine targets from false alarms. Unfortunately, there is no straightforward rule to decide which candidate is a mine target and which is a false alarm. The idea is that mine targets should be detected in every feature image, while false alarms should appear only once. This assumption is similar to that of KLT in Section 4.2.1. The mine target is selected by the existence of candidates at approximately the same location in each feature image.
In the simple case, mine targets are shown in both feature images, while false alarms are shown in only one of them. A candidate that appears in both feature images is then considered a mine target. Mathematically, this is the logical AND of the sets of candidates.
Figure 5-1 An example of a simple selection; (a) candidates in the 1st feature image, (b) candidates in the 2nd feature image, (c) selected target
Figure 5-1 shows an example of the simple case, consisting of two feature images and one target. The first feature image has two candidates in (a), {A, B}, and the second feature image has three candidates in (b), {C, D, E}. Since candidates A and C exist at similar locations in the two images, they are regarded as the same candidate. The exact location or size is not important, because this is feature data, a set of candidates, rather than raw sensor data, a set of image pixels. The only important information is the existence of candidates at approximately the same location. The two sets of candidates thus become {A, B} and {A, D, E}, and candidate A is selected as the mine target in (c). The strength of the signal, corresponding to the size of each candidate, is not considered either. Candidate B has a strong signal in (a), but it is not considered a mine target because it does not exist in (b). The mathematical expression can be written as
$$f(\{A, B\}, \{C, D, E\}) = \{A, B\} \cap \{C, D, E\} \cong \{A\}, \quad \text{if } \{C\} \approx \{A\}, \tag{5.1}$$
where f(•) is the decision function.
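The simple rule of equation (5.1) can be sketched as a matching of candidate locations followed by an intersection. The example assumes that every candidate is summarized by a centroid and that two candidates are the same object when their centroids lie within a tolerance. The tolerance, the coordinates, and the function name are hypothetical and only mimic the layout of Figure 5-1.

```python
def select_common(candidates_a, candidates_b, tolerance):
    """Return the names in candidates_a that have a nearby partner in candidates_b.

    Both arguments map a candidate name to its (x, y) centroid.
    """
    selected = []
    for name_a, (xa, ya) in candidates_a.items():
        for xb, yb in candidates_b.values():
            if (xa - xb) ** 2 + (ya - yb) ** 2 <= tolerance ** 2:
                selected.append(name_a)      # found in both feature images
                break
    return selected


feature_1 = {"A": (30, 40), "B": (90, 20)}
feature_2 = {"C": (32, 38), "D": (60, 80), "E": (10, 100)}
targets = select_common(feature_1, feature_2, tolerance=5)
print(targets)
```

Candidate C lies close to A, so A is selected, while B has no partner in the second feature image and is treated as a false alarm regardless of its size.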
A more complicated case arises when many feature images are provided but the mine targets do not appear in all of them because the target signal is weak relative to the background. This situation can happen frequently, because most APM, made of plastic or wood, do not have distinctive characteristics. In this case, the simple rule cannot distinguish mine targets from false alarms, and the statistical probability should be used as the decision method.
Figure 5-2 An example of a complicated selection; (a) candidates in the 1st feature image, (b) candidates in the 2nd feature image, (c) candidates in the 3rd feature image, (d) two selected mine targets
Figure 5-2 shows an example of the weak signal case. Three sets of candidates, {A, B}, {C, D, E}, {F, G, H},
are provided. If any candidates seem to be the same object by matching the entire set, they are marked as the
same candidate. Then, A ≈ C ≈ F and E ≈ H. The final independent set of candidates can be counted as shown in
Table 5-1.
Candidates      A     B     D     E     G
Feature 1 (a)   O     O     X     X     X
Feature 2 (b)   O     X     O     O     X
Feature 3 (c)   O     X     X     O     O
Probability     1     1/3   1/3   2/3   1/3
Decision (d)    O     X     X     O     X
Table 5-1 Statistical probability of candidates in Figure 5-2
Since candidate A is found in all feature images, A is selected as a mine target. Candidate E is found twice, and the other candidates are found only once. Whether twice-found candidates should be considered mine targets or false alarms has to be decided with the help of statistics from previous experience. Regarding the twice-found candidates as mine targets and the once-found candidates as false alarms, two mine targets are selected as shown in Figure 5-2 (d).
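The counting behind Table 5-1 can be written down directly. The sketch below assumes that the candidates have already been matched across feature images, so that A stands for A ≈ C ≈ F and E for E ≈ H, and it uses a threshold of 2/3 to reproduce the decision of Figure 5-2 (d). The function name and the threshold parameter are hypothetical.

```python
from fractions import Fraction


def select_by_frequency(feature_sets, threshold):
    """Return ({candidate: fraction of feature images containing it}, selected list)."""
    n = len(feature_sets)
    names = sorted(set().union(*feature_sets))
    probability = {name: Fraction(sum(name in s for s in feature_sets), n)
                   for name in names}
    selected = [name for name in names if probability[name] >= threshold]
    return probability, selected


# Candidates per feature image after matching, as in Table 5-1
features = [{"A", "B"}, {"A", "D", "E"}, {"A", "E", "G"}]
probability, selected = select_by_frequency(features, Fraction(2, 3))
print(probability, selected)
```

With a threshold of 1 the rule reduces to the logical AND of the simple case, so the threshold expresses how much prior experience is trusted when a target is missing from some feature images.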
Candidates located on the boundary of feature images are sometimes ambiguous. The possibility of false alarms from boundary candidates is high, because filtering and segmentation cannot be performed beyond the boundary. Boundary candidates are therefore undesirable, but acquiring multiple sets of raw data from several positions can keep the target away from the boundary. It is recommended to disregard the boundary candidates during the selection procedure in order to reduce the number of false alarms.¹
As previously mentioned, there is no straightforward rule to confirm the existence of mine targets from an ambiguous signal, and an automatic selection algorithm has not been provided yet. A reasonable selection has to be supported by an optimal choice of the feature images and by proper segmentation prior to the selection process.
¹ The experiments in Chapter 6 regard the boundary candidates as valid candidates. Therefore, some false alarms are expected.
Chapter 6. Experiments
The objective of this chapter is to implement the proposed method and the image processing techniques introduced in this report and to test them with actual mine data. Most of the IR related image processing methods introduced in Section 3.2 and Chapter 4 are involved. In order to show the result of each image processing method, every step is presented in the form of an image, even though this may seem repetitive.
The data source and detailed information of each data set are described before each experiment summary. Most data for these experiments were downloaded from the signature databases managed by the Joint Research Center¹ and the Unexploded Ordnance Center².
In Section 6.1, the entire procedure is briefly described. The actual experiments will be presented in the
following sections as case studies.
6.1 Procedure
Only the IR image sequences are considered as feature data because the targets are buried underground. The target signal is very weak in this situation, so there is no guarantee that the targets will be discovered by a single measurement. The presence of a target is very ambiguous in the raw data set, and sometimes noise is even more dominant than the target signal. Thus, the correlation or covariance of pixels across an image sequence is more useful as feature data than the pixel values of a single image.
The proposed application consists of five stages.
At the first stage, the feature images are extracted from the given image sequence by KLT, introduced in
Section 4.2.1. Then, linear stretching or histogram equalization, introduced in Section 4.4.2, enhances the
contrast. Since smoothly varied contrast is desired for segmentation, linear stretching would generally be used
unless a robust feature is necessary.
At the second stage, ASF, introduced in Section 4.3.3, is applied to the result of the first stage to remove small
clutter and noise. Since the filtering intends to squash peaks and valleys at this stage, critical filtering may be
applied to achieve a feature image with a homogeneous gray level variation. This homogeneous variation is
required by the segmentation in the next stage.
At the third stage, the marker-based watershed algorithm, introduced in Section 4.5.4, is applied for the segmentation. The morphological image functions in the second and third stages use the circle shaped SE, introduced in Section 4.3, because mine targets are expected to have a round shape. The expected result is a set of segmented parts in the form of an image.
At the fourth stage, the segmented sets are labeled for the discrimination process. If a labeled set satisfies the conditions for a candidate, it is assigned as a candidate. Although there is no definite rule, some criteria can be derived from the properties of the targets. Firstly, the shape of a mine target is expected to be round in most cases. Secondly, some sets can be excluded by their size if the actual scale of the image is provided.
At the final stage, the candidates are matched with each other and selected as mine targets by the selection rule introduced in Section 5.2. The goal of this work is to discriminate the mine targets from false alarms. The desired result is two distinct sets, false alarms and mine targets.
¹ Joint Research Center: the European Union’s scientific and technical research laboratories, located in Belgium, Germany, Italy, the Netherlands, and Spain. They manage a landmine signature database at http://apl-database.jrc.it .
² Unexploded Ordnance Center: mine detection research organization sponsored by the Department of Defense, USA.
6.2 Case Study 1
The Royal Military Academy (RMA) built a test minefield in Meerdael, Belgium. They performed a measurement campaign with the Belgian Army, and JRC manages the resulting signature database. Two kinds of IR sensors were used, AGEMA and TICM2. The AGEMA sensor operates in the 3 ~ 5 µm band, and the TICM2 in the 8 ~ 12 µm band. Various types of soil environments were measured, and two of them, gravel and sand, are tested in this report. The ground truth data were not provided with the database, but the literature published by RMA [25] reports the approximate location and the number of mines.
Table 6-1 profiles the site specifications of the data for case studies 1 to 3.
Collector   Minefield Location   Soil Condition   Sensor Type
RMA         Meerdael, Belgium    Gravel, Sand     AGEMA (3-5µm), TICM2 (8-12µm)
Table 6-1 Site specifications of Meerdael test minefield [30]
The data of case study 1 were collected with the AGEMA sensor at the gravel field. Figure 6-1 shows sampled images, and Table 6-2 profiles the data specification. The data set consists of 48 images of size 256×256, taken at intervals of 30 minutes during a 24 hour period. The cell shaped texture comes from the gravel.
Number of Targets   Date and Time                           Number of Frames
2                   April 2, 11:50 ~ April 3, 11:30, 1998   48 (1 per 30 minutes)
Table 6-2 Data specification acquired with the AGEMA sensor at the gravel field [30], [25]
Figure 6-1 Sampled images from the data set acquired with the AGEMA sensor at the gravel field [30]; images taken at (a) 17:47 (b) 23:47 (c) 05:50 (d) 11:30
Figure 6-2 Procedure for the 1st feature; (a) 1st transformed image by KLT, (b) filtered image by the white ASF with SE sized from 3×3 to 35×35, (c) inside and outside markers, (d) gradient image of (b)¹
Figure 6-3 Procedure for the 2nd feature; (a) 2nd transformed image by KLT, (b) filtered image by the white ASF with SE sized from 3×3 to 35×35, (c) inside and outside markers, (d) gradient image of (b)
Since it is very difficult to extract useful information from this raw data set, feature extraction is required. KLT
was used to extract feature images, and linear stretching was used for post processing to enhance the contrast.
The first and second transformed images are selected as feature data, as shown in Figure 6-2 (a) and Figure 6-3
(a). Then, ASF is applied to the two transformed images to remove noise and squash the gravel texture. The
white ASF with the round shaped SE sized from 3×3 to 35×35 results in two images, Figure 6-2 (b) and Figure
6-3 (b). The double marker-based watershed requires two sets of markers, inside and outside. The inside
markers are found as the regional minima of the filtered image. The outside markers are found by the watershed
process of the filtered image, flooded from the inside marker. The white area indicates the inside markers, and
the lines indicate the outside markers in Figure 6-2 (c) and Figure 6-3 (c). The morphological gradient of the
filtered image is calculated as Figure 6-2 (d) and Figure 6-3 (d).
¹ The contrast of the gradient image is enhanced to show the feature clearly. The actual gradient data used in the experiment is not enhanced. This is also true for the following experiments.
Figure 6-4 Results from the 1st feature; (a) segmented result, (b) labeled set, (c) selected candidates
Figure 6-5 Results from the 2nd feature; (a) segmented result, (b) labeled set, (c) selected candidates
Now the feature data are ready for segmentation. Flooding the gradient image from the markers segments the
object images as shown in Figure 6-4 (a) and Figure 6-5 (a). The segmented sets are labeled as shown in
Figure 6-4 (b) and Figure 6-5 (b). Although there is no definite decision rule, candidates can be discriminated
by size and shape within the segmented set. A potential candidate is excluded from the final candidates if it is
too big, too small, or too far from a round shape. The final set of candidates for each feature image is shown in
Figure 6-4 (c) and Figure 6-5 (c). Counting from left to right and top to bottom, the candidates are arranged as
two sets, {A, B, C, D, E, F} in Figure 6-4 (c) and {G, H, I, J} in Figure 6-5 (c). Since candidates from both sets
appear at approximately the same locations, we can determine that B ≈ G, D ≈ H, and E ≈ J. These candidates are
selected as the mine targets in Figure 6-6 (a).
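The size and shape screening described above could be sketched as follows. The report states that no definite rule exists, so the roundness measure, the thresholds, the function name `select_candidates`, and the test image are all assumptions chosen for illustration. Roundness here is the region area divided by the area of a circle whose diameter equals the longer side of the bounding box.

```python
import numpy as np

def select_candidates(labels, min_area, max_area, min_roundness):
    """Return the labels of regions that pass the area and roundness limits."""
    selected = []
    for lab in np.unique(labels):
        if lab == 0:                       # label 0 is treated as background
            continue
        ys, xs = np.nonzero(labels == lab)
        area = ys.size
        # Longer bounding-box side, used as the diameter of a reference circle.
        diameter = max(ys.max() - ys.min(), xs.max() - xs.min()) + 1
        roundness = 4.0 * area / (np.pi * diameter ** 2)   # close to 1 for a disk
        if min_area <= area <= max_area and roundness >= min_roundness:
            selected.append(int(lab))
    return selected

# Hypothetical labeled image with four regions (not data from the report).
yy, xx = np.mgrid[0:60, 0:60]
labels = np.zeros((60, 60), dtype=int)
labels[(yy - 10) ** 2 + (xx - 10) ** 2 <= 6 ** 2] = 1     # medium disk
labels[20:22, 0:15] = 2                                   # thin bar, not round
labels[2:4, 50:52] = 3                                    # tiny blob, too small
labels[(yy - 38) ** 2 + (xx - 38) ** 2 <= 20 ** 2] = 4    # large disk, too big

kept = select_candidates(labels, min_area=20, max_area=500, min_roundness=0.6)
```

Only the medium disk survives in this example. In practice the limits would have to be tuned to the expected mine size at the given imaging distance.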
Figure 6-6 The mine targets; (a) selected mine targets, (b) actual mine targets [25], (c) false alarm
Figure 6-6 (a) presents the selected mine targets, and (b) shows the actual mine targets from the literature
published by RMA. Two mines were found successfully, but a false alarm occurred as shown in (c).
6.3 Case Study 2
The data for case study 2 were collected with the same sensor as in case study 1, but the soil condition was sand.
Figure 6-7 shows some sampled images, and Table 6-3 profiles the data specifications. The data set consists of
44 images of size 256×256, taken at 30-minute intervals over a 24-hour period. This data set has a smoother
texture than the previous gravel case.
Number of Targets | Date and Time | Number of Frames¹
1 | April 1, 13:08 ~ April 2, 11:04, 1998 | 44 (1 per 30 minutes)
Table 6-3 Data specifications acquired with the AGEMA sensor at the sand field [30], [25]
¹ Some poor images were eliminated.
Figure 6-7 Sampled images from the data set acquired with the AGEMA sensor at the sand field [30]; images taken
at (a) 17:38, (b) 23:05, (c) 04:35, (d) 11:04
Figure 6-8 Procedure for the 1st feature; (a) 1st transformed image by KLT, (b) filtered image by the white ASF with
SE sized from 3×3 to 35×35, (c) inside and outside markers, (d) gradient image of (b)
Figure 6-9 Procedure for the 2nd feature; (a) 2nd transformed image by KLT, (b) filtered image by the white ASF
with SE sized from 3×3 to 35×35, (c) inside and outside markers, (d) gradient image of (b)
The application procedure is the same as previously stated. Two feature images were extracted by KLT, as
shown in Figure 6-8 (a) and Figure 6-9 (a). The white ASF, with the round SE sized from 3×3 to 35×35, was
applied to the two KLT transformed images, giving Figure 6-8 (b) and Figure 6-9 (b). The inside and outside
markers were found as shown in Figure 6-8 (c) and Figure 6-9 (c). Figure 6-8 (d) and Figure 6-9 (d) are the
gradient images.
Watershed segmentation then produces Figure 6-10 (a) and Figure 6-11 (a).
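For reference, the morphological gradient used as the flooding surface can be sketched as the difference between a dilation and an erosion. The flat 3×3 structuring element, the edge-replicating border handling, and the function name `morphological_gradient` are assumptions for this example. The report does not state which SE was used for the gradient.

```python
import numpy as np

def morphological_gradient(img):
    """Dilation minus erosion with a flat 3x3 structuring element."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")       # replicate border pixels
    # The nine shifted views cover every position of the 3x3 neighborhood.
    windows = [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    return np.maximum.reduce(windows) - np.minimum.reduce(windows)

# A vertical step edge: the gradient is non-zero only next to the step.
step = np.zeros((6, 6))
step[:, 3:] = 10.0
grad = morphological_gradient(step)
```

The gradient is large along object boundaries and flat elsewhere, which is why flooding it from the markers places the watershed lines on the object contours.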
Figure 6-10 Results from the 1st feature; (a) segmented result, (b) labeled set, (c) selected candidates
Figure 6-11 Results from the 2nd feature; (a) segmented result, (b) labeled set, (c) selected candidates
The segmented sets are labeled as shown in Figure 6-10 (b) and Figure 6-11 (b). Figure 6-10 (c) and Figure
6-11 (c) show the candidates selected from the labeled sets.
Counting from left to right and top to bottom in Figure 6-10 (c) and Figure 6-11 (c), the candidate sets are
{A, B, C, D, E, F} for the first feature and {G, H} for the second feature. Matching the candidates by
approximate location, we can determine that D ≈ G. The selected mine target matches the actual mine target
reported in the related literature [25], without any false alarm.
Figure 6-12 The selected mine target without any false alarm
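The matching of candidates by approximate location can be sketched as a nearest-neighbor test with a distance tolerance. The candidate names follow the text above, but the centroid coordinates and the tolerance of 10 pixels are invented placeholders, not measured values. The function name `match_by_location` is likewise hypothetical.

```python
from math import hypot

def match_by_location(first, second, max_dist):
    """Pair each candidate in `first` with the nearest candidate in `second`
    lying within `max_dist` pixels; candidates without a partner are dropped."""
    pairs = []
    for name1, (x1, y1) in first.items():
        best_name, best_dist = None, None
        for name2, (x2, y2) in second.items():
            dist = hypot(x1 - x2, y1 - y2)
            if dist <= max_dist and (best_dist is None or dist < best_dist):
                best_name, best_dist = name2, dist
        if best_name is not None:
            pairs.append((name1, best_name))
    return pairs

# Placeholder centroids (x, y) in pixels, invented for illustration only.
first_feature = {"A": (40, 30), "B": (200, 45), "C": (90, 110),
                 "D": (128, 130), "E": (30, 200), "F": (220, 210)}
second_feature = {"G": (131, 127), "H": (60, 60)}

matches = match_by_location(first_feature, second_feature, max_dist=10)
```

With these placeholder coordinates the only pair is D and G. The sketch does not enforce one-to-one matching, which would matter if two candidates of one set fell close to the same candidate of the other.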
6.4 Case Study 3
The data for case study 3 were collected with the TICM2 sensor at the gravel field. Figure 6-13 shows some
sampled images, and Table 6-4 profiles the data specifications.
Figure 6-13 Sampled images from the data set with the TICM2 sensor at the gravel field [30]; images taken at (a)
17:30, (b) 23:15, (c) 05:00
The data set consists of 94 images of size 520×340, taken at 15-minute intervals over a 24-hour period.
The texture is similar to that of case study 1.
Number of Targets | Date and Time | Number of Frames
2 | April 2, 11:45 ~ April 3, 11:00, 1998 | 94 (1 per 15 minutes)
Table 6-4 Data specifications at the gravel field with the TICM2 sensor [30], [25]
Figure 6-14 Procedure for the 1st feature; (a) 1st transformed image by KLT, (b) filtered image by the white ASF
with SE sized from 3×3 to 55×55, (c) inside and outside markers
Figure 6-15 Procedure for the 2nd feature; (a) 2nd transformed image by KLT, (b) filtered image by the white ASF
with SE sized from 3×3 to 55×55, (c) inside and outside markers
After KLT application, Figure 6-14 (a) and Figure 6-15 (a) were extracted as the first and second transformed
images. Since the gravel texture dominates the overall image, the feature images need critical filtering.
The white ASF, with the round SE sized up to 55×55, filtered the object images into Figure 6-14 (b) and
Figure 6-15 (b). The inside markers of the first feature were obtained from the regional maxima, while those of
the second feature were obtained from the regional minima. This is because the relative intensity of the target
signal against the background is opposite in the first and second feature images. The markers are presented in
Figure 6-14 (c) and Figure 6-15 (c).
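The filtering with a growing structuring element can be sketched as below. This is only an illustration under stated assumptions: the white ASF is taken here to mean an opening followed by a closing at each SE size, and the SE is a digital disk whose radius grows by one pixel per step (3×3, 5×5, and so on, so that 55×55 corresponds to a radius of 27). The definition given earlier in the report takes precedence if it differs. The helper names are hypothetical and the code is not optimized for large SEs.

```python
import numpy as np

def disk_offsets(radius):
    """Pixel offsets of a digital disk of the given radius."""
    r = radius
    return [(dy, dx) for dy in range(-r, r + 1) for dx in range(-r, r + 1)
            if dy * dy + dx * dx <= r * r]

def _flat_filter(img, radius, reducer):
    """Flat erosion (reducer=np.minimum) or dilation (reducer=np.maximum)."""
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    out = None
    for dy, dx in disk_offsets(radius):
        win = padded[radius + dy:radius + dy + h, radius + dx:radius + dx + w]
        out = win.copy() if out is None else reducer(out, win)
    return out

def opening(img, radius):
    return _flat_filter(_flat_filter(img, radius, np.minimum), radius, np.maximum)

def closing(img, radius):
    return _flat_filter(_flat_filter(img, radius, np.maximum), radius, np.minimum)

def white_asf(img, max_radius):
    """Alternate opening and closing while the SE radius grows from 1."""
    out = np.asarray(img, dtype=np.float64)
    for radius in range(1, max_radius + 1):
        out = closing(opening(out, radius), radius)
    return out

# Hypothetical test image: a bright disk with a one-pixel hole, plus a bright speck.
yy, xx = np.mgrid[0:40, 0:40]
img = np.zeros((40, 40))
img[(yy - 20) ** 2 + (xx - 20) ** 2 <= 10 ** 2] = 100.0
img[20, 20] = 0.0     # small dark hole inside the object
img[3, 3] = 100.0     # small bright speck in the background
filtered = white_asf(img, max_radius=2)
```

In this example the speck and the hole are both removed while the large disk remains, which mirrors the goal of suppressing fine texture while keeping target-sized structures.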
Figure 6-16 Results from the 1st feature; (a) segmented result, (b) labeled set, (c) selected candidates
Figure 6-17 Results from the 2nd feature; (a) segmented result, (b) labeled set, (c) selected candidates
The watershed flooding segmented the object images into Figure 6-16 (a) and Figure 6-17 (a). Figure 6-16 (b)
and Figure 6-17 (b) show the labeled sets, and Figure 6-16 (c) and Figure 6-17 (c) show the selected candidates.
Counting from left to right and top to bottom, the candidates were obtained as two sets, {A, B, C, D, E, F} for
Figure 6-16 and {G, H, I} for Figure 6-17. Matching the two sets by approximate location, we can determine
that C ≈ G and D ≈ I. Thus, these two candidates were selected as the mine targets, as shown in Figure 6-18 (a).
However, the actual mine targets are G and H, as shown in (b): target H was missed, and candidate I turned out
to be a false alarm.
Figure 6-18 The mine targets; (a) selected mine targets, (b) actual mine targets [25], (c) failed mine target
This is an example of a failed case. Figure 6-18 (c) shows the missed mine target. The failure occurred because
the first transformed image of KLT did not show the existence of the missed mine; only the second transformed
image did. This illustrates that the first transformed KLT image cannot guarantee discovery of mine targets even
though that image shows the strongest target signal. Sometimes the higher-order features are as important as, or
even more important than, the lower-order features.
According to the literature, detected target G was buried recently, but undetected target H had been buried for a
long time [25]. For target H, the surface effect had disappeared by the time the measurement was performed. The
surface effect, introduced in Section 3.2.2, is caused by soil disturbance when the ground is dug to bury the mine.
6.5 Case Study 4
The previous cases were simple, with one or two mines, but this case is considerably more complicated.
A commercial company, E-OIR, performed a measurement campaign with its sensor at the test minefield in
Fort Belvoir, Virginia.
Table 6-5 profiles the site specifications. The soil condition was earth instead of gravel or sand, and the earth
was covered by light grass. The Amber Galileo LWR IR sensor, which operates in the 8 ~ 9 µm band, was used.
The images were taken from a long distance to provide a wide view. All conditions of this measurement
were quite close to an actual mine detection situation. Some undesirable texture, caused by irregular earth
components and grass, was expected.
Collector | Minefield Location | Soil Condition | Sensor Type
E-OIR | Ft. Belvoir, VA USA | Earth covered by light grass | Amber Galileo LWR (8-9µm)
Table 6-5 Site specifications of Ft. Belvoir test minefield [29]
Table 6-6 profiles the data specifications. This image sequence includes seven mine targets. The data set
consists of 96 images of size 222×140, taken at 15-minute intervals over a 24-hour period.
Number of Targets | Date and Time | Number of Frames
7 | April 13, 15:00 ~ April 14, 14:45, 1998 | 96 (1 per 15 minutes)
Table 6-6 Data specifications by the Amber LWR sensor at the earth field covered by grass [29]
Figure 6-19 Sampled images from the data set acquired with the Amber LWR sensor at the earth field in Ft. Belvoir
[29]; images taken at (a) 15:00, (b) 0:45, (c) 05:45
Mine Type | Position | Condition
M-15 | 57,30 | Buried
M-19 | 129,36 | Buried
PGMDM | 181,22 | Surface
RAAM | 212,22 | Surface
FFV-028 | 65,89 | Buried
TM-62 | 121,89 | Buried
VS 1.6 | 194,92 | Buried
Table 6-7 Ground truth data of targets [29]
Figure 6-19 shows sampled images, and Table 6-7 presents the ground truth data. Except for the RAAM, every
target is an ATM. Two targets, the PGMDM and the RAAM, were laid on the surface, and the other targets were
buried underground. Figure 6-19 (a) marks the locations of the mines as white pixels. The mines are laid in two
lines. Four mines are laid on the first line: M-15, M-19, PGMDM, and RAAM, counting from left to right. Three
mines are laid on the second line: FFV-028, TM-62, and VS 1.6, counting from left to right. These mines are
named A, B, C, D, E, F, and G, in that order. As listed in Table 6-7, targets C and D were laid on the surface,
while the other five targets were buried underground.
Because the images were taken from a long distance and the targets appear very small, the disturbance of the
soil thermal condition around the mines is expected to be stronger than the signal of the mine itself.
Figure 6-20 Procedure for the 1st feature; (a) 1st transformed image by KLT, (b) filtered image by the white ASF
with SE sized from 3×3 to 19×19, (c) inside and outside markers
Figure 6-21 Procedure for the 2nd feature; (a) 2nd transformed image by KLT, (b) filtered image by the white ASF
with SE sized from 3×3 to 11×11, (c) inside and outside markers
Figure 6-22 Procedure for the 3rd feature; (a) 3rd transformed image by KLT, (b) filtered image by the white ASF
with SE sized from 3×3 to 9×9, (c) inside and outside markers
Figure 6-23 Procedure for the 4th feature; (a) 4th transformed image by KLT, (b) filtered image by the white ASF
with SE sized from 3×3 to 11×11, (c) inside and outside markers
More feature data are necessary in this case, because the target signal is weaker than in the other cases and
the number of expected targets is high. Four feature images are extracted from the first four transformed images
of KLT. After KLT application, linear stretching was used in the previous cases, but histogram equalization,
introduced in Section 4.4.2, was used in this case to obtain more robust features. Histogram equalization is
sometimes useful for enhancing a weak feature against the background. The results of KLT and histogram
equalization are shown in panel (a) of Figure 6-20 to Figure 6-23.
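The difference between the two contrast enhancements can be illustrated with a small sketch. It uses the textbook definitions of linear stretching and histogram equalization on 8-bit data. The function names and the tiny test image are assumptions for this example, and Section 4.4.2 remains the reference for the exact form used in the experiments.

```python
import numpy as np

def linear_stretch(img, levels=256):
    """Map the full intensity range of img linearly onto 0..levels-1."""
    img = np.asarray(img, dtype=np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros(img.shape, dtype=np.uint8)
    return np.round((img - lo) / (hi - lo) * (levels - 1)).astype(np.uint8)

def histogram_equalize(img8, levels=256):
    """Remap gray levels through the cumulative histogram of an 8-bit image."""
    hist = np.bincount(img8.ravel(), minlength=levels)
    cdf = np.cumsum(hist) / img8.size
    lut = np.round(cdf * (levels - 1)).astype(np.uint8)   # lookup table per level
    return lut[img8]

# Hypothetical feature image: a weak signal (values 10..13) and one strong outlier.
feature = np.array([[10, 11, 12, 13],
                    [10, 11, 12, 200]])
stretched = linear_stretch(feature)
equalized = histogram_equalize(stretched)

weak_range_stretched = int(stretched[0, 3]) - int(stretched[0, 0])
weak_range_equalized = int(equalized[0, 3]) - int(equalized[0, 0])
```

With a single strong outlier, linear stretching leaves the weak values compressed into a few gray levels, whereas equalization spreads them over a much wider range. This is consistent with the remark above that equalization can help a weak feature stand out from the background.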
The actual shapes of the mine targets can be seen in Figure 6-21 (a). In the area of target A, on the first line,
there is a black circle inside a white circle. The black circle indicates the actual size of target A through the
volume effect. The white circle is the area affected by the surface effect. Usually both effects are useful in
finding mines, but the surface effect cannot occur for targets laid on the surface.
Critical filtering, the white ASF with SE sized up to 19×19, was applied to the first feature. The five dark areas
caused by the surface effect are clearly noticeable in Figure 6-20 (a). Targets C and D do not affect the thermal
condition of the soil because they are laid on the ground. In order to extract the relatively small target signal
from targets C and D, the white ASF with the SE sized up to 9×9 or 11×11 was applied to the other features. The
results are shown in panel (b) of Figure 6-20 to Figure 6-23.
Depending on whether the expected target signal is positive or negative, the regional maxima are extracted as
inside markers for the first and fourth features, and the regional minima for the second and third features.
Outside markers are extracted from the inside markers by the flooding process. The markers are presented in
panel (c) of Figure 6-20 to Figure 6-23.
Figure 6-24 Results from the 1st feature; (a) segmented result, (b) labeled set, (c) selected candidates
Figure 6-25 Results from the 2nd feature; (a) segmented result, (b) labeled set, (c) selected candidates
Figure 6-26 Results from the 3rd feature; (a) segmented result, (b) labeled set, (c) selected candidates
Figure 6-27 Results from the 4th feature; (a) segmented result, (b) labeled set, (c) selected candidates
Watershed processing was performed for each feature, and the results are presented in panel (a) of Figure 6-24
to Figure 6-27. The segmented sets were labeled as shown in panel (b) of Figure 6-24 to Figure 6-27. Panel (c)
of Figure 6-24 to Figure 6-27 shows the selected sets of candidates. Because of the high number of expected
targets and feature images, many candidates were selected, but many of them appear to be false alarms. Four
sets of candidates are given: the first set has 6 candidates, the second 8, the third 20, and the fourth 10.
Since so many candidates are indicated in this case, we did not name each candidate but show the results in a
table. Table 6-8 presents the distribution of candidates over the targets and false alarms.
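Targets | A | B | C | D | E | F | G | False | Total
1st | O | O | X | X | O | O | O | 1 | 6
2nd | O | O | X | X | O | O×2¹ | O | 2 | 8
3rd | O | O | O | O | X | O | X | 15 | 20
4th | O | O | O | O | O | O | X | 4 | 10
Total | 4 | 4 | 2 | 2 | 3 | 4 | 2 | 22 | 44
Probability | 4/4 | 4/4 | 2/4 | 2/4 | 3/4 | 4/4 | 2/4 | 1/4 |
Decision (2/4) | O | O | O | O | O | O | O | X | 4
Table 6-8 Distribution of Candidates for Case Study 4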
The buried targets A, B, E, and F appeared very frequently, but the surface-laid targets C and D appeared only
twice. It may sound strange that a visible surface-laid target is harder to notice than a buried one, but that is
what happened in this case. It is caused by the lack of surface effect. The two signals captured from each
surface target come from direct IR radiation from the surface of the mine itself.
Figure 6-28 The mine targets; (a) selected mines without false alarm, (b) ground truth [29]
¹ In Figure 6-25 (c), the second and third candidates on the second line indicate the soil disturbance around
target F. Even though neither candidate presents the mine target directly, their signal was produced by target F,
so both are counted as target F.
Of the 44 candidates, 22 turned out to be false candidates: they were selected as candidates but did not
correspond to actual mine targets. Every false candidate occurred at its location in only one of the four feature
images. If the decision level of probability is set to 2/4, every mine target is selected and no false alarm
occurs. The selected mine targets are shown in Figure 6-28 (a). Matching them against the ground truth confirms
the successful location of all 7 mine targets. This is a very good result.
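The decision rule can be reproduced from the entries of Table 6-8 with a short sketch. The presence flags and the false-candidate counts are taken from the table. The function name `decide`, the dictionary layout, and the labels given to the false candidates are assumptions for illustration.

```python
from fractions import Fraction

# Presence of each target in the 1st..4th feature (1 = O, 0 = X), from Table 6-8.
target_hits = {
    "A": [1, 1, 1, 1],
    "B": [1, 1, 1, 1],
    "C": [0, 0, 1, 1],
    "D": [0, 0, 1, 1],
    "E": [1, 1, 0, 1],
    "F": [1, 1, 1, 1],   # the two candidates in the 2nd feature count once
    "G": [1, 1, 0, 0],
}
false_per_feature = [1, 2, 15, 4]   # false candidates in the 1st..4th feature

def decide(hits, threshold):
    """Keep every location whose detection ratio reaches the threshold."""
    return sorted(name for name, row in hits.items()
                  if Fraction(sum(row), len(row)) >= threshold)

# Each false candidate appears at its location in exactly one feature image.
hits = dict(target_hits)
for feature_index, count in enumerate(false_per_feature):
    for k in range(count):
        row = [0, 0, 0, 0]
        row[feature_index] = 1
        hits[f"false_{feature_index + 1}_{k + 1}"] = row   # hypothetical label

selected = decide(hits, Fraction(2, 4))
```

At the 2/4 level the seven targets are kept and all 22 false candidates are rejected. Raising the level to 3/4 would keep only A, B, E, and F, so the threshold trades missed surface-laid mines against false alarms.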
Chapter 7. Conclusions
Two areas have been studied in the context of mine detection: sensor technology and image processing. The
image processing area has received more attention in this report.
Various image-processing methods for mine detection have been studied, such as filtering, feature extraction,
morphology, contrast enhancement, segmentation, and visualization. Also, a method to find a mine target from
multiple candidates has been proposed. All methods were tested with actual mine data.
The most serious problem in mine detection applications is the ambiguity of the target signal. To overcome
this ambiguity, most research has been conducted in two ways. Firstly, methods to extract multiple signals from
a source, or to enhance the ambiguous signal to a noticeable level, have been studied. Among the image
processing methods introduced in this report, filtering, feature extraction, contrast enhancement, and
visualization are examples. Secondly, many research groups have developed new detection devices with multiple
sensors, an approach called sensor fusion. Of course, the development of new kinds of sensors is another
possibility.
Since improvement in the image processing level is limited unless the sensor provides good information about
the target, the development of an improved detection device should be done prior to implementing the improved
image processing method. The current trend of the next generation of detection devices is towards an armored
vehicle or a portable unit with multiple sensors.
Future work in the image processing area will also involve fusion. A global method, able to accept data from
multiple sensors and to visualize them under a single concept, will be the next generation of image processing
methods for mine detection applications. For example, one solution would be to accept target signals from IR,
MD, and GPR, analyze all the sensor data, and then visualize the actual shape of the target on the screen.
Although some of the image-processing methods referred to in this report are closely tied to a particular
sensor, most can be used with future-generation sensors, which will appear soon or have already appeared.
Chapter 8. References
1. UN Landmine Database in the UN Mine Action Service; http://www.un.org/Depts/dpko/mine.
2. “Hidden Killers 1998: The Global Landmine Crisis,” US Department of State, Bureau of Political-Military
Affairs, Office of Humanitarian Demining Programs, Sep. 1998.
3. A. Sieber, “Localization and Identification of Anti-Personnel Mines,” European Commission Joint Research
Center International Workshop, Nov. 1995.
4. Landmine Database of the Norwegian Peoples Aid Mine Actions in Angola; http://www.angola.npaid.org/.
5. L. Kempen, “Physical Principles for Anti-Personnel Mine Detection: a Survey of Three Sensing Principles,”
Technical Report, IRIS-TR-0047, Dept. of Electronics and Information Processing, Vrije Universiteit
Brussel, May 1997.
6. R. Ekstein, “Anti-Personal Mine Detection Signal Processing and Detection Principles,” Master Thesis,
Dept. of Electronics and Information Processing, Vrije Universiteit Brussel, 1997.
7. L. Kempen and H. Sahli, “Ground Penetrating Radar Data Processing: a Selective Survey of the State of the
Art Literature,” Technical Report, IRIS-TR-0060, Dept. of Electronics and Information Processing, Vrije
Universiteit Brussel, Jan. 1999.
8. M. Acheroy, M. Piette, Y. Baudoin, and J. Salmon, “Belgian Project on Humanitarian Demining (HUDEM)
Sensor Design and Signal Processing Aspects,” Jul. 2000.
9. J. Brooks, L. Kempen, and H. Sahli, “Ground Penetration Radar Data Processing: Clutter Characterization
and Removal,” Technical Report, IRIS-TR-0059, Dept. of Electronics and Information Processing, Vrije
Universiteit Brussel, Jan. 1999.
10. L. Kempen, A. Katarzin, Y. Pizurion, C. Corneli, and H. Sahli, “Digital Signal/Image Processing for Mine
Detection, Part 2: Ground based Approach,” in Proceedings Euro Conference on Sensor Systems and Signal
Processing Techniques applied to the Detection of Mines and Unexploded Ordnance, pp. 54-59, Oct. 1999.
11. G. Ederra, “Mathematical Morphology Techniques Applied to Anti-Personnel Mine Detection,” Master
Thesis, Dept. of Electronics and Information Processing, Vrije Universiteit Brussel, 1999.
12. C. Bruschini and B. Gros, “A Survey of Current Sensor Technology Research for the Detection of
Landmines,” in Proceedings the International Workshop on Sustainable Humanitarian Demining, vol. 6, pp.
18-27, Sep. 1997.
13. L. Kempen, M. Kaczmarec, H. Sahli, and J. Cornelis, “Dynamic Infrared Image Sequence Analysis for
Anti-Personnel Mine Detection,” in Proceedings IEEE Benelux Signal Processing Chapter, Signal Processing
Symposium, pp. 215-218, Mar. 1998.
14. L. Peters Jr., J. Daniels, and J. Young, “Ground Penetrating Radar as Subsurface Environmental Sensing
Tools,” in Proceedings IEEE International Conference, vol. 82, no. 12, pp. 1802-1822, Dec. 1994.
15. J. E. McFee and Y. Das, “Advances in the Location and Identification of Hidden Explosive Munitions,”
Defense Research Establishment Suffield, no. 548, pp. 83, Feb. 1991.
16. K. Russell, J. McFee, and W. Sirovyak, “Remote Performance Prediction for Infrared Imaging of Buried
Mines,” in Proceedings SPIE Detection and Remediation Technologies for Mines and Minelike Targets II,
vol. 3079, pp. 762-769, 1997.
17. Thermal Neutron Analysis, Ancore Inc.; http://www.ancore.com.
18. R. Gonzalez and R. Woods, Digital Image Processing, Addison Wesley, 1992.
19. S. Theodoridis and K. Koutroumbas, Pattern Recognition, Academic Press, 1998.
20. H. Heijmans, Morphological Image Operators, Academic Press, 1994.
21. S. Beucher and C. Lantuejoul, “Use of Watershed in Contour Detection,” in Proceedings International
Workshop on Image Processing: Real Time Edge and Motion Detection and Estimation, Sep. 1979.
22. S. Beucher, “The Watershed Transformation applied to Image Segmentation,” in Proceedings 10th
Conference on Signal and Image Processing in Microscopy and Microanalysis, Sep. 1991.
23. E. Dougherty, Mathematical Morphology in Image Processing, Marcel Dekker, 1992.
24. J. Roerdink and A. Meijster, “The Watershed Transform: Definitions, Algorithms, and Parallel Strategies,”
Fundamenta Informaticae, vol. 41, pp. 187-228, 2000.
25. P. Verlinde, M. Acheroy, and Y. Baudoin, “The Belgian Humanitarian Demining Project (HUDEM) and the
European Research Context,” in Proceedings Chiba University Workshop on Humanitarian Demining, Apr.
2001.
26. P. Machler, “Detection Technologies for Anti-Personnel Mines,” in Proceedings Symposium on
Autonomous Vehicles in Mine Countermeasures, vol. 6, pp. 150-154, Apr. 1995.
27. M. Schachne, L. Kempen, D. Milojevic, H. Sahli, Ph. Ham, M. Acheroy, and J. Cornelis, “Mine Detection
by Means of Dynamic Thermography: Simulation and Experiments,” in Proceedings IEE 2nd International
Conference on the Detection of Abandoned Landmines, pp. 124-128, Oct. 1998.
28. UWBGPR measurement at the Royal Military Academy, Belgium on May 31, 1999.
29. Ft. Belvoir Minefield in Virginia, USA. Collected by E-OIR on Jan. 13-14, 1998.
30. Meerdaal Test Minefield in Belgium. Collected by the Royal Military Academy on Apr. 1-3, 1998.