Surface Detection and Object Recognition in a Real-Time
Three-Dimensional Ultrasonic Imaging System
By
Daniel Charles Letzler
Submitted to the Department of Electrical Engineering and Computer Science in Partial
Fulfillment of the Requirements for the Degree Master of Engineering in Electrical
Engineering and Computer Science
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
July 20, 1999
© 1999 Daniel C. Letzler. All rights reserved.
The author hereby grants to MIT permission to reproduce and distribute publicly paper
and electronic copies of this thesis and to grant others the right to do so.
Author
Department of Electrical Engineering and Computer Science
July 20, 1999
Certified by
Dan Dudgeon
Thesis Supervisor
Accepted by
Arthur C. Smith
Chairman, Department Committee on Graduate Theses
Surface Detection and Object Recognition in a Real-Time
Three-Dimensional Ultrasonic Imaging System
By
Daniel Charles Letzler
Submitted to the Department of Electrical Engineering and Computer Science
July 20, 1999
In Partial Fulfillment of the Requirements for the Degree Master of Engineering in Electrical
Engineering and Computer Science
Abstract
A real-time three-dimensional acoustical imager is currently under development at Lockheed Martin IR Imaging Systems. This thesis first presents a brief overview of the hardware and software components of this system. Second, an algorithm capable of performing real-time, software-based surface detection from the acoustical data on a set of digital signal processing chips is presented. Third, the similarities and differences between the not-yet-operational acoustical imager and an experimental system at Lockheed Martin are explored. Fourth, an object recognition prototype is developed using data from this experimental system and found to have a great deal of discriminatory power. Based upon the comparison and contrast of the imager and the experimental system, it is then asserted that the acoustical imager should be capable of similar performance. Finally, suggestions for the implementation of such an object recognition system on an acoustical imager are presented.
Thesis Supervisor: Dan Dudgeon
Title: Senior Staff Member, Massachusetts Institute of Technology Lincoln Laboratory
Table of Contents

1 Introduction
1.1 Acoustical imaging
1.2 Thesis overview
2 System components
2.1 Acoustical lens
2.2 Transducer Hybrid Assembly
2.3 Acoustical Imaging Module
2.4 Image Interface Board
2.5 Digital Signal Processor Image Processing Board
2.6 Liquid crystal display screen
2.7 Host computer
3 System software description
3.1 Common framework
3.2 Communications
3.3 Data acquisition processor
3.4 Peak detection processors
3.5 Video display processor
4 Surface detection algorithm
4.1 Introduction
4.1.1 Brief algorithm description
4.1.2 Algorithm justification
4.2 Background
4.2.1 Terminology and notational conventions
4.2.2 Definition of a peak
4.2.3 Expected form of data for peak detecting DSP chips
4.2.4 Memory requirements for each of peak detecting DSP chips
4.2.5 Potential first peak data structure
4.3 ADSP-21060 SHARC hardware considerations
4.4 Required preprocessing
4.4.1 Requirements
4.4.2 Expected method of preprocessing
4.4.3 Alternate preprocessing by the input DSP chip
4.5 The peak detection algorithm
4.5.1 English description of the algorithm
4.5.2 Block diagram of the algorithm
4.5.3 Illustration of the algorithm
4.6 Proof of algorithm correctness
4.7 Analysis of computational burden of algorithm
4.8 Analysis of communications burden of algorithm
4.9 Conclusion
5 Analysis of data gathered by the Acoustical Imaging System
5.1 Introduction
5.2 Assumption of linear time invariant system performance
5.3 System dynamic range analysis
5.4 Experimental set-up: explanation, justification, and relation to AIS
5.4.1 Data acquisition
5.4.2 Target selection
5.4.3 Target presentation
5.4.4 Experimental set-up justification
5.4.5 Relation of experimental system to Acoustical Imaging System
5.5 Conclusion
6 Object recognition feasibility study
6.1 Introduction
6.2 Dependence of acquired data on viewpoint
6.3 Object recognition feature selection
6.3.1 Material-based classification features
6.3.1.1 Thick-shelled targets
6.3.1.2 Thin-shelled targets
6.3.1.3 Cylindrical targets
6.3.2 Structure-based classification features
6.3.2.1 Thick-shelled targets
6.3.2.2 Thin-shelled targets
6.3.2.3 Cylindrical targets
6.3.3 Frequency domain-based classification features
6.3.3.1 Thick-shelled targets
6.3.3.2 Thin-shelled targets
6.3.3.3 Cylindrical targets
6.3.4 Summary of target classification criteria
6.4 Object recognition prototype presentation
6.5 Object recognition prototype performance
6.5.1 OR performance: closed problem space, highly controlled orientation, no noise added
6.5.2 OR performance: closed problem space, highly controlled orientation, Gaussian noise added
6.5.3 OR performance: closed problem space, loosely controlled orientation, no noise added
6.5.4 OR performance: open problem space, highly controlled orientation, no noise added
6.6 Conclusion
7 Object recognition implementation suggestions
7.1 Introduction
7.2 Sample images
7.3 Target identification
7.4 Use of an ensemble of time series to improve object recognition
7.5 Image-level object recognition
7.6 Weighting of classification features
7.7 Acoustical Imaging System design suggestions
7.8 Conclusion
8 Conclusion
Appendix
A. List of acronyms
B. Source code of prototype peak detection algorithm
C. Peak detection algorithm computational burden determination
D. Table of test objects used
E. Equation-based statement of each of the discriminatory tests used
F. Matlab source code for object recognition prototype
References
List of Figures

Figure 2-1: System electronics block diagram
Figure 2-2: DSP Image Processing Board
Figure 4-1: Impact of noise on the surface detection algorithm performance
Figure 4-2: Illustration of acoustical time series
Figure 4-3: Hypothetical acoustical time series #1
Figure 4-4: Hypothetical acoustical time series #2
Figure 4-5: Hypothetical acoustical time series #3
Figure 4-6: Block diagram of the peak detection algorithm
Figure 4-7: Illustration of the peak detection algorithm
Figure 4-8: Worst case computational burden of peak detection v. processing block size
Figure 4-9: Computational burden of peak detection v. maximum examination window
Figure 5-1: Signal path of acoustical energy
Figure 5-2: Typical bistatic acoustical imaging signal levels
Figure 5-3: Drawing of target suspension apparatus
Figure 5-4: The dependence of scattered intensity on the wavenumber
Figure 5-5: Frequency content of recorded piston transducer acoustical pulse
Figure 5-6: Receive transfer function of AIS from PiezoCAD Transducer Design Report
Figure 6-1: Acoustical backscatter recordings from object #9 with vertical translations
Figure 6-2: Acoustical backscatter recordings from object #9 with horizontal rotations
Figure 6-3: Acoustical backscatter recordings from object #19 with vertical translations
Figure 6-4: Acoustical backscatter recordings from object #19 with horizontal rotations
Figure 6-5: Illustration of changing appearance of target with vertical translations
Figure 6-6: Example acoustical returns from each of the three classes of objects
Figure 6-7: Example recordings from thick-shelled PVC, aluminum, and brass pipes
Figure 6-8: Schematic illustration of the material determination statistic computation
Figure 6-9: Structure-based features in an example thick-shelled acoustical recording
Figure 6-10: Structure-based features in an example thin-shelled acoustical recording
Figure 6-11: Structure-based features in an example cylindrical acoustical recording
Figure 6-12: A typical transfer function estimate
Figure 6-13: Flow diagram of the object recognition prototype program
Figure 6-14: Histograms showing typical best and second best scoring template matches
Figure 7-1: Sample images taken from a precursor of the Acoustical Imaging System
List of Tables

Table 6-1: Objects used for orientation-dependence determination
Table 6-2: Pipe wall thickness required to be thick-shelled
Table 6-3: Third to second reflection ratio statistics for several materials
Table 6-4: Summary of class-specific object recognition criteria
Table 6-5: Confusion matrix for the OR prototype
Table 6-6: Accuracy of the object recognition prototype at varying levels of added noise
Table 6-7: Confusion matrix for the OR prototype with 0.01 std noise added to data
Table 6-8: Confusion matrix for the OR prototype with 0.02 std noise added to data
Table 6-9: Confusion matrix for the OR prototype for loosely controlled orientation data
Table 6-10: Confusion matrix for the OR prototype in an open problem space
1 Introduction
1.1 Acoustical imaging
A great deal of recent work has been devoted to the development of high-resolution three-dimensional ultrasonic systems for short-range imaging applications. Despite the limited resolution of high-frequency (in the low MHz range) ultrasonic systems as compared to optical systems, ultrasonic systems possess the advantage of having significant penetrative range through many translucent and opaque media that render optical systems useless. Medical ultrasound is a well-known application of ultrasonic imaging that takes advantage of this penetrative capability into human tissue. Further, ultrasonic imagers are relatively uninhibited by murky waters, which severely limit the performance of their optical counterparts [1]. Therefore, diver-held high-resolution underwater ultrasonic imagers have been developed for such applications as underwater search and rescue, pipe and ship hull examination, and mine detection [1]. Lockheed Martin IR Imaging Systems in Lexington, MA is currently developing such an imager. It is referred to as the Acoustical Imaging System (AIS).¹
1.2 Thesis overview
The body of this thesis is divided into six chapters.
Chapters 2 and 3 are primarily
background material. Chapter 2 provides an overview of the hardware used by Lockheed
Martin's AIS. Similarly, chapter 3 is devoted to the AIS software, particularly that which
runs on a set of digital signal processing chips.
Chapter 4 discusses the surface detection algorithm used by the AIS.
This algorithm,
which is performed in software by a set of digital signal processing chips, sorts through
the acoustical return data to extract the surfaces of imaged objects. A brief rationale for
the algorithm is presented. Next, features of the digital signal processor hardware that
impacted the algorithmic implementation are discussed. The algorithm is then presented
in detail, and a proof is presented regarding its correctness. Finally, the computational
¹ Throughout the course of this thesis, many acronyms will be used. For an alphabetized listing of these acronyms, see appendix A.
and communications burdens created by the algorithm are analyzed and found to be well
within the performance capabilities of the digital signal processors used.
An analysis of the data that will be gathered by the Acoustical Imaging System and a
comparison of this data to data available through an experimental system at Lockheed
Martin is presented in chapter 5. Although the AIS and realistic targets could not be used
for the experimental work regarding object recognition presented in chapter 6, this
chapter is intended to show that the conclusions reached with the experimental system
and targets which were available can be applied to the AIS. An explanation of why the
AIS can be considered to be a linear, time-invariant system is presented first. Next, the
dynamic range of the AIS is shown to exceed that used by the experimental system. The
methods of data acquisition and target presentation in the experimental system are then
presented. Also, the target selections are presented. Finally, reasons for the applicability
of the experimental system and the targets used to the AIS imaging real-world targets are
offered, and specific differences are discussed.
Chapter 6 presents the results of an object recognition feasibility study performed with
the experimental system and test targets. Because the AIS was unavailable for this work
and the experimental system is composed of only a single piston transducer, all of the
object recognition work deals with only acoustical time-series data, and not image data.
The experimental data is shown to have some degree of viewpoint dependence. Due to
differences in the transducers, this viewpoint dependence may or may not hold for data
gathered by the AIS. Features that allow discrimination amongst the test targets are then
presented. These fall into three basic categories: material-based, structure-based, and
frequency domain-based features. Further, different features are found to be important
for different types of targets, and categories of targets are created, each of which
possesses category-specific features.
The prototype object recognition system is then
presented. Training data of known identity is used to build a series of templates. The
prototype object recognition system then uses these templates to determine the identity of
an unknown target by gauging the similarity of the target to each template. Next, the
performance of the object recognition prototype on a set of test data is assessed. Four
cases are examined: a closed problem space with highly controlled viewpoint and no
noise added, a closed problem space with highly controlled viewpoint and noise added, a
closed problem space with less highly controlled viewpoint and no noise added, and
finally an open problem space with highly controlled viewpoint and no noise added. The
performance of the prototype in all cases is shown to be excellent. Finally, it is asserted
that given the success of the object recognition prototype, implementation of an object
recognition system on the AIS appears viable.
Chapter 7 briefly presents suggestions for the implementation of an object recognition
system on the AIS. As a first step, two sample images from a precursor of the AIS are
presented. These images are used to motivate a discussion of how regions of interest may
be identified in an acoustical image. These regions of interest could possibly be a known
object, and thus should undergo the object recognition algorithm. The use of an ensemble
of time series from the region of interest to improve recognition power is then discussed.
Next, incorporation of image-level characteristics into a recognition algorithm is
presented. Finally, different classification features were found during the creation of the
object recognition prototype to have varying amounts of discriminatory power across the
set of known objects. It is argued that by adding a step to the training process in which object-specific weights are assigned to each classification feature, overall recognition performance can be improved.
2 System components
The Acoustical Imaging System can be thought of as consisting of seven subsystems,
each of which will be described briefly in this section. There are six subsystems in
the signal flow path of the system.
The first of these is an acoustical lens, which
eliminates the need for digital beam-forming electronics. The second is the Transducer
Hybrid Assembly (THA), which provides the system's transmit and receive capabilities.
Following the THA is the Acoustical Imaging Module (AIM), which controls the transmit
and receive timing, and performs the digitization of the received signals. Next is the
Image Interface Board (IIB). The IIB receives a serial stream of digital data from the
AIM, formats this data for presentation to the signal processing electronics, and performs
some data preprocessing.
Located fifth in the signal pathway is the Digital Signal
Processor (DSP) Image Processing Board, which detects the surfaces in the received data
to create a range map of the imaged environment and creates the display for the user.
Finally, the last component in the signal pathway is a liquid crystal display (LCD) from
which the user views the image of the surrounding environment. Figure 2-1 below shows
the interconnections of these subsystems.
Additionally, there is a host computer
associated with the system that coordinates the actions and sets the operating modes of
each of the other subsystems.
Figure 2-1: System Electronics Block Diagram (adapted from [2])
2.1 Acoustical lens
While there has been some interest from the medical community in real-time three-dimensional ultrasound (3DUS), it is particularly important that an ultrasonic imager for diver-held underwater applications be capable of providing real-time range-based representations of the objects in the surrounding environment. The requirement of real-time three-dimensional (3D) imaging, however, is at odds with another requirement on any diver-held device - that the device be of a manageable size and shape. Traditional ultrasonic imagers require extensive signal processing electronics to perform digital beamforming [3]. Signal processing electronics imply power consumption, and this power must be provided by a battery. With today's battery technology, the amount of signal processing that would be necessary to perform the digital beamforming for a real-time three-dimensional ultrasonic imager with a reasonable field of view (at least on the order of 1,000,000 volume elements) would require a prohibitively large battery [3]. An alternate approach that has been developed to avoid an exceptionally cumbersome battery is an acoustical lens. An acoustical lens eliminates the need for digital beamforming and requires no power, thus allowing a real-time 3D ultrasonic imager to be compactly packaged for use by an underwater diver [3]. The acoustical lens of the AIS was designed and fabricated at Lockheed Martin IR Imaging Systems.
2.2 Transducer Hybrid Assembly
The Transducer Hybrid Assembly (THA) is composed of a piezoelectric composite
transducer array of 128 by 128 elements and a transmit/receive integrated circuit (TRIC).
When pressure is incident upon an element of the piezoelectric composite transducer
array, a voltage is produced that is proportional to the level of the incident pressure.
Associated with each element of the transducer array is a miniature circuit containing
capacitors that can be used to capture samples of the voltage level at different times.
Further, there is circuitry in the TRIC that allows the values in the capacitors associated
with each transducer element to be read out and transmitted to the Acoustical Imaging
Module (AIM).
Likewise, the piezoelectric transducer elements are capable of
transmitting acoustical energy when driven with a voltage.
2.3 Acoustical Imaging Module
The Acoustical Imaging Module (AIM) is responsible for digitizing the signals it receives
from the THA. The digitized signals are then transmitted over a high-speed serial link to
the Image Interface Board. Further, the AIM is responsible for providing the TRIC with
its controls from the host computer.
2.4 Image Interface Board
The Image Interface Board (IIB) collects the digitized samples from the AIM and reorders them to be in a traditional raster scan line format. Further, it performs gain and offset correction on each of the samples that is recorded based on tables that have been downloaded from the host computer. Additionally, the THA performs quadrature sampling on the acoustical data. In this mode of sampling, four samples are collected in a coherent fashion with respect to the central transmit frequency. The IIB transforms these four quadrature samples into a single magnitude value. Finally, the IIB preprocesses the data in a way that allows the DSP Image Processing Board to ignore data in which there is no useful information.
2.5 Digital Signal Processor (DSP) Image Processing Board
The Digital Signal Processor (DSP) Image Processing Board performs the surface
detection, image formation, and image processing tasks for the Acoustical Imaging
System. As mentioned before, the DSP Image Processing Board accepts raster scan data
from the IIB. It then processes this data to detect the range of the first reflection at every
pixel. Alternatively, if there is no reflection present at a particular pixel, the DSP Image
Processing Board notes this occurrence. The range associated with each pixel then serves
as the range map of the imaged environment. The 128 x 128 element acoustical image is
expanded via bilinear interpolation to form a 256 x 256 image. Overlay and formatting
information provided by the host computer are then incorporated to form a 640 x 480
display. This 640 x 480 display is then transmitted to the LCD in RS-170 format.
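As an illustration of the expansion step, the following C sketch performs the 2X bilinear interpolation from a 128x128 range map to a 256x256 image. The function and array names, and the edge-clamping behavior, are illustrative assumptions rather than the AIS implementation.

#include <stdint.h>

#define SRC 128
#define DST 256

/* Expand a 128x128 range map to 256x256 by 2X bilinear interpolation.
   Even output rows/columns copy source pixels; odd ones average the
   two nearest source pixels along that axis. Edges are clamped. */
void upsample2x(const uint16_t in[SRC][SRC], uint16_t out[DST][DST])
{
    for (int r = 0; r < DST; r++) {
        for (int c = 0; c < DST; c++) {
            int r0 = r / 2, c0 = c / 2;
            int r1 = (r0 + 1 < SRC) ? r0 + (r & 1) : r0;
            int c1 = (c0 + 1 < SRC) ? c0 + (c & 1) : c0;
            out[r][c] = (uint16_t)((in[r0][c0] + in[r1][c0] +
                                    in[r0][c1] + in[r1][c1]) / 4);
        }
    }
}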
Six Analog Devices 21060 SHARC digital signal processors are used by the DSP Image
Processing Board sub-system. Each member of the network of six Analog Devices 21060
Digital Signal Processing chips will be a member of one of three classes. While the
framework of the software running on each class is the same, the functions performed by
them differ greatly. First, there is a single data acquisition (DAQ) DSP chip. Second,
there are four peak detection (PKD) DSP chips.
Each one of these four chips will
perform the surface extraction algorithm for the data from one quarter of the pixel array.
Finally, there is a single display (DIS) DSP chip. The chips are connected in a ring-like
manner, with two parallel data paths through the PKD chips. The network topology,
along with its interrelation to the other system components, is shown below in figure
2-2.
Figure 2-2: DSP Image Processing Board network topology and interrelation to other system
components.
2.6 Liquid crystal display (LCD) screen
Planar Systems' LC640.480.21-065 is the liquid crystal display which will be used in the
Acoustical Imaging System. It is a 6.5" high performance color LCD.
2.7 Host computer
The host computer used in the Acoustical Imaging System is Ampro Computers'
CoreModule/4Dxe single board computer. This single board computer is a 486DX-based
PC/AT compatible system.
3 System software description
Because of the reliance that has been placed upon the host computer to perform the subsystem coordination and upon a set of digital signal processing chips to perform much of the
data analysis, the Acoustical Imaging System developed by Lockheed Martin IR Imaging
Systems is very software intensive. Particularly, the functionality of the entire Acoustical
Imaging System depends heavily upon the software running on the set of DSP chips on
the DSP Image Processing Board. For this reason, the software running on the DSP chips
will be outlined in this section.
3.1 Common framework
As alluded to previously, the six DSP chips in the system will run three different program
sets. The software running on the data acquisition, peak detection, and video display
chips, however, shares a common framework.
Data is passed through the system in the form of messages. Messages are routed and
processed based upon two tables: a message routing table and a system event table.
The
entries in the message routing table are destinations for a given message in the current
processor. A destination can be either a queue associated with a communications link,
which will export the message to another processor for handling, or a queue associated
with a function, which will process the contents of the message. The entries in the system
event table are all of the functions that the processor can perform in its current operating
mode. There are three main operating modes: initialize, operate, and terminate. The
background task of each DSP chip cycles through the functions listed in the system event
table for the current operating mode, calling each one sequentially.
In general, the
functions listed in the system event table for the initialize and terminate operating modes
each execute one time before, respectively, the operating mode is updated to operate or
the Acoustical Imaging System ceases operation. The functions listed in the system event
table for the operate operating mode generally have a queue associated with them. As
each of the operate system event table functions is called sequentially, it checks its queue
to see if any messages await processing. If so, the message in the queue is examined, and
its data is processed as necessary. After examination and processing, the function alters
the message to indicate what has occurred, and then routes the message to its next
destination based upon the information in the message routing table. If no messages need
processing, the next function in the system event table is called.
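A minimal C sketch of this background task is given below. The message and queue types, queue_pop, and the route_message callback are hypothetical stand-ins for the framework's actual queue and message routing table mechanisms, which are not reproduced here.

#include <stddef.h>

/* Hypothetical message and queue types for illustration only. */
typedef struct msg { int type; void *data; struct msg *next; } msg_t;
typedef struct { msg_t *head; } queue_t;

static msg_t *queue_pop(queue_t *q)      /* take the next waiting message */
{
    msg_t *m = q->head;
    if (m != NULL)
        q->head = m->next;
    return m;
}

/* One entry per function listed in the system event table. */
typedef struct {
    void   (*handler)(msg_t *m);   /* processes the message contents    */
    queue_t *queue;                /* queue the framework checks for it */
} event_entry_t;

/* Background task for the "operate" mode: call each system event table
   function in turn; a function runs only when a message awaits it, and
   afterwards the message is routed onward via the message routing table. */
void background_task(event_entry_t *table, int n,
                     void (*route_message)(msg_t *m))
{
    for (;;) {
        for (int i = 0; i < n; i++) {
            msg_t *m = queue_pop(table[i].queue);
            if (m != NULL) {
                table[i].handler(m);
                route_message(m);
            }
        }
    }
}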
3.2 Communications
Communication among various chips in the network occurs via interrupt-driven direct
memory access (DMA) data transfers. The use of DMA to perform the data transfers
means that very few core processor cycles are consumed by communication. For a data
export the core processor must specify only what block of data is to be transmitted and
out of which communications link the transfer is to occur. Similarly, for a data import the
core processor specifies how much data is to be received and where to store that data.
Following this specification, the core processor returns to normal operation while the
DMA controller carries out the transfer invisibly to the core. Upon completion of the
transfer, the DMA controller generates an interrupt to inform the core processor that the
transfer is complete.
Because the size of a message varies depending upon the amount of data contained in it,
the transfer of a message between DSP chips occurs in two stages. First, a message
header is transferred. This header specifies the type of message being transferred and the
amount of data in the message. Next, the message data is transferred.
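The fixed-size header is what makes the first-stage transfer possible even though message lengths vary. A hypothetical C layout of such a header might look like the following; the actual AIS header fields are not specified here beyond type and length.

#include <stdint.h>

/* First-stage transfer: a fixed-size header describing the message. */
typedef struct {
    uint32_t type;         /* the type of message being transferred        */
    uint32_t data_length;  /* amount of data in the second-stage transfer  */
} msg_header_t;

/* The receiver first imports sizeof(msg_header_t) bytes by DMA, then
   uses data_length to set up the DMA import of the message data itself. */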
To take advantage of the DMA capability of the system and make the communications
invisible to the background task running on each processor, interrupt handlers control the
communication over each port. Upon completion of the import of a message header, the
input DMA interrupt handler sets up the second stage of the message transfer - the DMA
import of the message data based upon the information in the header. When the message
data transfer completes, the input DMA interrupt handler routes the message just received
using the message routing table and then sets up the import of the next message header.
The data export communications links operate somewhat differently. Each export link
has a first-in first-out (FIFO) queue associated with it. Messages routed for export to the
DSP which the link connects are placed in the FIFO. The function that places them in the
FIFO checks to see if there is currently an export occurring. If not, it starts the transfer of
the message header automatically. If so, it simply leaves the message in the FIFO to be
exported automatically by the output DMA interrupt handler. Upon completion of the
export of a message header, the output DMA interrupt handler starts the export of the data
associated with that header. Next, upon completion of the transfer of message data the
output DMA interrupt handler looks at the export FIFO. If there are any messages in it,
the output handler grabs the first one and starts to export its header. If there are no
messages in the FIFO, the handler simply returns.
To summarize, the use of interrupt driven DMA message transfers allows the data to flow
through the system without burdening the core processor. With interrupt handlers driving
the communication process, the background process can be greatly simplified. To the
background process, messages appear on the queues for system event table functions with
no effort.
Further, when a message is ready to be transferred to another DSP, the
background process starts the export if no export is currently occurring over the
necessary communications link or places it in a queue if the link is currently busy. No
further effort is required by the background process.
3.3 Data acquisition (DAQ) processor
The data acquisition processor provides the interface of the DSP Image Processing Board
to both the Image Interface Board and the host computer. The interface of the DAQ to
the IIB is very similar to the interface used between the different DSPs in that data is
transferred over the same type of communications link.
However, the IIB has no
knowledge of the messages used within the DSP network.
Luckily, the data
transmissions from the IIB are always of the same size and the data is always ultrasonic
image data. Therefore, the DAQ always DMAs the IIB data into the data area of a
message and then applies a standard message header to that message.
The message containing the IIB data is then routed to a function which divides up the IIB data and places it into messages which will be routed to the proper peak detection processor.
The interface of the DAQ DSP to the host computer occurs through the use of a shared
memory space. Essentially, the host computer can read from and write to several of the
registers on the DAQ. One of the functions in the DAQ's system event table for the
operate operating mode polls this shared memory space and determines if the host has
placed an instruction there. If so, the DAQ acts upon the instruction, and then places a
return message in the shared memory space.
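A sketch of this polling function, written in C with a hypothetical mailbox layout, is shown below; the real shared-memory register map is not reproduced here, and the command encoding is an assumption.

#include <stdint.h>

/* Hypothetical shared-memory mailbox between the host and the DAQ. */
typedef struct {
    volatile uint32_t command;   /* written by the host; 0 means empty */
    volatile uint32_t response;  /* return message written by the DAQ  */
} host_mailbox_t;

/* One of the DAQ's "operate" system event table functions: poll the
   mailbox, act on any instruction found, and post a return message. */
void poll_host(host_mailbox_t *box, uint32_t (*execute)(uint32_t cmd))
{
    if (box->command != 0) {
        uint32_t cmd = box->command;
        box->command = 0;            /* mark the mailbox as empty again */
        box->response = execute(cmd);
    }
}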
3.4 Peak detection (PKD) processors
The peak detection processors perform the peak detection algorithm described in chapter
4 of this document. A discussion of this algorithm, which extracts the front surface of the
objects viewed from the ultrasonic data, will be deferred to chapter 4.
3.5 Video display (DIS) processor
The video display processor accepts surface range data from the PKD processors. The DIS processor performs several image processing/enhancement operations on the data before shipping it out over an RS-170 communications link to the LCD display. First, the
video display processor replaces the range data of any known defective pixels with the
median of the range data in the surrounding pixels. Second, the DIS processor attempts
to reduce image speckle by applying a median filter to the range data. At each pixel, the
median filter first computes the median range value of the surrounding pixels. Then, if
the current pixel's range value differs from this median value by more than some user
prescribed threshold, the current pixel's range value is replaced by the median value.
Third, the DIS processor applies a grayscale transformation to the range data at each
pixel. This step maps each one of the 80 range values to one of the 256 graylevel values available in a manner which will allow the user to get some sense of the three-dimensional shape of the imaged objects. Finally, a 2X interpolation is applied to the data in both the horizontal and vertical directions to expand the image from 128x128 to 256x256 for display.
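The speckle-reduction step can be sketched in C as follows. The 8-neighbor window and the border handling are assumptions for illustration; the thesis does not specify the exact neighborhood used.

#include <stdint.h>
#include <stdlib.h>

#define N 128

static int cmp_u16(const void *a, const void *b)
{
    return (int)*(const uint16_t *)a - (int)*(const uint16_t *)b;
}

/* Median of a pixel's 8 surrounding range values. */
static uint16_t median8(uint16_t v[8])
{
    qsort(v, 8, sizeof v[0], cmp_u16);
    return (uint16_t)((v[3] + v[4]) / 2);
}

/* Replace a pixel's range value with the median of its neighbors only
   when it differs from that median by more than a user-prescribed
   threshold. Border pixels are copied unchanged for brevity. */
void despeckle(const uint16_t in[N][N], uint16_t out[N][N], uint16_t thresh)
{
    for (int r = 0; r < N; r++)
        for (int c = 0; c < N; c++)
            out[r][c] = in[r][c];
    for (int r = 1; r < N - 1; r++) {
        for (int c = 1; c < N - 1; c++) {
            uint16_t nb[8] = { in[r-1][c-1], in[r-1][c], in[r-1][c+1],
                               in[r  ][c-1],             in[r  ][c+1],
                               in[r+1][c-1], in[r+1][c], in[r+1][c+1] };
            uint16_t med = median8(nb);
            uint16_t d = (in[r][c] > med) ? (uint16_t)(in[r][c] - med)
                                          : (uint16_t)(med - in[r][c]);
            if (d > thresh)
                out[r][c] = med;
        }
    }
}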
4 Surface detection algorithm
4.1 Introduction
This section contains an explanation of a DSP-based software algorithm used to detect
object surfaces in the Acoustical Imaging System's ultrasonic data. This algorithm will
identify the range at which the first object surface occurs for each location in the 128x128
detection array and record both this range and the magnitude of the ultrasonic reflection
for each location.
The processing will be split among four Analog Devices 21060
SHARC DSP chips that will be operating concurrently.
4.1.1 Brief algorithm description
The algorithm selected to detect surfaces in the ultrasonic data searches for local maxima in ultrasonic return magnitude. In order to be considered a possible surface, a maximum must be above a certain threshold, called the signal threshold. To be classified as a surface, that local maximum must then be followed by a local minimum at least some other threshold lower than the maximum. This second threshold is called the noise threshold. At present, the implementation used by the AIS searches only for the surface nearest to the ultrasonic imager at each position in the two-dimensional detector array.
4.1.2 Algorithm justification
The algorithm briefly presented above was selected because of its robustness in the presence of background noise. Neither low-level, highly-varying nor high-level, slightly-varying noise sources will interfere with proper surface detection under the implementation selected.
The signal threshold allows all noise below a certain level of magnitude to be ignored. This eliminates the detection of low-level noise as surfaces. Noise with a mean amplitude on the order of the signals to be detected by the imager is more troublesome. The influence of noise that varies about this mean by only a small amount compared to the reflected signal strengths can be eliminated using an appropriate noise threshold. High-level noise with a high degree of variation, however, cannot be eliminated by any means. Figure 4-1 illustrates this point graphically.
Figure 4-1: Impact of noise on surface detection algorithm. In part a, a series of samples is shown with no noise added. In all subsequent parts, noise with differing properties has been added. In all parts, the horizontal dashed line represents the signal threshold, and the dashed double-headed arrow represents the noise threshold. Furthermore, the surface that is found by the surface detection algorithm is indicated. With low-level, highly-varying noise (as in part b), the signal threshold protects against incorrectly labeling a sample consisting of just noise as the surface. With high-level, slightly-varying noise (as in part c), the noise threshold prevents a sample that is just noise from being named the surface. With high-level, highly-varying noise (as in part d), however, neither the signal threshold nor the noise threshold can protect against incorrect surface detection. The noise is at high enough levels that it surpasses the signal threshold, yet varies more than can be protected against by the noise threshold.
Performing the noise threshold check in addition to just the signal threshold check for each pixel's data series places a large computational cost on the system. Naturally occurring background noises provide a justification for incurring this high computational cost. For example, the snapping shrimp Synalpheus parneomeris produces background noise of just the type that can be eliminated through use of the noise threshold. In waters of less than 60 m depth at latitudes beneath 40°, the noise produced by these snapping shrimp is essentially omnipresent [4]. The ambient noise created by a large bed of snapping shrimp has been described as being similar to fat sizzling in a frying pan [5]. Measurements of the acoustical signature of a single snapping shrimp were performed by Au and Banks [5]. These measurements indicate that the noise produced by the snapping shrimp is very broad in its spectral content, with energy across the entire 0 - 200 kHz range and beyond. Further, the variation in spectral density over this range is only about 20 dB [5].
4.2 Background
4.2.1 Terminology and notational conventions
A pixel (short for picture element) refers to one of the locations on the 128x128 detection array. An image is a 128x128x80 array of data. A plane is a 128x128 array of data, all of which occurred at the same distance from the acoustical imager. Therefore, there are 80 planes to an image. Frames are made up of some number of consecutive planes. Currently, there are 8 planes to each frame, which implies that there are 10 frames associated with an image, with planes 0-7 making up frame 0, planes 8-15 making up frame 1, and so on.

If both the pixel and plane are specified, then a single data point in the 128x128x80 array of acoustical data has been indicated. The data points are magnitude values that represent the strength of the acoustical return at that pixel for a depth corresponding to the current plane. For a (pixel, plane) pair called P, mag(P) will be used as a shorthand form to indicate the magnitude associated with a specific plane for a specific pixel. A (pixel, plane) pair could also be said to refer to a specific voxel (the volume element analog to a pixel), so mag(P) will also be referred to as the magnitude of a voxel.

Peaks are associated with a particular pixel. The magnitude values for the planes of a neighboring pixel have no influence on where the peaks of a particular pixel are located. Therefore, in the remainder of this document, a reference to the magnitude associated with a plane A, or mag(A), indicates the magnitude at plane A of a particular pixel.
4.2.2 Definition of a peak
As stated above, peaks are associated with a particular pixel. There may be many peaks associated with a pixel, or there may be none. A plane, A, is referred to as a peak if it has a magnitude value associated with it, mag(A), that is greater than a prescribed minimum value, called the signal threshold, and it is followed by a plane, B, with a magnitude value associated with it, mag(B), that is at least some other specified value, called the noise threshold, less than mag(A). There may be no planes with magnitude greater than mag(A) between A and B if A is a peak. To summarize symbolically:

Plane A is a peak iff:
1) mag(A) > signal threshold
2) ∃ a plane B occurring after plane A such that mag(A) - mag(B) > noise threshold
3) ¬∃ a plane C such that C is between A and B and mag(C) > mag(A)
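The definition can be restated as a small C routine operating on one pixel's series of plane magnitudes. This is a sketch of the definition only, not the DSP implementation presented later in this chapter; the names are illustrative.

#include <stdint.h>

/* Return the plane number of the first peak in one pixel's series of
   magnitudes, or -1 if no plane satisfies the peak definition. */
int first_peak(const uint16_t *mag, int num_planes,
               uint16_t signal_threshold, uint16_t noise_threshold)
{
    int candidate = -1;   /* plane of the current potential first peak */
    for (int p = 0; p < num_planes; p++) {
        if (candidate < 0) {
            if (mag[p] > signal_threshold)
                candidate = p;     /* condition 1: above the signal threshold */
        } else if (mag[p] > mag[candidate]) {
            candidate = p;         /* a larger maximum supersedes it (condition 3) */
        } else if (mag[candidate] - mag[p] > noise_threshold) {
            return candidate;      /* condition 2: the drop confirms the peak */
        }
    }
    return -1;   /* no confirmed peak; see the boundary discussion in 4.5.1 */
}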
The algorithm used by the Acoustical Imaging System is interested in the detection of the
first peak for every location on the detection array. There will be at most one first peak
associated with each pixel. That first peak will lie at some plane number. The algorithm
records this plane number and the magnitude associated with it. The following four
figures should help to make the above points clear.
Figure 4-2: Illustration of the type of data to be shown in all of the figures in this document. In each of the next three figures, the return value from a single location will be shown. Figures will appear as one-dimensional series because the return value for many different planes will be shown. Within each figure, the plane number will be increasing from left to right.
Figure 4-3: Example returns from a single (x,y) position in the 128x128 detector array. No values below the minimum threshold line will be considered as potential peaks, but they will be considered for peak confirmation when applicable. Because the local minimum following the first maximum fails to drop the required distance and the second local maximum is of greater value than the first, the second local maximum replaces the first local maximum as the candidate peak when it is encountered. The second local maximum moves from being the candidate first peak to the selected first peak when it is shown to be followed by a plane with a value low enough to meet the noise threshold criteria.
Figure 4-4: Example returns from a single (x,y) position in the 128x128 detector array. Note that the first local maximum is considered to be the peak in this example. Even though the local minimum following the first local maximum fails to drop the required distance, the first local maximum is larger than the second local maximum. Therefore, the first local maximum maintains its position as the potential first peak. It is selected as the first peak when it is shown to be followed by a plane with magnitude low enough to satisfy the noise threshold criteria.
Figure 4-5: Example returns from a single (x,y) position in the 128x128 detector array. Note that despite the local minimum following the first local maximum being below the signal threshold, it is still used for the noise threshold check. Also note that despite the second local maximum being greater than the first, the first local maximum is the first peak. Because the first local maximum meets the peak criteria, the second local maximum is never considered.
4.2.3 Expected form of data for peak detecting DSP chips
The data will be shipped into the peak detecting DSP chips one frame at a time, with each
DSP receiving a specific fourth of the frame. For example, the first DSP may receive an
8 plane deep block corresponding to the pixel data for the top quarter of the detection
array, the second DSP may receive a block corresponding to the top-middle quarter of
pixels, and so on. Within each frame of data, the data for a pixel will be located in
contiguous locations, with the data from the lower plane number being located at the
lower memory location.
Further, each data point will have a 1 located in its most significant bit (MSB) if no magnitude at that pixel, in the current plane or any earlier plane of the current frame, has been above the signal threshold. The peak detection algorithm presented below is based upon these assumptions.
4.2.4 Memory requirements for each of peak detecting DSP chips
The peak detection algorithm will require the following three memory elements on each
of the four DSP chips:
• A 32,768 element array of 16-bit values for storing the frames of magnitude data.
• A 4,096 element array of "Potential First Peak" data structures; a more elaborate description of this data structure will be given shortly.
• A counter to store the current plane number.
4.2.5 Potential first peaks data structure
The "Potential First Peak", or PFP, data structure will consist of the following four fields:
1) an 8-bit value to indicate whether a supra-threshold magnitude value has been
encountered for the current pixel before the beginning of the current frame, 2) an 8-bit
value to store an indicator as to whether the first peak has been found at a pixel, 3) an 8bit value to store the plane number of a potential first peak, and 4) a 16-bit value to store
the magnitude at that plane number.
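Expressed as a C structure, the PFP record might look like the following. The field names are illustrative, and the layout ignores alignment and the SHARC's word-oriented memory, so it should be read as a logical description rather than the exact in-memory format.

#include <stdint.h>

/* Logical layout of one "Potential First Peak" record (four fields,
   5 bytes of payload, as described above). */
typedef struct {
    uint8_t  supra_threshold; /* supra-threshold value seen before the current frame? */
    uint8_t  peak_found;      /* has the first peak been found at this pixel?         */
    uint8_t  plane;           /* plane number of the potential first peak             */
    uint16_t magnitude;       /* magnitude at that plane                              */
} pfp_t;

/* One record per pixel in the chip's 128 x 32 section of the array. */
pfp_t pfp_array[4096];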
4.3 ADSP-21060 SHARC hardware considerations
An integral part of designing an efficient algorithm is optimizing the algorithm for the
hardware which will perform it. It is not enough simply to minimize the number of C
code instructions that must be performed by the program. The number of delays, or lost
clock cycles, should also be considered to minimize the total operation time of the
algorithm. The main sources of delays are interrupt handling, memory access conflicts,
and program control branches.
Interrupts occur relatively infrequently during the image processing.
As described
previously in section 3.2, two interrupts are generated for every message that is either
transmitted or received from a DSP.
Each of the PKD DSP chips will receive 10
frames/image * 15 images/second = 150 frames/second. However, because the network
of DSP chips as shown in figure 2-2 is not fully interconnected, the PKD processors are
required to pass through messages for each other. Thus, the front PKD DSP chips will
receive 300 messages/second with a frame of data and will transmit 150 of these
messages unaltered. Further, each PKD DSP will send peak detected data for each of the
15 images/second.
Thus the rear PKD processors will also be required to receive 15
messages/second containing peak detected data from the front PKD processors.
This
arrangement corresponds to a maximum load on any one processor of 465 ultrasonic data
messages that are either transmitted or received per second. This load corresponds to a
maximum of 930 interrupts per second for data. Therefore, a reasonable upper limit on
the number of interrupts that occur each second is 1050. Currently, interrupts are handled
with interrupt nesting disabled. While this strategy may increase the delay in the handling
of an interrupt, it decreases the number of overhead processor cycles required per
interrupt. This decrease is the result of the smaller number of registers whose states must
be pushed onto the stack prior to handling an interrupt if nesting is disabled. Under this
implementation, the maximum time to handle any of the interrupts by a DSP is 150
cycles. Therefore, an upper bound of approximately 160,000 clock cycles/second will be
occupied by interrupt processing. This number represents a loss of approximately 0.4%
of the processing capacity of each DSP chip.
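To make the arithmetic behind these figures explicit, the message budget for the most heavily loaded (front) PKD processors can be tallied from the rates quoted above; the interpretation of the margin between 930 and 1050 interrupts/second is a reconstruction, since it is not itemized in the text:

frames received: 2 * 150 frames/second = 300 messages/second (the chip's own data plus the rear chip's)
frames retransmitted unaltered: 150 messages/second
peak detected data transmitted: 15 messages/second
total: 300 + 150 + 15 = 465 messages/second

At two interrupts per message transfer, this yields 2 * 465 = 930 interrupts/second; rounding up to 1050 interrupts/second presumably leaves margin for non-data messages. Finally, 1050 interrupts/second * 150 cycles/interrupt = 157,500 ≈ 160,000 cycles/second, and 160,000 / 40,000,000 ≈ 0.4% of each 40 MHz DSP's cycles.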
In contrast, memory access conflicts and program control branches are likely to be
associated with the processing of each voxel. (128 x 32) pixels/plane * 80 planes/image
* 15 images/second = 4,915,200 voxels/second must be processed by each of the PKD
processors.
Therefore any unnecessary delays incurred in this area will substantially
affect performance.
For the purposes of discussing the memory access conflicts that will occur in the ADSP-21060 SHARC during the execution of the peak detection algorithm, the memory of the
DSP can be thought of as being composed of three separate components: program
memory, data memory, and the instruction cache. The program memory and the data
memory are both 256 kilobytes in size. The instruction cache is 32 entries in size.
The ADSP-21060 maintains separate on-chip buses for both program memory and data
memory. Therefore, the processor core may simultaneously fetch the next instruction for
the pipeline and grab a piece of data from memory.
Further, the modified Harvard
architecture of the ADSP-21060 allows data storage in program memory.
This
modification, however, raises the possibility of data access conflicts if the core processor
is asked to both fetch an instruction and grab a piece of data from the program memory.
On such occasions, the instruction cache is checked to see if the instruction to be fetched
is stored there. If so, the two fetches may occur concurrently. If not, the data access
conflict will cause a delay. On the current cycle, the data in the program memory will be
fetched, and a no-operation (NOP) will be inserted into the instruction pipeline. On the
following cycle, the next instruction will be fetched from program memory and fed into
the instruction pipeline.
Therefore, it is desirable to minimize the number of data access conflicts that will cause
cache misses. This end can be achieved in one of two ways. First, the number of data
accesses to program memory can be reduced. Second, the instructions should be placed
in memory such that they reduce the number of cache misses when a data access conflict
does occur.
Minimizing the number of data accesses to program memory is the easier, and thus
preferable, method of preventing memory access conflicts. Minimization of the number
of memory access conflicts was a driving factor in the selection of a frame size of eight
planes. Because the acoustical return magnitude data and the potential first peak data
structures must be accessed during the processing of each pixel, it was decided that these
elements should be placed in data memory if possible. Each of the four peak detecting
processors is responsible for the data originating from a fourth of the detection array.
This corresponds to a 128 x 32 section, or 4,096 pixels.
The PFP data structure
associated with a pixel requires 1 byte to store the supra-threshold data indicator, 1 byte
to store the char indicating if a peak has been found at that pixel, 1 byte to store the char
indicating the plane number of the potential peak, and 2 bytes for the short int that stores
the magnitude of the potential peak. Therefore, overall, the array of PFP data structures
requires 20,480 bytes. This leaves 236 kilobytes available for the data in a frame. This
corresponds to 14 planes worth of magnitude data. As shown in figure 4-8 of section 4.7,
a frame size of 16 planes is the most efficient of the whole divisors of 80. However, the
above analysis indicates that this block size is prohibitively large and would generate
memory access conflicts. Therefore, a frame size of 8, which is slightly less efficient than
16, was selected.²
With a frame size of 8, just over half of the data memory is used by the array of PFP data
structures and the frame of magnitude data. This arrangement leaves plenty of space
available in data memory for all of the other memory requirements of the system.
² Note that the computational efficiencies of 8 planes and 10 planes to a frame are roughly equivalent. 8 planes was selected instead of 10 simply because it is a power of 2. Further, note that the use of 10 planes per frame would also reduce the number of messages to be passed per frame, and thus the interrupt processing overhead, compared to 8 planes per frame. However, as mentioned earlier in section 4.3, interrupt processing overhead consumes only a tiny portion of the processors' available cycles, thus this concern is not of central importance.
Therefore, there will be no memory access conflicts, and it is not necessary to attempt to
place instructions in memory to avoid cache misses.
To understand how an algorithm should be implemented to reduce the delays caused by
program control branching statements, one must have knowledge of the instruction
pipeline used by the ADSP-21060 SHARC processor. The instruction pipeline contains
three stages: fetch, decode, and execute. These three stages of the pipeline allow a higher
instruction throughput for the system.
Any non-sequential program operation can
potentially decrease the processor's throughput. In addition to the interrupts mentioned
previously, these operations are: jumps, subroutine calls and returns, and loops. The
reason for the decreased throughput of these operations is that they may dictate that NOP
instructions be placed into the instruction pipeline behind them, generally for two cycles
until any ambiguity of program flow has passed. However, if it is possible to use a
delayed branch for a jump, call, or return, the throughput loss can be eliminated. In a
delayed branch, the two instructions following the non-sequential instruction are
executed.
As previously mentioned, loops are non-sequential operations. As such, they may decrease the throughput of the processor. Thankfully, however, the ADSP-21060 SHARC processor supports loops with no overhead given that they meet certain criteria. These restrictions are the following:
• Nested loops cannot terminate on the same instruction.
• There may not be a jump, call, or return operation in the last three instructions of a loop.
• Loops of three instructions and below must be treated specially because of the three-instruction-long pipeline.
There are no loops of size three or below used in the peak detection process; therefore, the reader is referred to the ADSP-2106x SHARC User's Manual for details on the special treatment of this case [6]. By avoiding the cases outlined above, loop overhead was eliminated from the peak detection process. Further, overhead incurred by program control statements was iteratively reduced by modifying the program arrangement based upon the assembly code produced.
4.4 Required Preprocessing
4.4.1 Requirements
As stated above in section 4.2.3, it is assumed that the data reaching the peak detection
DSP chips will have the most-significant bit (MSB) set to 1 if no magnitude values within
previous and current planes at that pixel for the current frame have been greater than the
signal threshold. It is assumed that the data for the first plane to be above the signal
threshold and all subsequent planes within a frame will have their MSB set to 0. This
preprocessing is essential to the operation of the rest of the algorithm.
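As a concrete statement of this rule, the following C sketch tags one pixel's samples within a frame. It illustrates the requirement only; the IIB performs this operation in its own hardware, not in C, and the assumption that magnitudes occupy the low 15 bits is ours.

#include <stdint.h>

#define MSB 0x8000u   /* assumes magnitudes occupy the low 15 bits */

/* Set the MSB of every sample that precedes the first supra-threshold
   sample in the frame; that sample and all later ones keep MSB = 0. */
void tag_msb(uint16_t *series, int planes_per_frame,
             uint16_t signal_threshold)
{
    int below = 1;    /* no supra-threshold magnitude seen yet */
    for (int i = 0; i < planes_per_frame; i++) {
        if (below && series[i] > signal_threshold)
            below = 0;
        if (below)
            series[i] |= MSB;
    }
}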
4.4.2 Expected method of preprocessing
It is currently expected that the required preprocessing will be handled by the Image
Interface Board (IIB).
4.4.3 Alternate preprocessing by the input DSP chip
The preprocessing cannot be handled by the single DAQ SHARC DSP which is currently
asked to accept the data from the IIB and then distribute it to the proper peak detection
DSP chips. Each program control branching statement on the DSP requires at least three
clock cycles. Any preprocessing algorithm must include a program control branching
statement, because it must be able to change whether it sets the MSB to 1 or 0.
Therefore, a lower bound on any preprocessing method of 3 operations per data point can
be established. Multiplying this lower bound by 128x128x80 = 1,310,720 data points per
image tells us that preprocessing each image will take at least 3,932,160 operations. At
the current acoustical rate of 15 images/sec, the preprocessing would then place a
computational burden of at least 15 images/sec * 3,932,160 ops/image = 58,982,400
ops/sec on the input DSP chip. The Analog Devices 21060 SHARC DSP chips that we
are using have a clock rate of 40 MHz. Thus, the input DSP chip would not be able to
keep up even with our optimistic lower bound of the preprocessing computational burden.
Therefore, because the DAQ DSP is not capable of performing the required
preprocessing, it must be handled by the IIB.
4.5 The Algorithm
4.5.1 English description of algorithm
Before each new image starts being pumped through the image detection algorithm, the
"Potential First Peak" data structure for each location will be reset to indicate that a new
image is just starting. This will be done by setting the supra-threshold value indicator for
each pixel to "No", the peak found indicator for each pixel to "No", the plane field
associated with each pixel to "No Return", and the magnitude value for each pixel to the
signal threshold.
When each new frame becomes available for processing, the input DSP will pass the
appropriate blocks of data to the peak detecting DSPs. Each pixel will then be examined
individually. For each pixel, it will first be determined if the data has passed the signal
threshold at that location before the beginning of the frame by looking at the supra-threshold indicator in the PFP for that pixel. For each pixel that has had data above the
signal threshold before the beginning of the current frame, it will be determined if a first
peak has already been found at its location. If one has, no further processing is necessary
at that pixel for this frame. If one has not, however, the block must be processed. If the
data has not passed the signal threshold by the beginning of the frame, then it will be
determined if the signal threshold has been passed by the last plane of the frame. This
task will be accomplished by examining the MSB of the last plane in the frame. If the
magnitude data for that pixel has not passed the signal threshold by the end of the frame,
then the pixel data for that frame should be ignored.
If the data for a pixel had already gone supra-threshold before the current frame but a
peak has not been found yet, the processing of that frame for that pixel should begin at
the beginning of the frame. However, if the plane at which the data first passed the signal
threshold is in the current frame, then that first supra-threshold plane should be the
starting position. To start processing at the plane for which the data first passes the signal
threshold, the program should start at the beginning of the current frame and simply
advance to the next plane while the MSB of the plane magnitude is 1. When a plane
magnitude with MSB of 0 is encountered, the program should then start fully processing
each plane. To minimize the number of program control branching statements that the
program must run, the above two cases should be handled separately.
To fully process the current plane, its magnitude should be compared to the pixel's PFP magnitude. If the current plane's magnitude is greater, then the PFP array entry for that location should be updated to reflect the current plane and magnitude. If the current plane's magnitude is less than or equal to the PFP magnitude entry, then the magnitude of the current plane's return should be subtracted from the PFP magnitude entry. If this difference is greater than the noise threshold, then the potential first peak is a confirmed peak. The potential peak should be marked as a peak, and the processing of the frame should stop for that pixel. If the difference is not greater than the noise threshold, however, then the next plane of data should be examined in a similar manner. For each pixel, this process continues until either the end of the frame is reached or a peak is confirmed.
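The following sketch renders the per-pixel, per-frame logic just described in Python. It assumes, hypothetically, 16-bit magnitude words in which a set MSB marks masked (sub-threshold) data; the constants and function names are illustrative only, and the two supra-threshold cases are merged into one loop for readability even though, as noted above, the real implementation handles them separately to reduce branching.

    MSB_MASK = 0x8000        # assumed: MSB of a 16-bit word marks sub-threshold data
    MAGNITUDE_MASK = 0x7FFF  # lower 15 bits carry the magnitude itself
    NOISE_THRESHOLD = 20     # placeholder value

    def process_frame_for_pixel(pfp, frame, first_plane_index):
        """Process one frame (block of planes) of data for a single pixel.

        `frame` holds the (possibly masked) magnitude words for this pixel;
        `first_plane_index` is the absolute plane number of frame[0].
        """
        if pfp.peak_found:
            return  # a first peak was already confirmed; nothing to do

        if pfp.supra_threshold:
            start = 0  # data went supra-threshold in an earlier frame
        else:
            # If even the last plane of the frame is still masked, the signal
            # never crosses the signal threshold in this frame: ignore it.
            if frame[-1] & MSB_MASK:
                return
            # Advance past the masked (sub-threshold) planes.
            start = 0
            while frame[start] & MSB_MASK:
                start += 1
            pfp.supra_threshold = True  # note that we now have supra-threshold data

        for i in range(start, len(frame)):
            magnitude = frame[i] & MAGNITUDE_MASK  # strip any masking
            if magnitude > pfp.magnitude:
                # A new, larger potential first peak.
                pfp.magnitude = magnitude
                pfp.plane = first_plane_index + i
            elif pfp.magnitude - magnitude > NOISE_THRESHOLD:
                # The signal has dropped far enough below the potential peak:
                # the potential first peak is confirmed.
                pfp.peak_found = True
                return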
After every pixel in every frame of data for an image has been examined in the above
manner, the PFP array will contain the plane number and the magnitude of every valid
first peak found. This data must then be transferred on to its next destination in the
processing pathway.
For now, the algorithm does not deal with the boundary condition of the last plane. At present, if the algorithm has recorded a potential peak but this potential peak is not confirmed by the time the last plane of data is processed, then the potential peak is ignored. The most obvious approach for handling this boundary condition is to consider the plane of this potential peak to be the peak. However, this approach eliminates the benefits gained through use of the noise threshold. By automatically assigning any remaining potential peak to be a peak after all of the data has been processed, noise with a high offset value but low variation which was eliminated by the noise threshold would be reintroduced. The exact method of dealing with the boundary condition will therefore be deferred until more actual data is available.
Figure 4-6 in section 4.5.2 provides a block diagram of the peak detection algorithm.
Additionally, figure 4-7 in section 4.5.3 offers an illustration of the peak detection
algorithm's performance.
Finally, for a listing of the source code that was used to
implement a prototype of the peak detection algorithm, see appendix B.
4.5.2 Block diagram of the algorithm
[Figure: flow-chart graphic; see the caption below for a description of the control flow]
Figure 4-6: Block Diagram of the peak detection algorithm. The above block diagram of the peak
detection algorithm shows the control flow from the point of view of one of the four peak detection DSP
chips. Note that the examination of planes of data is shown as being a part of the same strand independent
of whether data was available before the beginning of the frame or not. In the actual implementation,
examination of the data should be handled separately for these two cases to minimize the number of
program control flow branching statements necessary.
4.5.3 Illustration of peak detection algorithm operation
[Figure: plot of magnitude versus plane for a single pixel, divided into four processing blocks, with the signal threshold, the first and last magnitudes processed, the while-loop (scan) region, and the detected peak labeled]
Figure 4-7: Illustration of peak detection algorithm. The figure above illustrates the operation of
the peak detection algorithm for a single pixel within the image. The planes of magnitude data
associated with the pixel are broken up into four blocks. The algorithm processes the first block for
every pixel within an image, then the second block for every pixel, and so on until it is done with the
entire image. At a particular location, the algorithm determines if the data has passed the signal
threshold by the end of the current block at that location. If it has not, then the block is ignored. If it
has, the algorithm then determines if a peak has already been found for the pixel. If one has, then the
block is ignored. If neither of these conditions hold, the algorithm determines the first location that
it should examine within a block and examines every data point from that position on in a block until
either the end of a block is reached or a peak is found. Note that there are only 40 planes of data
shown above and that the block size is 10. In the real implementation, there are 80 planes of data
and the block size will be 8. The figure above was scaled down to ease the illustration process.
4.6 Proof of Algorithm Correctness
To prove that the algorithm developed above is correct, two points will be demonstrated. First, the peak detection algorithm will find a valid peak if one exists. Second, the peak detection algorithm will find the first valid peak if a peak is found. Recall that the algorithm statement ignores the ending boundary condition. Thus, an analysis of this condition will be omitted from this discussion.
Proposition #1: If the peak detection algorithm finds a peak, then
the magnitude
associated with that peak will be greater than the signal threshold.
Proof: At the beginning of each new image to be processed, each element in the PFP
array is initialized to hold the signal threshold in its magnitude field and the no return
value in its plane field. The peak detection algorithm will not start examining the
magnitudes of planes in a pixel until a value greater than the signal threshold is
encountered, and this is the location from which the algorithm will start looking. Further,
the peak detection algorithm will only replace the contents of a PFP array entry when the
magnitude for the current plane in the current pixel is greater than that in the entry. So
the magnitude value that is examined will be greater than the signal threshold, and that
magnitude and the plane associated with it will be written into the PFP array for the
current location. Moreover, since the magnitude associated with an entry in the PFP array
is non-decreasing, the value at this location will remain above the signal threshold
throughout the processing of the current image.
Therefore, for a peak to be found, the magnitude values associated with the planes at a location must have begun to be examined. Further, the beginning of examination implies that a value greater than the signal threshold must have been written into the magnitude associated with the potential peak location, and this value is non-decreasing. Together, these points imply that any peak detected must have a magnitude value greater than the signal threshold.
Proposition #2: Any peak detected by the algorithm will be followed by a plane at least the noise threshold beneath it in magnitude and will be the plane of greatest magnitude before the confirming plane.
Proof: When the peak detection algorithm sees the first plane with a magnitude above the signal threshold for a pixel, it examines every plane from that point until either a peak is confirmed or the end of the image data is reached. Because all of the values ignored were less than the signal threshold, and examining the magnitudes associated with planes of data implies that the PFP entry contains a value greater than the signal threshold, all of the values ignored will be less than the PFP entry. When each plane is examined, its magnitude is compared to the value stored at the PFP entry for that pixel. If it is greater, then the current plane and magnitude will replace those in the PFP entry for the current pixel. Therefore, the magnitude stored at the PFP entry for the current pixel is greater than or equal to the magnitude of every plane examined or ignored so far.
If the magnitude for an examined plane is not greater than the value stored at the PFP
array entry for the pixel, then the magnitude of the current plane is subtracted from the
PFP array entry magnitude. If this difference is greater than the noise threshold, then the
PFP array entry is marked as a confirmed peak, and the pixel is no longer processed.
Otherwise, the next plane of data is examined. So, the algorithm is guaranteed to have stored in the PFP array entry for the current pixel the greatest value ignored or examined and the plane associated with it, and the PFP array entry will be marked as a confirmed peak, stopping processing for the pixel, when a plane with a magnitude at least the noise threshold below the PFP entry is found.
Proposition #3: If a valid peak exists, then a valid peak will be found.
Proof: Assume for the purposes of contradiction that a valid peak exists but that no peak is found by the algorithm. This implies that the valid peak occurred either before the peak detection algorithm started examining plane magnitude data or at or after the plane at which the peak detection algorithm started examining plane magnitude data. If it occurred before the plane magnitude data started being examined, then the peak must have had a value beneath the signal threshold. But this is a contradiction, because a valid peak must have a value above the signal threshold. Therefore, the valid peak must have occurred after the plane magnitudes started being examined. Since no peak was detected, this implies that there exists no plane magnitude which is followed by a plane magnitude at least the noise threshold below it. This conclusion, however, also leads to a contradiction, because a valid peak must be followed by a plane magnitude at least the noise threshold below it.
Therefore, to avoid contradiction, a peak must be found if a valid peak exists.
Proposition #1 and proposition #2 then imply that the peak that was found must be above the signal threshold and must be followed by a plane with magnitude at least the noise threshold less than the peak's magnitude, with no intervening planes of magnitude greater than the peak magnitude. Therefore, if a valid peak exists, a peak will be found and that peak will be valid.
Proposition #4: If at least one valid peak exists, then the peak detection algorithm will
detect the first valid peak.
Proof: By proposition #3 we know that if a valid peak exists, then a valid peak will be found. Assume for the purposes of contradiction that more than one valid peak exists and that the first valid peak is not the detected peak. This supposition implies that the peak detection algorithm must have encountered a plane that was at least the noise threshold lower than the value stored in the PFP array entry for the current pixel without marking the current potential peak as a confirmed peak and stopping execution. This conclusion, however, contradicts the fact that the peak detection algorithm always marks the current potential peak as a confirmed peak and stops processing at a pixel when a plane magnitude at least the noise threshold less than the PFP array magnitude entry for the current pixel is encountered. Therefore, if a valid peak exists, then the peak detection algorithm will detect the first valid peak.
4.7 Analysis of Computational Burden of Algorithm
In all of the computational burden analysis that follows, it is assumed that the number of
planes from the signal first passing above the signal threshold to a peak being confirmed
by the signal dropping at least noise threshold below the current potential peak may be
upper bounded by some parameter that is significantly less than the total number of
planes in an image, 80. This parameter will be referred to as the maximum examination
window. In the ideal case, the maximum examination window would be two, as the first
plane above the threshold would be the peak and it would be confirmed by the next
plane's magnitude. As the maximum examination window rises, the performance of the
peak detection algorithm presented gets steadily worse, as shown by the following
analysis. When the maximum examination window reaches approximately 25, then the
computational burden of the peak detection algorithm on each of the four peak detecting
DSPs can no longer be guaranteed to be less than the computational capacity of the DSPs.
While the existence of a maximum examination window, and its length if such an upper bound can be guaranteed, cannot be demonstrated until more ultrasonic data is available to examine, it is expected that the value of this parameter will most likely be in the five to eight plane range.
In addition to the existence of a maximum examination window, the computational
burden analysis makes the following assumptions: 1) valid peaks are detected at every
pixel in the detection array and 2) each valid peak that is detected requires maximum
examination window planes to be examined. Both of these assumptions represent a worst
case as far as their contributions to the computational burden. That is to say that by
removing these restrictions, the expected computational burden of the peak detection
algorithm on each DSP would be reduced.
In all worst case data, the following restriction is added to the above constraints: the first plane with a magnitude above the signal threshold is always the last plane in its respective processing block. This constraint ensures that the maximum number of while-loop iterations must be incurred before the data is processed, and that the subsequent processing burden must be spread over at least two blocks. In all expected data, this restriction is lifted, and the first plane above the threshold is assumed to occur with equal likelihood at any of the locations within a block.
Finally, for the expected data analysis below, the location of the peak at a particular pixel
is assumed to be independent of all other pixels. While this assumption is certainly not
valid in reality, it reduces complexity significantly and almost certainly has a negligible
impact on the burdens calculated.
A sensitivity analysis of this assumption was not
performed because the results of the worst case data analysis indicated it was
unnecessary.
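To make the shape of this analysis concrete, the following back-of-envelope sketch parameterizes the worst case burden by block size and examination window. The per-operation counts (ops_check, ops_scan, ops_process) and the even division of pixels across the four PKDs are invented placeholders, not the counts derived from the assembly listing in appendix C, so the absolute numbers it prints differ from those of figure 4-8; only the qualitative trade-off between per-block overhead and worst case scanning length is illustrated.

    PIXELS_PER_DSP = (128 * 128) // 4   # assumed: image pixels split evenly over 4 PKDs
    PLANES_PER_IMAGE = 80
    IMAGES_PER_SEC = 15

    def worst_case_ops_per_sec(block_size, window,
                               ops_check=10, ops_scan=4, ops_process=20):
        """Illustrative worst case operations/DSP/second.

        Worst case placement: the first supra-threshold plane is the last
        plane of its block, maximizing the while-loop scan and spilling the
        window planes into the following block(s).
        """
        blocks = PLANES_PER_IMAGE // block_size
        check_ops = blocks * ops_check            # fixed gating cost per block
        scan_ops = (block_size - 1) * ops_scan    # skipping masked planes
        process_ops = window * ops_process        # fully processed planes
        per_pixel = check_ops + scan_ops + process_ops
        return PIXELS_PER_DSP * per_pixel * IMAGES_PER_SEC

    for bs in (2, 4, 5, 8, 10, 16, 20, 40, 80):
        print(bs, f"{worst_case_ops_per_sec(bs, window=5):.2e}")

Even with these placeholder counts, the minimum falls at a block size near 16, with sizes 8 and 10 only slightly worse, mirroring the shape of figure 4-8.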
Figure 4-8 presents data on the worst case computational burden of the peak detection algorithm as a function of the processing block size. Only block sizes that divide 80 evenly are considered. The data in this figure was computed for an expected examination window length of five. As can readily be seen, increasing the block size generates a more efficient algorithm up to a point, after which efficiency decreases as the processing block size is increased further. Less obvious from this figure is exactly which block size produces the minimum worst case computational burden. A block size of 16 produced the minimum computational burden of all block sizes examined, with a worst case value of approximately 17.0 x 10^7 operations/DSP/second. However, as mentioned previously, a block size of 8 has been selected for the system. Because the performances of block sizes 8 and 10, 19.7 x 10^7 and 18.2 x 10^7 operations/DSP/second respectively, are not substantially worse than that of a block size of 16, and because these smaller block sizes avoid the memory access conflicts mentioned earlier, they were preferred over 16. A block size of 8 was then selected because of a bias towards powers of two and because it eased the design of the read-out integrated circuit (ROIC) to be part of the THA.
[Figure: bar chart of worst case computational burden (operations/DSP/second, 0 to 1.0E+08) versus processing block size (planes/block: 2, 4, 5, 8, 10, 16, 20, 40, 80), with a horizontal line marking the capacity of a SHARC DSP chip]
Figure 4-8: Worst case computational burden of peak detection v. processing block size
Figure 4-9 presents worst case and expected computational burdens of the peak detection
algorithm on a single DSP chip as a function of the examination window length. In all
cases, the processing block size is assumed to be equal to 8 planes.
The data in figures 4-8 and 4-9 demonstrate that the peak detection algorithm should easily fit onto the four peak detection chips as long as the assumption of a
relatively short examination window length is valid. The excess processing power that is
available may be used for advanced image processing, classification, or display
algorithms. The assembly code listing from which the computational burden estimates
were derived and the specific method of calculating the computational burden are shown
in appendix C.
[Figure: bar chart of worst case and expected computational burden (operations/DSP/second, 0 to 5.0E+07) versus maximum examination window (2 to 12 planes), with a horizontal line marking the capacity of the SHARC DSP chips]
Figure 4-9: Computational burden of peak detection v. maximum examination window
4.8 Analysis of the communications burden of the peak detection algorithm
Because the ultrasonic data communications that occur are the same for every image, the analysis required to estimate the communications burden that the peak detection algorithm places upon the DSP network is much less complicated. The maximum amount of data that must be passed over any link is half of an image 15 times per second. This amount of data must be passed over the link connecting the IIB to the DAQ and the links connecting the DAQ to the front two PKDs, as shown in figure 2-2. This corresponds to 128 x 64 pixels/half plane * 2 bytes/pixel * 80 half planes/image * 15 images/second = 19,660,800 bytes/second. The maximum amount of data that can be passed over a DSP to DSP link by DMA transfer is 1 byte/clock cycle * 40 MHz = 40 megabytes/second. Therefore, the communications links connecting the DAQ and the PKDs are less than half burdened by the ultrasonic data transfers and have plenty of room left for other messages. The maximum amount of data that can be transferred over the IIB to DAQ link, however, is limited by the transfer speed of the IIB. The IIB is also capable of transferring 1 byte/clock cycle; however, its internal clock operates at 33 MHz, not 40 MHz like the DSPs. Therefore, the IIB to DAQ transfers can occur at a rate of 33 Mbytes/sec. Again, this transfer rate is sufficiently high, especially since no other messages are ever asked to pass over this link.
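The arithmetic of this section can be summarized in a few lines (the figures are those quoted above):

    # Ultrasonic data volume over the most heavily loaded links.
    bytes_per_sec = 128 * 64 * 2 * 80 * 15    # pixels/half plane * bytes * half planes * rate
    print(bytes_per_sec)                      # 19,660,800 bytes/second

    dsp_link = 40_000_000                     # DSP-to-DSP DMA: 1 byte/cycle at 40 MHz
    iib_link = 33_000_000                     # IIB-to-DAQ: 1 byte/cycle at 33 MHz

    print(bytes_per_sec / dsp_link)           # ~0.49, i.e. under half of a DSP link
    print(bytes_per_sec / iib_link)           # ~0.60 of the IIB-to-DAQ link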
4.9 Conclusions for the software peak detection algorithm section
This chapter has shown that the peak detection algorithm can viably be performed in
software. Further, by moving the peak detection from hardware in the IIB to software in
the DSP Image Processing Board, the DSP chips have been given access to the full set of
acoustical image data. Both the computational and communications burdens created by
the algorithm are under half the capacity of the DSP chips currently slated to be in the
AIS. Thus, a considerable amount of processing power is still available. Therefore, new
features may be added to the system's software which make use of this power. An object
recognition algorithm, like that presented in chapters 5 through 7, is an example of
exactly this type of feature.
5 Analysis of data gathered by Acoustical Imaging System
5.1 Introduction
Before an object recognition (OR) system may be built, a device capable of gathering data
which may be used in the object recognition process must exist. Lockheed Martin IR
Imaging Systems' Acoustical Imaging System is not currently developed enough to attack
the OR problem. An experimental system available at LMIRIS, however, is capable of
gathering finely sampled acoustic time series. It is desired to determine whether the
experimental system is a good tool to assess the ability of the AIS to perform object
recognition. To accomplish this task, the degree to which the Acoustical Imaging System
may be modeled as being linear and time invariant (LTI) will be explored. If it is found
that the system is LTI, the ability to use LTI methods will greatly simplify the data
analysis process. Next, the dynamic range of the AIS will be discussed. An explanation
of and justification for the experimental set-up at LMIRIS will then be presented,
followed by a discussion of the relation of this experimental set-up to the Acoustical
Imaging System. Based upon the preceding analysis, it will be concluded that data from
the experimental system can be used to help establish the feasibility of using Lockheed
Martin's AIS for object recognition.
5.2 Assumption of linear time invariant (LTI) system performance
The system electronics diagram of figure 2-1 shows the signal flow path through the
electronics of the AIS. As can be seen in the figure, all of the components after the
analog to digital (A/D) converter in the AIM are digital. Further, commercially available
A/D converters can be considered to be LTI devices to within a high degree of accuracy.
Therefore, all of the system behind the THA can be considered to be LTI.
The task of assessing the degree to which the AIS performs as an LTI system is thus
reduced to assessing the linearity and time invariance of the THA, the acoustical lens, and
water. Figure 5-1 below illustrates the signal path that must be evaluated.
Figure 5-1: Block diagram of signal path of interest in evaluation of LTI assumption
Previous analysis by Lohmeyer in [2] has already addressed the degree to which the
assumptions of linearity and time invariance apply to the AIS for the transmit transducer,
water, and receive transducer. In short, Lohmeyer found that over the majority of the
output range of the Transducer Hybrid Assembly, the receive performance of the THA is linear to within experimental measurement limitations [2].
Further, Lohmeyer makes a strong
argument that despite the non-linearity of the acoustical transmission of water in the 1-3
atmospheres pressure range the system will encounter, these non-linear effects can largely
be ignored. The non-linearities in water produce harmonics in the fundamental frequency
of the propagating wave [2 referencing 8]. Water, however, behaves much like a lowpass filter. Above 1 MHz, the attenuation caused by acoustical propagation through
water increases substantially with frequency. Thus, the higher harmonics will have been
attenuated dramatically more than the original frequency during propagation. Therefore,
although the water produces non-linearities, this effect may be ignored [2].
Although beyond the scope of this thesis, a few brief comments can be made to justify neglecting the non-linear aspects of sound propagation through the lens in the AIS. The lens is composed of polystyrene. The tensile strength of the polystyrene in the lens can conservatively be estimated to be 35 MPa. Further, the AIS transmits at approximately 60 dB re 1 Pa/V, meaning that an input of 1 V to the transducer will produce an output pressure of 1000 Pa. The input voltage is always kept well below 1000 V; therefore, an upper bound of 1 MPa can safely be applied to the transmitted pressure. Because this upper bound is less than 3% of the tensile strength of polystyrene, it can be assumed that the acoustic pulse does not drive the polystyrene out of the linear region of its stress-strain curve. Therefore, the propagation of sound through the acoustical lens in the AIS can be assumed to be approximately linear [8].
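The numbers behind this bound are reproduced below (the 35 MPa tensile strength and the 1000 V drive ceiling are the conservative figures quoted above):

    import math

    tensile_strength = 35e6          # Pa; conservative estimate for polystyrene
    transmit_sensitivity = 1000.0    # Pa/V; 20*log10(1000) = 60 dB re 1 Pa/V
    max_drive_voltage = 1000.0       # V; drive is always kept well below this

    max_pressure = transmit_sensitivity * max_drive_voltage   # 1e6 Pa = 1 MPa
    print(max_pressure / tensile_strength)                    # ~0.029, under 3%
    print(20 * math.log10(transmit_sensitivity))              # 60.0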
Because each of the components of the AIS has been shown to behave linearly or nearly
linearly, the assumption of a linear time invariant (LTI) system is fairly accurate. With
this assumption, the well-developed tools of LTI analysis may be used. The use of LTI
analysis will significantly simplify the mathematical and conceptual details of the
investigation in this thesis.
5.3 System dynamic range analysis
The dynamic range of the Acoustical Imaging System is limited by the system noise-level
on the low end and by the non-linear effects of acoustical pulse transmission in water,
which place a limit on the amount of power which may be used for a pulse, on the high
end. Further, as the AIS is developed further and monostatic operation incorporated into
the THA, the THA's power transmission capabilities may impose an even tighter limit on
the high end of the dynamic range.
However, in bistatic operating mode, the AIS
dynamic range is shown below in figure 5-2. As can be seen below, the instantaneous
dynamic range of the system is approximately 60 dB. Note that over much of the range of
interest planar reflectors and specular targets produce return levels above the system
maximum signal level. This situation can easily be remedied by reducing the transmit
power [3].
Furthermore, recent calculations indicate that the instantaneous dynamic range of the bistatic AIS could well rise to approximately 75 dB [8].
[Figure: plot of signal level (dB) versus range (meters), showing curves for a planar perfect reflector, a specular target (-20 dB), and a "point" target (-30 dB), together with the maximum signal level and the system noise level]
Figure 5-2: Typical bistatic Acoustical Imaging System signal levels (taken from [3])
5.4 Experimental set-up: explanation, justification, and relation to AIS
For several reasons, it was impossible to use the AIS itself for the examinations
performed in the second half of this thesis. First, the Image Interface Board mentioned
during the first half of the thesis is a second generation board which had not been
completed as of the writing of this thesis. Therefore, only the first generation IIB was
available. This version of the IIB performs the peak detection in hardware before the data
is transmitted to the DSP Image Processing Board. Therefore, a user of the AIS does not
have access to the full time series of acoustical returns which are essential for the type of
object recognition analysis which is intended. Second, the technology for the Transducer
Hybrid Assembly is still in development. While there has been success in this area, the
availability of THAs is limited. Further, it is desired to not expose any of the few THAs
to the extensive use necessary for all of the object recognition (OR) data gathering.
Because of these reasons, an alternate experimental set-up had to be developed for the OR
work.
This section of the thesis will first describe the experimental set-up that was
developed, including both the data gathering system and the test objects selected. Next, a
brief justification of this experimental set-up will be presented. Finally, the relation of
this set-up to the actual AIS will be discussed.
5.4.1 Data acquisition
A Panametrics V380 piston transducer was operated in transmit/receive mode in conjunction with a Panametrics Pulser/Receiver Model 5072PR. Data was collected by a
Gage 6012PCI data acquisition card which was set to sample the received data at 30 MHz
using an external clock. Data was gathered in a water tank of length 100 cm, width 50
cm, and depth 40 cm.
During data acquisition, the water depth in the tank was
approximately 25 cm.
5.4.2 Target selection
The chief underwater imaging task currently foreseen for the Acoustical Imaging System
is underwater mine detection. There is great variability in the size of underwater mines.
They range from the extreme small of the German DM 221, a cylindrically shaped mine
with a diameter of 65 mm and a length of 145 mm, to the extreme large of the Iraqi Al
Kaakaa/16, a roughly box shaped mine composed of stacked cylinders and plates with a
length of 3.4 m, a width of 3.4 m, and a height of 3.0 m, and the Russian SMDM-2, a
cylindrically shaped mine of diameter 0.7 m and length 11 m. Based upon a survey of the
more than 50 mines listed in Jane's Underwater Warfare Systems [9], approximately 70%
of the mines listed had a roughly cylindrical shape.
Further, a typical size was
approximately 1.0 m diameter and 2.5 m length. While objects at the lower end of this
size spectrum are of reasonable dimensions for experimental work, the typical and large
objects are prohibitively large. Because of the size of the tank available for this work,
objects will be restricted to be smaller than approximately 100 mm in diameter and 400
mm in length. It will be argued later in section 5.4.4 that the results obtained with these
smaller objects are generalizable to the larger targets that may be encountered in actual
mine hunting operations.
Because of the high prevalence of cylindrical shapes in underwater mines and the
geometrical simplifications possible due to cylindrical symmetry, pipes and rods were
selected as the primary test targets. Plates were also purchased when available for a
particular material. Copper, brass, aluminum, stainless steel, polyvinylchloride (PVC),
and pine were selected as the materials. Further, object diameters spanning the range of 5
to 90 mm were selected. Additionally, for the pipes, pipe wall thickness varied over the
range 0.5 to 25 mm. A full listing of the 41 test objects used is provided in appendix D.
5.4.3 Target presentation
The targets were suspended from a horizontal crossbar using two loops of equally long
fishing leader line. The horizontal crossbar was clamped to a vertical support bar. This
vertical support bar was then attached to a cranking system that allowed the horizontal
and vertical position of the target to be precisely adjusted in increments as small as 0.015
mm. This arrangement is shown pictorially below in figure 5-3. Targets were suspended
with their long axis parallel to the bottom surface of the water tank. Further, they were
presented with their long axis perpendicular to the surface normal of the face of the piston
transducer.
[Figure: drawing of the suspended target and its point of attachment to the vertical and horizontal positioning system]
Figure 5-3: Drawing of target suspension apparatus.
The target was positioned so that it was
approximately halfway between the surface of the water and the tank bottom. Further, the target was
centered horizontally within the tank.
Due to the experimental set-up conditions available, there may have been some deviation
from the parallel and perpendicular positions described above.
In particular, the pipe
was placed parallel to the tank floor by measuring the height of each side of the
overhanging crossbar to ensure that both sides were the same. During the course of
experiments, there may have been some variation from horizontal as the set-up was
changed and the bar was bumped.
Because this type of variation was minor and
maintained the normal angle of incidence between the piston transducer and the target,
great pains were not taken to eliminate it. Additionally, the normal relationship between
the long axis of the target and the surface normal of the face of the piston transducer was
established in a similarly ad hoc manner. Using a measurement strategy similar to that
mentioned above, the horizontal crossbar was positioned so as to be parallel to the back
wall of the water tank. Further, before each data acquisition, the angle of the transducer could be adjusted to maximize the strength of the received signal. This approach would ensure that, despite limitations in set-up accuracy and disturbances after set-up, the same relationship would exist between the transducer and the target surface on every data acquisition. In practice, however, the angle of the transducer was not monitored very closely. As will be shown in section 6.2, the measured data turned out to be relatively invariant to minute variations in this angle.
5.4.4 Experimental set-up justification
The decision to use cylindrical test objects has already been justified based upon the
representative nature of that shape in section 5.4.2. However, as noted at that time, the
size of the test objects used for this work is significantly smaller than a typical sea mine.
Therefore, it is also necessary to show that the results for these smaller test objects are
generalizable to sea mines.
While in general the scattering of sound by targets is very complex, for the relatively simple geometry of a cylinder and for wavelengths that are very small compared to the circumference of the cylinder, the scattering analysis can be greatly simplified through use of "geometrical acoustics".
In this short-wavelength limit, the scattered wave can be
thought of as splitting into two parts: the reflected and the shadow-forming waves.
Further, the scattering behavior of cylinders as a function of the wavelength of the
acoustical energy incident upon them reaches an asymptote in both the total intensity of
sound scattered and the directionality of the scattered sound.
While for very long
wavelengths compared to the cylinder circumference relatively little sound is scattered
and that which is scattered is directed primarily backward, as the incident wavelength is
shortened the scattering pattern becomes much more complex with more of the scattered
energy traveling in the forward direction. For very short wavelengths, the directionality
of the scattered pulse is very nearly constant, with approximately half of the scattered
energy directed forward in a very sharp beam and the rest of the energy approximately
uniformly distributed.
While the total intensity scattered very quickly reaches its
asymptotic behavior, the directionality of the scattered pulse is somewhat slower to reach
its short-wavelength limit. Figure 5-4 shows the total intensity of scattered sound from a
cylinder as a function of the ratio between the cylinder circumference and the sound
wavelength. The evolution of the scattering directionality from a cylinder at increasing
circumference to wavelength ratios shows similar asymptotic behavior [10].
[Figure: plot of total intensity scattered versus the ratio 2πa/λ, for ratio values from 0 to 5]
Figure 5-4: The dependence of scattered intensity on the ratio 2πa/λ for sound with wavelength λ incident upon a rigid cylinder of radius a (adapted from [10])
Of the 39 cylindrical test objects selected for the object recognition feasibility study, all possess a radius (a) greater than 3.175 mm. This implies that all possess a wave number, ka = (2π/λ)·a, of at least (2π / 0.5 mm) · 3.175 mm ≈ 40. Further, all but two of the test objects possess a radius of at least 6.35 mm, indicating a wave number greater than approximately 80. Based upon information from Morse and Ingard [10], these wave numbers indicate that the scattering behavior of the cylinders is safely into the asymptotic regions of both scattering intensity and directionality. Therefore, for the test objects selected, the short-wavelength limit applies, and it can safely be stated that the results obtained will be generalizable to larger objects such as sea mines.
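The wave number arithmetic is reproduced below, assuming the 0.5 mm wavelength used throughout (roughly a 1500 m/s sound speed in water at the 3 MHz center frequency):

    import math

    wavelength_mm = 0.5                      # ~1500 m/s / 3 MHz
    for radius_mm in (3.175, 6.35):
        ka = (2 * math.pi / wavelength_mm) * radius_mm
        print(radius_mm, round(ka))          # ~40 and ~80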
5.4.5 Relation of experimental system to Acoustical Imaging System
The use of the piston transducer-based experimental system instead of the Acoustical
Imaging System for this work introduces some discrepancies between the problem space
being explored for the object recognition feasibility study and the problem space in which
an object recognition system would be applied. It is important that these discrepancies be
understood so that the knowledge gained in this work may be correctly interpreted. There
are two main differences between the experimental system used in the object recognition
feasibility study in the second half of this thesis and the Acoustical Imaging System.
First, the data acquisition of the two systems varies in terms of the spatial resolution of the acoustical pulses, the signal-to-noise ratio, the length and frequency content of the transmitted pulse, and the sampling period of the received signal.
Second, the experimental
system is manually set up so that the spatially and temporally interesting regions of the
imaging environment are recorded, but no such assurance may be made for the AIS.
The first difference mentioned above is caused by the use of the piston transducer instead of the Acoustical Imaging System's focused transducer array elements for data acquisition. The piston transducer produces an acoustical pulse that projects directly forward for a distance approximately equal to the surface area of the piston transducer divided by the wavelength of the pulse (πa²/λ). For the experimental set-up, with a piston transducer diameter of approximately 1.125" = 2.86 cm and a wavelength of 0.5 mm, the near field - far field transition point, or πa²/λ distance, occurs at approximately 125 cm. Therefore, over the distances used in the experimental set-up (typically approximately 20 cm to the target), the acoustical pulse from the piston transducer will maintain its circular shape [11]. In a surface area on the Transducer Hybrid Assembly equal to that of the piston transducer, there would be many acoustical elements. Indeed, the piston transducer head is a circle of approximately 1" in diameter, while the THA is approximately 2" by 2". Therefore, there would be approximately 3200 THA elements
contained within the surface area of the piston transducer. When imaging an object that
is in focus, each of these elements would project itself through the acoustical lens onto
the surface of the target and then back. Under these circumstances, the data gathered with
the Acoustical Imaging System would have much more spatial resolution than the
relatively coarse data that is now gathered with the experimental set-up. This higher level
of detail should prove valuable in object recognition, opening up a range of features that
were not available using just the piston transducer. As the imaged target falls out of
focus, the received image will blur. The signal received at any one THA element will
thus be the convolution of the signal that would have been received at each of the
elements in the neighborhood of the element, with the size of the neighborhood
increasing as the image becomes more out of focus. Even for severely distorted images,
however, the image will still contain a significant amount of information about the spatial
variations in the received signal that should be useful for object recognition.
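A short calculation reproduces the two figures quoted above. The 128 x 128 element count for the THA is an assumption made here for illustration (consistent with the 128 x 64 half-plane figure of section 4.8), as is the use of a 1" circle for the piston footprint:

    import math

    # Near-field / far-field transition of the piston: pi * a^2 / lambda.
    a = (1.125 * 0.0254) / 2                 # radius in meters (1.125" diameter)
    wavelength = 0.5e-3                      # meters
    print(math.pi * a**2 / wavelength)       # ~1.28 m, i.e. roughly 125 cm

    # THA elements within the piston footprint, assuming a 128 x 128 array
    # on the ~2" x 2" THA and a ~1" diameter piston face.
    elements = 128 * 128
    piston_area = math.pi * (1.0 / 2) ** 2   # square inches
    print(round(elements * piston_area / (2.0 * 2.0)))   # ~3200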
During all of the data collection with the piston transducer, the maximum output signal
voltage was kept at or below 2 V.
Because the baseline noise level in the piston
transducer and electronics used was slightly higher than 0.02 V, this implies that at any
one time, no more than 40 dB of dynamic range was used. The settings used in data
collection, however, had to be changed to get the same magnitude of signal from smaller
pipes as from larger pipes and plates. At the settings used for the smallest pipes, the
largest pipes and plates would have produced returns in the range of 8 V. Therefore, in
total, approximately 52 dB of dynamic range was used.
These figures compare very
favorably with the 60 dB of instantaneous dynamic range available to a detector element
in the AIS, and the considerably greater dynamic range that can be achieved by adjusting
the output signal level. Further, if one assumes that detector noise is uncorrelated across elements, then because there are many elements in the AIS, averaging of received values over intelligently selected regions of the detection array may further increase the SNR by a factor of √N, where N is the number of elements averaged. Such an approach involving the use of composite signals from a region of detector elements could significantly increase the dynamic range beyond the 60 dB figure previously quoted. For example, use of 100 elements could increase the dynamic range to 80 dB. Therefore, it appears that the AIS will have more than adequate dynamic range available to perform the type of object recognition tasks accomplished using data from the piston transducer.
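A minimal sketch of the averaging argument follows, with a synthetic numerical check (the 60 dB baseline is the figure quoted above; the noise level is an arbitrary illustrative value):

    import math
    import numpy as np

    # Averaging N elements with uncorrelated noise improves amplitude SNR by sqrt(N).
    for n in (1, 100):
        print(n, 60 + 20 * math.log10(math.sqrt(n)))   # 60 dB -> 80 dB at N = 100

    # Quick numerical check with synthetic noise.
    rng = np.random.default_rng(0)
    noise = 0.1 * rng.standard_normal((100, 100_000))  # 100 elements
    print(noise[0].std() / noise.mean(axis=0).std())   # ~10 = sqrt(100)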
Additionally, the length and frequency content of the transmitted pulse from the piston transducer and the THA differ. While the exact length of the pulse to be transmitted by the THA is not yet set, it is currently planned to be about three times longer in duration than the pulse used by the experimental system. This difference will have two effects: 1) the temporal resolution of the AIS will be lower than that of the experimental system, and 2) the transmitted pulse will have less bandwidth. Shown below in figure 5-5 is the frequency content of the recorded acoustical pulse from the piston transducer. To form this estimate of the frequency content, ten recordings of the acoustical pulse's reflection off of a thick (~15 cm), planar, stainless steel block were taken. From each recording, 5 µsec (150 samples) of the reflection was saved. The saved portions of each recording for the ten reflections were then appended together into a single file. The power spectral density of the appended file was then computed using Matlab's psd command with a 1024 point Fast Fourier Transform (FFT), a window size of 100, and an overlap of 80. In this implementation, the input signal is divided into overlapping sections, each of length window size. Each section is then windowed using a Hanning window and zero-padded to match the length of the FFT. The magnitude squared Discrete Fourier Transform (DFT) of each section is then computed, and these results are averaged to form the frequency content estimate [12].
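An equivalent estimate can be formed with Welch's method using the same parameters; the sketch below substitutes synthetic data for the ten saved reflection segments purely so that it runs:

    import numpy as np
    from scipy import signal

    fs = 30e6                                  # 30 MHz sampling clock
    rng = np.random.default_rng(0)
    segments = [rng.standard_normal(150) for _ in range(10)]   # stand-ins for the
    x = np.concatenate(segments)               # ten appended 150-sample reflections

    # Welch's method with the parameters described above: 100-sample Hanning
    # windows, 80-sample overlap, each section zero-padded to a 1024-point FFT.
    f, pxx = signal.welch(x, fs=fs, window='hann', nperseg=100,
                          noverlap=80, nfft=1024)
    pxx_db = 10 * np.log10(pxx / pxx.max())    # normalized dB scale as in figure 5-5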
[Figure: power spectral density (0 to -50 dB) versus frequency (0 to 6 MHz)]
Figure 5-5: Frequency content of the recorded piston transducer acoustical pulse
Shown in figure 5-6 is the receive transfer function of the bistatic AIS transducer. At 1 MHz from the central frequency, the receive transfer function has an attenuation of approximately 13 dB. However, the power spectral density of the recorded piston transducer acoustic pulse has only dropped by approximately 8 dB at 1 MHz from the central frequency. Further, the power spectral density of the recorded acoustic pulse from the piston transducer shown in figure 5-5 takes into account both the transmitted pulse shape and the receive transfer function. The receive transfer function shown in figure 5-6, however, has not been affected by the frequency content of the pulse to be transmitted. Therefore, the difference between the frequency content of a recorded pulse from the piston transducer and a recorded pulse from the bistatic AIS would be somewhat greater.
[Figure: PiezoCAD plot of the receive transfer function magnitude Vo/Pi (-5 to 45 dB V/MPa) and phase Theta (-180 to 180 degrees) versus frequency (2 to 4 MHz); cursor readout: 44.227 dB V/MPa / -93.050 degrees at 3.000 MHz]
Figure 5-6: PiezoCAD design report plot showing the expected frequency content of the transmitted
pulse from the bistatic AIS. In the graph above, the smooth curve near the top represents the magnitude
portion of the bistatic transducers' receive transfer function. The y-axis label of the plot above, Vo/Pi,
stands for voltage out relative to pressure in.
As can be seen by comparing the data presented in figures 5-5 and 5-6, the frequency domain characteristics of the AIS and the piston transducer are far from identical. However, they are qualitatively similar in that both have their greatest sensitivity at approximately 3 MHz. Therefore, while data gathered using the piston transducer would be inappropriate for predicting the expected return from an object imaged with the AIS, the same analysis techniques used to perform object recognition with piston transducer frequency data should be applicable to data gathered with the AIS. Because a narrower range of frequencies is present, however, the usefulness of frequency-domain classification characteristics is likely to be lower in data gathered by the AIS than by the piston transducer.
The final difference between the experimental set-up and the AIS due to the use of the piston transducer for this work has to do with the sampling period. The AIS quadrature samples its data. In the present application, the AIS samples data at four times the center frequency of the transmitted pulse, 3 MHz. Because it captures data at 12 MHz, the AIS will be capable of sampling data containing frequencies of up to 6 MHz before aliasing becomes an issue. Signals were captured from the piston transducer using an externally generated 30 MHz clock. At this data acquisition rate, the experimental system was capable of capturing data at frequencies of up to 15 MHz before aliasing is introduced. Based upon the examination of the power spectral densities (PSDs) of many captured signals, almost all of the frequency content of the returned signals is below 5 MHz, with the frequency content at 5 MHz generally being reduced by a factor of 30 to 100 relative to the content at the center frequencies in the 3 - 3.5 MHz range. Additionally, while there are on occasion some low level features of interest in the 6-10 MHz range of the PSDs, the vast majority of interesting features are located below 6 MHz. Finally, essentially all of the frequency content above 10 MHz is noise. Therefore, although the AIS is not capable of sampling the data at the higher rate that was used in the experimental set-up, it appears that aliasing of the returned signal is not a major issue. Furthermore, since the vast majority of the interesting frequency domain features are in the range below 6 MHz, the frequency domain-based recognition power of the AIS should not be adversely affected by its lower sampling rate.
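The sampling-rate comparison reduces to the following (sampling rates as quoted above):

    ais_fs = 4 * 3e6      # AIS: quadrature sampling at 4x the 3 MHz center frequency
    exp_fs = 30e6         # experimental system: external 30 MHz clock
    print(ais_fs / 2)     # 6.0 MHz usable before aliasing
    print(exp_fs / 2)     # 15.0 MHz usable before aliasing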
The second class of differences between the experimental set-up and the AIS arose
because the experimental set-up is specifically designed to capture interesting signals.
Before a piece of data is acquired, the target and the transducer are positioned so that a
strong signal will be received. Further, due to file size constraints, only those portions of
the acquired signal containing the reflections of interest are stored for later analysis. With
the AIS, however, there is no such guarantee that the signals recorded will be of use in
object recognition.
Obviously, the system operator must first encounter and image a
target. Further, within the volume of data acquired, only a small subset will be useful in
object recognition. A mechanism must be put in place, whether automated or operator
controlled, to allow the system to identify such regions of interest. This topic will be
addressed further in chapter 7.
5.5 Conclusion
Such differences between the Acoustical Imaging System design and the experimental
system as the AIS's longer acoustic pulse and narrower frequency band will make the
object recognition task more complicated for the AIS than the experimental system. On
the whole, however, the type of data available from the two systems is very similar: low
megahertz underwater acoustic data from generally cylindrical targets that are very large
compared to the wavelength. Further, the AIS possesses more dynamic range than was used by the experimental system and has an ensemble of transducers instead of just one. These factors should serve to make the object recognition task less complicated for the AIS than for the experimental system. Moreover, because it is still in the design stage, future changes may be made to the AIS. Indeed, demonstration of a viable object recognition algorithm
may be made to the AIS. Indeed, demonstration of a viable object recognition algorithm
using information like that available from the experimental system may provide
motivation for such changes. Therefore, development of a prototype object recognition
algorithm to work with the data available from the experimental system is deemed to be a
worthwhile endeavor.
6 Object recognition feasibility study
6.1 Introduction
To determine the feasibility of performing object recognition with the AIS, this chapter presents a prototypical object recognition system built for an experimental system that has access to data similar to that of the AIS. While a great deal of work has been devoted to object
recognition in similar systems such as synthetic aperture radar and sonar that also possess
an inherent noisiness or speckle, no work has been done on the development of object
recognition algorithms for a low megahertz acoustical imager [13, 14].
This chapter begins by presenting the target orientation dependence found in the
experimental data. This viewpoint dependence may or may not be present in data from
the AIS.
The features used during object recognition and the rationale behind their
selection are then presented. In general, it is found that no single feature is effective for
classifying all of the test objects, so many features must be developed. Next, the object
recognition prototype that was developed to work with the data gathered by the
experimental system is presented. Finally, the object recognition system's performance is
presented in a variety of situations. The performance of the system is analyzed as noise
and viewpoint uncertainty are added to the data. Further, the performance of the object
recognition prototype is explored under an open problem space.
In all cases, the
performance of the system is found to be quite good.
6.2 Dependence of acquired data on viewpoint
Early in data collection, it was noted that specular reflections from a target could appear
qualitatively different. In particular, it was observed that the return signal was sensitive
to translations of the target with respect to the transducer in the vertical direction. Said
another way, as the target's location was adjusted up or down with the cranking system,
the observed reflection kept approximately the same amount of power but varied
qualitatively.
These variations were symmetric about the centerline of the pipe and
manifested themselves through strengthening and weakening of the different peak
amplitudes in the reflection. Furthermore, minor variations in the reflected signal were
observed as the angle of the transducer was varied in the plane containing the target
surface normal and long axis.
No variations, however, were noted as the target was
translated toward or away from the transducer within the transducer's near field.
Following up on the observations noted above, measurements of the reflections from four
targets were taken as a function of vertical translation distance of the transducer from the
target centerline and rotational deviation from perpendicular of the transducer in the plane
containing the target surface normal and long axis. The four targets selected are shown
below in table 6-1. These objects were selected because they covered a range of materials,
diameters, and wall thicknesses. The recorded time series data and an estimate of the frequency content of a region including the first reflection are shown for object #9 and object #19 in figures 6-1 through 6-4. The region from which the frequency content estimate was formed started approximately 1 µsec prior to the reflection start and had roughly 17 µsec duration. The data generated for object #27 and object #31 are very similar in nature to those presented.
Object Number    Material     Outer Diameter    Wall Thickness
9                Copper       41.10 mm          1.08 mm
19               Brass        35.64 mm          1.71 mm
27               Aluminum     31.71 mm          2.05 mm
31               Aluminum     50.80 mm          6.40 mm
Table 6-1: Listing of the four targets for which recordings were taken over a course of precise
translational and rotational variations.
[Figure: rows of paired plots; left, backscatter time series (4 µsec span); right, power spectral density (0 to 6 MHz)]
Figure 6-1: Acoustical backscatter recordings from object #9 over a series of vertical translations of
the object. Object #9 is a copper pipe with 41.10 mm diameter and 1.08 mm shell thickness. In the left
column is the time series data, which has a range of 4 usec or 120 samples. In the right column is the
power spectral density computed from the time series data, which covers the range of 0 to 6 MHz. The first
row of plots corresponds to data taken with the transducer located at a height equal to that of the target
centerline. Between each row of plots, the target centerline is translated by approximately 2 mm relative to
the transducer. Due to symmetry of the pipe, the results are the same for either an upward or downward
translation. Note that the shape of the power spectral density has changed considerably by the time the
translation has reached 4 mm. The strength of the time series, however, does not decay significantly until
approximately 8 mm of translation.
[Figure: rows of paired plots; left, backscatter time series (4 µsec span); right, power spectral density (0 to 6 MHz)]
Figure 6-2: Acoustical backscatter recordings from object #9 over a series of horizontal rotational
displacements of the transducer. Object #9 is a copper pipe with 41.10 mm diameter and 1.08 mm shell
thickness. In the left column is the time series data, which has a range of 4 usec or 120 samples. In the
right column is the power spectral density computed from the time series data, which covers the range of 0
to 6 MHz. The first row of plots corresponds to data taken with the surface normal of the transducer
parallel to the surface normal of the target at its centerline. Between each row of plots, the transducer is
rotated by approximately 0.2 degrees relative to the target. Due to symmetry of the pipe, the results are the
same for either clockwise or counterclockwise rotation. Note that the strength of the time series begins to
drop very quickly with a considerable decrease in signal strength occurring by a rotation of 0.4 degrees.
The form of the power spectral density, however, exhibits a high degree of consistency throughout the
rotations.
[Figure: rows of paired plots; left, backscatter time series (4 µsec span); right, power spectral density (0 to 6 MHz)]
Figure 6-3: Acoustical backscatter recordings from object #19 over a series of vertical translations of
the object. Object #19 is a brass pipe with 35.06 mm diameter and 1.71 mm shell thickness. In the left
column is the time series data, which has a range of 4 usec or 120 samples. In the right column is the
power spectral density computed from the time series data, which covers the range of 0 to 6 MHz. The first
row of plots corresponds to data taken with the transducer located at a height equal to that of the target
centerline. Between each row of plots, the target centerline is translated by approximately 2 mm relative to
the transducer. Due to symmetry of the pipe, the results are the same for either an upward or downward
translation. Note that the shape of the power spectral density has changed considerably by the time the
translation has reached 4 mm. The strength of the time series, however, does not decay significantly until
approximately 8 mm of translation.
[Figure: rows of paired plots; left, backscatter time series (4 µsec span); right, power spectral density (0 to 6 MHz)]
Figure 6-4: Acoustical backscatter recordings from object #19 over a series of horizontal rotational
displacements of the transducer. Object #19 is a brass pipe with 35.06 mm diameter and 1.71 mm shell
thickness. In the left column is the time series data, which has a range of 4 usec or 120 samples. In the
right column is the power spectral density computed from the time series data, which covers the range of 0
to 6 MHz. The first row of plots corresponds to data taken with the surface normal of the transducer
parallel to the surface normal of the target at its centerline. Between each row of plots, the transducer is
rotated by approximately 0.2 degrees relative to the target. Due to symmetry of the pipe, the results are the
same for either clockwise or counterclockwise rotation. Note that the strength of the time series begins to
drop very quickly with a considerable decrease in signal strength occurring by a rotation of 0.4 degrees.
The form of the power spectral density, however, exhibits a high degree of consistency throughout the
rotations.
As can be seen in figures 6-1 through 6-4, the data for both objects shows roughly the
same pattern. Significant drops in the reflection power do not occur until approximately
8.0 - 10.0 mm of vertical translation has been introduced. However, within 3.0 mm of
vertical offset, the time series has altered in a qualitatively noticeable fashion. Further,
within 3.0 mm of movement, the frequency content of the signal has altered significantly,
with peaks and troughs in the spectrum appearing and disappearing. This finding
suggests that for object classification, especially if the frequency content of the returned
signal is to be used during the identification process, it may not be enough to simply
ensure that a specular reflection is used; it may be necessary to ensure that the
reflections are recorded from consistent vertical positions. The degree to which this
assertion is true depends upon the source of the translational variations; the nature of the
source is speculated on later in this section.
The data in figures 6-1 through 6-4 suggest that angular variations do not follow the same
pattern as vertical translations. In particular, signal strength generally declined noticeably
within 0.4° of rotation from normal. However, the reflection time series do not appear to
change qualitatively until approximately 0.8 - 1.0° of rotation. Similarly, the frequency
content of the backscattered acoustical pulse maintains much greater similarity with the
normal frequency content even as the signal strength decays. This finding suggests that,
unlike with vertical translations, any strong reflection will have the proper angular
alignment for object classification.
It should be noted that the reflections in figures 6-1 and 6-3 appear qualitatively different
from those in 6-2 and 6-4. The mechanical apparatus that was used to achieve the precise
angular orientations was quite large. Therefore, its use necessitated that the transducer be
located far enough away from the target that it was close to the acoustical near-field/far-field
transition point discussed earlier in section 5.5.5. This target location is the cause
of the dissimilarity between the rotational and translational data. While it was impossible
to take precise angular measurements in the near-field without the mechanical apparatus,
near-field backscatter behavior was observed throughout a range of rotations.
These
observations indicated the same pattern as shown in figures 6-2 and 6-4: the amplitude of
the received signal dropped off significantly faster than the frequency content of the
signal changed.
To provide a physical explanation for the observed phenomena, it is postulated that the
transfer function that the acoustical signals encounter is highly dependent upon the vertical
location of incidence. Therefore, as the target is translated vertically with respect to the
piston transducer, the average transfer function seen at the face of the piston transducer
changes relatively quickly as the symmetrical appearance of the target is lost. This effect
is illustrated in figure 6-5.
[Figure 6-5 diagram: two side-by-side scenarios, with the annotation "Move target down" between them; see the caption below.]
Figure 6-5: Illustration of the changing appearance of the target with respect to the transducer as the
target is translated vertically. The figure above depicts two scenarios. At left, the target's centerline and
the piston transducer's centerline are located at the same height. The acoustical pulse, represented as the
dotted rays in the figure above, strikes the surface of the target and is reflected back to the piston
transducer. As can be seen, the distribution of incident signal on the piston transducer will be symmetric.
At right, the target has been translated downward by a small amount. In this case, the distribution of the
acoustical signal incident upon the piston transducer is no longer symmetric. This change in the nature of
the incident signal is hypothesized to account for the rapid variation in the observed transfer function as
vertical translations occur.
The transfer function, however, does not appear to be highly sensitive to perturbations
with respect to angle of incidence in the horizontal direction. If, as conjectured in the
previous section, the alteration of the observed transfer function is due to the loss of
symmetry as seen at the face of the piston transducer, this lower sensitivity to rotational
movements would be expected. Because rotational movements keep the centerline of the
piston transducer in the same plane as the centerline of the target, symmetry is maintained
throughout.
Finally, it must also be recognized that the effects noted may be due solely to spatial
variations in the piston transducer beam pattern. The transmission characteristics of a
piston transducer vary spatially, with peaks and nulls in the transmission pattern
appearing and disappearing in a complex manner. Because the acoustical pulse used had
a short duration, it was broadband in frequency content. The broadband nature of the
pulse tends to smear out the peaks and nulls of the transmission pattern. A complicated,
frequency-related pattern, however, still exists.
This pattern could induce orientation-specific alterations in the backscattered signal [15, 8].
Based upon the above findings, the vertical position of the transducer was closely
monitored during the recording of the target data. As mentioned previously, because of
the symmetry possessed by the pipes and cylinders, the reflection alterations produced by
a vertical translation were symmetric with respect to the pipe's centerline. By searching
for this point of symmetry in the reflection variations, the transducer was centered on the
pipe. During the course of the recordings, the transducer's vertical position was varied
within approximately ± 1.5 mm of this centerline. This variation was introduced in an
attempt to make the data gathered in this study more relevant to the AIS being developed
by Lockheed Martin: because vertical positions cannot be controlled as precisely in a
real-world environment as in the lab, the variation reflects that real-world uncertainty.
However, as shown in the data in figure 6-1 above, this degree of variation produces only
minor effects on the received reflection, and therefore should not hinder OR system
development so much as to make the task impossible.
To more
precisely explore the importance of vertical position uncertainty in the data, a second set
of data was also taken for each target. In this data set, the vertical position was varied
over a range of approximately ± 12 mm. While the OR system was developed using the
data for which the vertical position was more precisely controlled, the system was also
evaluated using the less precise data. The results of this exercise are presented in section
6.5.3. Further, in chapter 7, a discussion of how data with a precise vertical position may
be obtained by a real-world system is presented, along with a discussion of how precise
such an automatic positioning may be expected to be.
As previously stated in section 5.4.3, the rotational position of the transducer was not
monitored closely prior to acoustical recordings.
In between each recording, the
transducer head was swiveled so that the acoustic signal would disappear. The transducer
head was then reset, taking care only to ensure that a reflection within about two-thirds of
the maximum was received.
6.3 Object recognition feature selection
Early in the OR system development process, it was recognized that a universal set of
features which could be extracted once and allow precise determination of the identity of
the target was beyond reach. The reason for this is simple: the objects to be identified
were often very different. As a result, features that were quite significant in the identification
of some objects were not only worthless but also detrimental to the identification of others.
Therefore, the objects were broken into classes, and for each class a family of attributes
was developed that allowed the members of the class to be discerned from the other
targets.
[Figure 6-6 plots: recorded data for object #29, a thick-shelled aluminum pipe (top); object #26, a thin-shelled aluminum pipe (middle); and object #35, a solid aluminum cylinder (bottom). Horizontal axes: time (sec).]
Figure 6-6: An example of a thick-shelled object (top), a thin-shelled object (middle), and a
cylindrical object (bottom). Each of the above three recordings is typical of its class. Comparison of the
top two recordings quickly establishes the essential difference between thick- and thin-shelled objects:
whereas individual reflections may be identified for thick-shelled objects, only complex reflection regions
may be identified for thin-shelled objects. Cylindrical objects were assigned to their own class because of
the relative sparseness of their data sets and because of the lack of the same structural features as exhibited
by pipes.
Three classes of objects were used: thick-shelled objects, thin-shelled objects, and solid
cylinders. Figure 6-6 shows an example recording for each of the three classes. An
object was considered to be a thick-shelled object if its walls were thick enough to
produce distinct reflections. Because the acoustical pulse used for the experiments was
approximately 1.4 µsec in duration, for a target to be considered thick-shelled, the wall of
the target had to be wide enough that sound required at least 1.4 µsec to traverse twice its
width. Further, because sound travels at different speeds in different materials, the
distinction between a thick and a thin shell depends not only on the shell thickness but
also on the material composing the shell. For the materials used in the experiments,
table 6-2 lists the thickness required to be considered a thick-shelled object. Sixteen
objects were assigned to the thin-shelled class, fifteen objects to the thick-shelled class,
and eight objects to the solid cylinder class. Note that the two metal plates were both
assigned to the thick-shelled class.
Material  | Speed of sound | Minimum thick-shell wall thickness
Aluminum  | 6.40 mm/µsec   | 4.48 mm
Brass     | 4.42 mm/µsec   | 3.09 mm
Copper    | 5.01 mm/µsec   | 3.51 mm
PVC       | 2.38 mm/µsec   | 1.67 mm

Table 6-2: Pipe wall thickness required to be considered a thick-shelled pipe for various materials
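The thresholds in table 6-2 follow directly from the pulse duration: a wall counts as thick when the round trip through it takes at least the 1.4 µsec pulse length, i.e. thickness >= speed x 1.4 / 2. A minimal Python sketch of this arithmetic (illustrative only; the material names and speeds are taken from table 6-2):

    # Minimum wall thickness for distinct reflections: the round trip through
    # the wall (2 * thickness / speed) must be at least the pulse duration.
    PULSE_DURATION_USEC = 1.4

    SPEED_MM_PER_USEC = {"Aluminum": 6.40, "Brass": 4.42, "Copper": 5.01, "PVC": 2.38}

    for material, speed in SPEED_MM_PER_USEC.items():
        min_thickness_mm = speed * PULSE_DURATION_USEC / 2
        print(f"{material:9s} {min_thickness_mm:.2f} mm")   # reproduces table 6-2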
For each class of objects, a family of classification features was developed to exploit the
material and structural properties of the target. Further, estimates of an object's transfer
function, which depends upon the material and structural properties of the object, were
also used in differing ways by each class of objects.
72
6.3.1 Material-based classification features
6.3.1.1 Thick-shelled targets
Just as Snell's Law can be used to determine the direction of propagation of scattered
waves at material interfaces, the acoustical Fresnel equations describe the relation
between the amplitude of scattered waves and the amplitude of the incident waves. The
acoustical Fresnel equations allow the determination of the reflection coefficient and the
transmission coefficient for a wave incident upon an interface. These coefficients, in
turn, can be used to determine the amplitude of the reflected and transmitted portions of
the scattered wave, in terms of the amplitude of the incident wave. In general, these
coefficients are complicated functions of the type of wave (longitudinal or shear) that is
incident, the types of waves produced, the angle of incidence, and the materials
composing the interface [16].
For the Acoustical Imaging System, all the imaged objects will reside in water. Since
liquids support only longitudinal and not shear waves, the acoustical situation is thus
somewhat simplified. Therefore, all incident waves are longitudinal. While shear wave
propagation may occur within the solid as the result of scattered acoustical energy from a
non-perpendicular wave incidence, all waves recorded at the transducer will also be
longitudinal. Further, because firm restrictions have been placed upon the orientation of
the transducer and the target for experimental data acquisition as discussed in section 6.2,
there is only small variation in terms of the angle of incidence of the acoustical waves on
the target. While small variations in angle of incidence can cause wild variations in the
reflection and transmission coefficients, the acoustical picture is straightforward enough
that it may be possible to extract a simple statistic from the reflected time series that will
shed light on the material of the target [16].
Figure 6-7 shows example recordings from PVC, aluminum and brass pipes. As can
readily be seen, the ringing patterns observed for these pipes are quite distinct. The
ringing from the PVC pipe dies very quickly. The ringing from the aluminum pipe is
much more persistent. The brass pipe, however, maintains the most energy in its ringing
over time.
After examining thick-shelled material time series like those shown above, a statistic was
found through trial and error that could serve as an excellent predictor of material from
recorded data. The derived statistic is the ratio of the maximum absolute value of the 3rd
reflection convolved with a matched filter to the maximum absolute value of the 2nd
reflection convolved with the same matched filter. Shown below in figure 6-8 is a
schematic illustration of the method by which the material determination statistic was
computed.
[Figure 6-7 plots: time series data about the first reflection for object #5, a PVC pipe (top); object #33, an aluminum pipe (middle); and object #22, a brass pipe (bottom). Horizontal axes: time (sec).]
Figure 6-7: Example recordings from thick-shelled PVC, aluminum, and brass pipes. The three
figures above illustrate two important points. The first point is that the ringing from brass appears to be
more persistent than that from either aluminum or PVC. Likewise, aluminum appears to ring longer than
PVC. The second point is that the relationship between the first reflection (which is from the front edge of
the front wall of the pipe) and the second reflection (which is from the back edge of the front wall of the
pipe) is complicated. Simply because an object exhibits a strong second reflection does not mean that it
will ring well. Furthermore, examination of the data indicates that there is a high degree of variability in
the relationship between these two reflections, even within the recordings for a single object. The
relationships between the second, third, and subsequent reflections (as long as they are still reflections from
the back edge of the pipe's front wall) are much simpler and more consistent. Therefore, the ability of a
pipe to ring is best characterized by examining these later reflections.
[Figure 6-8 flow diagram, summarized: Data → convolution with matched filter → find the maximum absolute value of the convolution (this is the 1st reflection) → find the maximum absolute value of the convolution between 1 and 20 µsec after the first reflection (for thick-shelled objects, this value corresponds to the 1st reflection from the back edge of the pipe's front wall, 2nd overall) → measure the time from the 1st to the 2nd reflection → find the maximum absolute value of the convolution over a 2 µsec window whose center is located as far after the 2nd reflection as the 2nd is located after the 1st (for thick-shelled objects, this value corresponds to the 2nd reflection from the back edge of the pipe's front wall, 3rd overall).]
Figure 6-8: A schematic illustration of the method by which the material determination
statistic is computed.
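For illustration, the computation in figure 6-8 could be sketched in Python as follows (not the thesis implementation; a 30 MHz sampling rate is an assumption, and all names are hypothetical):

    import numpy as np

    FS_MHZ = 30.0  # assumed sampling rate: 30 samples per microsecond

    def usec_to_samples(t_usec):
        return int(round(t_usec * FS_MHZ))

    def material_statistic(recording, pulse):
        """Ratio of the 3rd to the 2nd reflection peak in matched-filtered data."""
        # Matched filter: convolve with a time-reversed copy of the pulse.
        mag = np.abs(np.convolve(recording, pulse[::-1], mode="same"))

        # 1st reflection: global maximum of the matched filter output.
        r1 = int(np.argmax(mag))

        # 2nd reflection: maximum between 1 and 20 usec after the 1st.
        lo, hi = r1 + usec_to_samples(1.0), r1 + usec_to_samples(20.0)
        r2 = lo + int(np.argmax(mag[lo:hi]))

        # 3rd reflection: maximum in a 2 usec window centered as far after
        # the 2nd reflection as the 2nd is located after the 1st.
        center = r2 + (r2 - r1)
        half = usec_to_samples(1.0)
        r3 = center - half + int(np.argmax(mag[center - half:center + half]))

        return mag[r3] / mag[r2]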
Unfortunately, the thick-shelled targets were composed of only the three materials shown
above. While the number of materials was quite limited, each of these three produced
very different results for the test statistic. As will be discussed later, the thirty data
recordings for each of the fifteen thick-shelled objects were divided into training and test
sets, with fifteen recordings assigned to each set. With the exception of one outlier from
the aluminum recordings, all of the ratios for the training data fell into non-overlapping
segments. Table 6-3 lists the maximum and minimum mean ratios for each material,
along with the standard deviation of the ratio values for the object that produced this
maximum or minimum.
Material  | Maximum mean ratio | Std. deviation (max) | Object producing maximum | Minimum mean ratio | Std. deviation (min) | Object producing minimum
PVC       | 0.0686             | 0.0056               | 7                        | 0.0440             | 0.0028               | 1
Aluminum  | 0.6819             | 0.0237               | 37                       | 0.5130             | 0.0120               | 34
Brass     | 0.8307             | 0.0164               | 25                       | 0.7594             | 0.0125               | 20

Table 6-3: 3rd to 2nd reflection ratio statistics for the training data examined.
The variations in the reflection ratio data are most likely due to unaccounted-for
angle-of-incidence and shear wave effects. Because of the short distances over which the
acoustical propagation is occurring, these differences are most likely not due to loss
during propagation. This assertion is borne out by the data. While for PVC the object
producing the maximum ratio is the thinnest and the object producing the minimum ratio
is the thickest, this pattern does not hold for aluminum and brass. For example, object
#32, an aluminum pipe which at 24.78 mm thick is by far the thickest aluminum pipe, has
the third greatest 3rd to 2nd reflection ratio of the six aluminum pipes. While the
variations are significant, the data definitely indicates that there is a good deal of
information that can be gleaned from data of this type. Further, it appears that a major
source of variation in reflection ratios is the shape of the object. While more data is
necessary to confirm this suspicion, the objects with the greatest reflection ratio for both
aluminum and brass are the flat plates. Indeed, for aluminum, the deviation between the
reflection ratio of the flat plate and the second greatest reflection ratio is approximately
three times as great as the variation amongst the aluminum pipes. Because of the clear
separation amongst the ratio ranges of the materials, in the experimental system the 3rd
to 2nd reflection ratio data could be translated directly into a material assignment. For more
complex situations in which there are not clearly separable material ratio ranges, such
information could be used to restrict the possible materials to be considered.
6.3.1.2 Thin-shelled targets
Because of the complex nature of the reflections encountered with thin-shelled targets, no
simple and general statistic could be found that would produce an estimate of the target's
material composition. While it was noted that the ringing from copper pipes appeared to
be more persistent than that from brass pipes, which in turn appeared to ring longer than
aluminum pipes, these observations were difficult to quantify. Moreover, translation of
this insight into a computer program proved infeasible. Therefore, a composite statistic
was developed that is dependent upon both the material and structural properties of the
target. This statistic will be discussed later in section 6.3.2.2, which covers structural
classification criteria for thin-shelled targets.
6.3.1.3 Cylindrical targets
While there certainly is clear separation between the reflections produced by the
cylindrical targets, data to allow material determination using a technique similar to that
used for the thick-shelled objects was not available. Because of memory constraints, the
recorded data was occasionally truncated prior to the second reflection from the back wall
of the cylinder. Further, by the time that it became apparent that the ratio of the 3rd and
2nd reflections (which correspond to the 2nd and 1st reflections from the cylinder back
wall) could be useful in material determination, the experimental set-up was no longer
available. Additionally, as clearly evidenced in figure 6-11 shown later, the acoustical
environment for solid cylinders can be relatively complicated.
Many reflections are
apparent in addition to longitudinal wave reflections from the various interfaces. Sources
for these reflections include shear waves, creeping waves, and surface waves [17].
Therefore, no material information could be extracted from the data recorded on the
cylinders. Finally, it should be noted that some of these more complex features have been
observed in the thick-shelled pipes. They are much less prominent, however, and do not
complicate the identification of structural features to a high degree.
6.3.2 Structure-based classification features
6.3.2.1 Thick-shelled targets
For thick-shelled targets, two structural features can be easily extracted. They are the
pipe shell thickness and the inner diameter of the pipe. Both could be measured to a
precision on the order of 0.1 mm and were very useful in target identification. Figure 6-9
exhibits which features in a typical thick-shelled data recording correspond to these
structural features.
Figure 6-9: Demonstration of which features in a typical thick-shelled recording correspond to the
structure-based features of shell thickness and pipe inner diameter. The data shown above was
recorded from object #34, an aluminum pipe with a shell thickness of 12.75 mm and an inner diameter of
50.81 mm. As illustrated in the figure, a thick-shelled pipe's wall thickness may be determined by
examining how long after the first reflection (which corresponds to the front edge of the front wall of the
pipe) the second reflection (which corresponds to the back edge of the front wall of the pipe) occurs. The
inner diameter of the pipe may be determined by locating how long after the reflection from the back edge
of the front wall the reflection from the front edge of the back wall occurs. While the algorithm that
extracts these features uses a matched filter as explained in this section, the data shown above has not yet
undergone this process.
The first step in measuring the thickness of the pipe wall is to convolve the data with a
matched filter (which is a time-reversed version of the strongest portion of the transmitted
pulse). The start of the pipe's front wall reflection is then found by searching for the
maximum absolute value of the convolved data. Because it is assumed during the
application of this criterion that the data originated from a thick-shelled pipe, it is known
that the 1st back wall reflection must occur at least 1.4 µsec after the front wall reflection.
Further, because it is known that the maximum thickness of any of the thick-shelled pipes
is 24.78 mm and occurs for target #32, an aluminum pipe, the 1st back wall reflection
must occur within 6 µsec of the front wall reflection. Therefore, the convolved data
between 1.4 µsec and 6.0 µsec after the front wall reflection is searched for the maximum
absolute value. The location of this local maximum is considered to be the location of the
1st back wall reflection. Dividing the number of samples between the front wall
reflection and the 1st back wall reflection by the sampling rate and then multiplying this
value by the speed of sound in the material yields the wall thickness.
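A compact sketch of this thickness measurement, under the same assumptions as the previous listing (assumed 30 MHz sampling; hypothetical names), might look like:

    import numpy as np

    FS_HZ = 30e6  # assumed sampling rate

    def wall_thickness_mm(recording, pulse, speed_mm_per_usec):
        """Thick-shelled pipe wall thickness from the front/back wall echo spacing."""
        mag = np.abs(np.convolve(recording, pulse[::-1], mode="same"))
        front = int(np.argmax(mag))                 # front wall reflection
        lo = front + int(1.4e-6 * FS_HZ)            # search 1.4 - 6.0 usec later
        hi = front + int(6.0e-6 * FS_HZ)
        back = lo + int(np.argmax(mag[lo:hi]))      # 1st back wall reflection
        delay_usec = (back - front) / FS_HZ * 1e6
        return delay_usec * speed_mm_per_usec       # conversion as given in the text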
A similar procedure is used to find the inner diameter of the pipe. This task is slightly
complicated by the fact that the phase information available through the experimental
system is somewhat difficult to use. This difficulty in the phase recovery arose because
the sampling times are dictated by an external clock that has no relation to the pulse's
central frequency. Therefore, phase information was deemed not to have a high enough
value to merit the level of effort required to recover it. Although it was possible to work
around the lack of phase information, inner diameter determination
should be
considerably easier for data acquired through the AIS. With the AIS, phase data can be
quickly computed from the quadrature samples. This phase data could then be used to
tell if a reflection is from an interface in which sound is passing from a material of higher
to lower acoustical impedance or lower to higher acoustical impedance.
With such
information, reflections from the back edge of the front wall of the pipe could quickly be
identified and ignored during the search for the first reflection from the front edge of the
back wall of the pipe.
To find the inner diameter of the pipe, the front reflection is again located. Because of
knowledge of the pipes in the problem space, it is known that no back wall reflections
will occur within 15 µsec of the front reflection. Therefore, the matched filtered data
from 15 µsec after the front reflection until the end of the recording is examined to find
the location with the largest absolute value. Ignoring the first 15 µsec after the front
reflection serves two purposes: 1) by ignoring data that could not possibly contain the
back wall reflection, the amount of work to be performed is decreased, and 2) during
the 15 µsec after the front reflection, the reflections from the back side of the front wall
have time to die down to a level at which the front reflection of the pipe's back wall
should be larger.
To ensure that the reflection of the front edge of the back wall has been found instead of
just more ringing from the front wall, data preceding the presently located data by a
period of time equal to the propagation time through a pipe wall is examined. If the
strength of this earlier data matches the strength of the data just located according to the
material's reflection ratio constraint, then the currently identified reflection is simply
ringing from the front wall.
If no such relationship exists, however, the currently
identified location is determined to be the location of the back wall. Using the location of
the back wall and the previously identified position of the back edge of the front wall, the
inner diameter of the pipe is calculated. If no back wall data can be identified, the present
object is assumed to be a plate.
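The back wall search and the ringing check could be sketched along the same lines (hypothetical names; the water sound speed and the round-trip conversion are assumptions, as the text does not give the conversion explicitly):

    import numpy as np

    FS_HZ = 30e6                     # assumed sampling rate
    WATER_SPEED_MM_PER_USEC = 1.48   # assumed speed of sound in water

    def inner_diameter_mm(mag, front, wall_delay, ratio_lo, ratio_hi):
        """mag: |matched-filtered data|; front: index of the back edge of the
        front wall reflection; wall_delay: propagation time through the pipe
        wall in samples; ratio_lo/ratio_hi: the material's reflection ratio range."""
        start = front + int(15e-6 * FS_HZ)   # no back wall echo within 15 usec
        cand = start + int(np.argmax(mag[start:]))
        # Compare the candidate with the data one wall propagation time earlier;
        # if the two match the material's reflection ratio, the candidate is
        # just residual ringing from the front wall.
        if ratio_lo <= mag[cand] / mag[cand - wall_delay] <= ratio_hi:
            return None   # no back wall found; the object is assumed to be a plate
        delay_usec = (cand - front) / FS_HZ * 1e6
        # Assumed conversion: the echo delay is a round trip across the bore.
        return delay_usec * WATER_SPEED_MM_PER_USEC / 2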
6.3.2.2 Thin-shelled targets
As stated previously in section 6.3.1.2, because of the complexity of the backscattered
signal from a thin-shelled target, formation of an estimate of the object's material is
essentially impossible. For the same reason, precise determination of the time at which
the 1st reflection from the back surface of the pipe's front wall occurs is quite difficult.
While a sophisticated technique involving deconvolution may be able to determine this
time, no success was achieved during the work on this thesis. The ringing pattern
following the first reflection, however, is highly variable across pipes, yet essentially
invariant across recordings for a particular pipe. Therefore, it was felt that a criterion that
captures information about this ringing would be helpful in discriminating amongst the
thin-shelled objects.
Examination of the thin-shelled target reflection data indicated that for all objects, ringing
persisted at a level greater than the background noise for at least 6 µsec. Further, during
the first 2 µsec after the start of the first reflection, the backscattered signals all appeared
about the same. Therefore, the data in the range 2 - 6 µsec following the first reflection
is used to characterize the ringing.
The first step in the characterization process is to convolve the reflection data with a
matched filter. By finding the maximum absolute value in the resulting data, the location
of the first reflection is found. The convolved data is then broken up into four sections:
2 - 3 µsec, 3 - 4 µsec, 4 - 5 µsec, and 5 - 6 µsec. The sum of the squares of the convolved
values is computed for each region (these will be referred to as SSQ1, SSQ2, SSQ3, and
SSQ4 respectively). A set of ratios of the sum of squared values is then used as a
criterion that captures effects caused by the pipe material and wall thickness. Six
independent ratios of two regions' sum of squares values can be formed from the four
regions listed above. They are SSQ1:SSQ2, SSQ1:SSQ3, SSQ1:SSQ4, SSQ2:SSQ3,
SSQ2:SSQ4, and SSQ3:SSQ4. From these six ratios, a total of sixty-four sets of ratios
could be formed, spanning from the empty set to the set that includes all six ratios.
To use a ratio set for discrimination, a template matching score is produced, as will be
done later with the object recognition prototype. First, the value of each ratio in the set is
computed for an unknown target. These values are then compared against the mean and
standard deviation values stored for each ratio in a template formed from the training
data. These comparisons are used to produce the average number of standard deviations
by which each ratio differs from the template mean. This number of standard deviations
is the template matching score.
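A minimal sketch of the ratio computation and the resulting template matching score (hypothetical names; 30 MHz sampling assumed; the three-ratio set shown is the one selected later in this section):

    import numpy as np

    FS_MHZ = 30.0  # assumed samples per microsecond

    def ssq_ratios(recording, pulse):
        """Sum-of-squares ratios over the 2-3, 3-4, 4-5, and 5-6 usec windows."""
        filtered = np.convolve(recording, pulse[::-1], mode="same")
        first = int(np.argmax(np.abs(filtered)))
        ssq = [np.sum(filtered[first + int(t0 * FS_MHZ):
                               first + int(t1 * FS_MHZ)] ** 2)
               for t0, t1 in [(2, 3), (3, 4), (4, 5), (5, 6)]]
        # SSQ1/SSQ2, SSQ2/SSQ4, SSQ3/SSQ4
        return np.array([ssq[0] / ssq[1], ssq[1] / ssq[3], ssq[2] / ssq[3]])

    def template_score(ratios, template_means, template_stds):
        """Average number of standard deviations from the template means."""
        return float(np.mean(np.abs(ratios - template_means) / template_stds))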
Each of these sixty-four ratio sets was evaluated based upon its discriminatory power.
The measure of discriminatory power used was the area under the receiver operating
curve (ROC). A receiver operating curve plots the probability of a false alarm versus the
probability of a detection for a particular object over all possible detection thresholds for
a particular discriminatory test. The area under the ROC then gives a measure of the
discriminatory power of a test for a particular object: the greater the area under the ROC
(up to a maximum of one), the greater the power of the test. For a particular test, an ROC
could be drawn for each object. To create the ROC for an object, a ratio set template was
formed from the training data for that object. A template matching score was then
computed for every piece of training data. By varying the detection threshold, an ROC
was plotted for that ratio set for that object, and the area under this ROC was then
computed. For a particular set of ratios, the average, maximum, and minimum area under
the ROC over all objects was computed. Further, the standard deviation of the areas under
the ROC over all of the objects was computed for each set of ratios.
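The ROC-area evaluation could be sketched as follows (a simplification: template matching scores for the template's own object are treated as detections and scores for all other objects as potential false alarms; names are hypothetical):

    import numpy as np

    def roc_area(true_scores, other_scores):
        """Area under the ROC for a threshold test on template matching scores.

        Lower scores indicate better matches, so a detection is score <= threshold.
        """
        thresholds = np.sort(np.concatenate([true_scores, other_scores]))
        pd = [np.mean(true_scores <= t) for t in thresholds]    # detection prob.
        pfa = [np.mean(other_scores <= t) for t in thresholds]  # false alarm prob.
        return np.trapz(pd, pfa)  # integrate Pd over Pfa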
Following evaluation using this method, a ratio set composed of 2 - 3 µsec:3 - 4 µsec,
3 - 4 µsec:5 - 6 µsec, and 4 - 5 µsec:5 - 6 µsec was selected. This ratio set was selected
for two reasons: simplicity and discriminatory power. Of the sixty-four ratio sets
evaluated, the selected set performed the seventh best, with an average area under the
ROC of 0.988, a standard deviation of 0.018, a maximum of 1.000, and a minimum of
0.942. Further, the selected criterion had the best performance of any ratio set comprised
of three or fewer ratios in terms of both mean area under the ROC and minimum area
under the ROC.
To summarize, the criterion developed in this section first locates the start of the first
reflection by finding the maximum absolute value of the recorded data convolved with a
matched filter. The sum of squared values in the convolved data is then calculated for the
spans 2 - 3 µsec, 3 - 4 µsec, 4 - 5 µsec, and 5 - 6 µsec following the first reflection start
(these are referred to as SSQ1, SSQ2, SSQ3, and SSQ4 respectively). The ratios
SSQ1/SSQ2, SSQ2/SSQ4, and SSQ3/SSQ4 are then computed and together serve as the
criterion.
Figure 6-10: Illustration of the regions of data that correspond to the features extracted for a typical
thin-shelled object data recording. The data shown above were recorded from object #16, a brass pipe
with shell thickness of 1.62 mm and an inner diameter of 12.62 mm. Note that the data has not yet
undergone matched filtering. The four brackets above denote the regions used in the ratio formation. The
long two-headed arrow shows roughly what the length of time from the start of the first reflection to the
strongest portion of the backscatter caused by the pipe's back wall might be.
For thin-shelled objects, it is also possible to get consistent measurements of the length of
time from the first reflection to the strongest portion of the backscatter caused by the back
wall of the pipe. Because the material of the target is not known, it is not possible to
form an inner diameter estimate from this time data; however, this time information
provides a strong constraint on the classification possibilities. Figure 6-10 illustrates the
regions of data that correspond to the features extracted for a typical thin-shelled object
data recording that has not yet undergone matched filtering.
6.3.2.3 Cylindrical targets
Because of the lack of material information for cylindrical targets as discussed in section
6.3.1.3, the exact thickness of a cylindrical target cannot be determined. However, the
time at which the second reflection occurs can be precisely determined by searching for
the first strong reflection following the front wall reflection.
Currently, this task is
implemented by matched filtering the data, finding the first reflection, and then searching
for the first location in the matched filtered data whose absolute value exceeds a
threshold. Much like with thin-shelled targets, this time information provides a strong
constraint on the classification possibilities. Figure 6-11 shows an example of the
time-to-back-surface feature on a typical cylindrical data recording prior to matched filtering.
[Figure 6-11 plot: cylindrical data recording annotated with the time from the cylinder front edge reflection to the back edge reflection; horizontal axis: time (sec).]
Figure 6-11: Example of the time from cylinder front edge to back edge structural feature from a
typical cylindrical data recording prior to matched filtering. The data shown above was recorded from
object #21, a brass cylinder with diameter 12.66 mm.
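A sketch of the threshold search for the cylinder back edge reflection (the threshold value and all names are hypothetical):

    import numpy as np

    def back_edge_delay_samples(recording, pulse, threshold_fraction=0.3):
        """Samples from the front wall reflection to the first strong echo after it."""
        mag = np.abs(np.convolve(recording, pulse[::-1], mode="same"))
        front = int(np.argmax(mag))
        start = front + len(pulse)   # skip past the front wall reflection itself
        threshold = threshold_fraction * mag[front]
        hits = np.nonzero(mag[start:] > threshold)[0]
        return (start + int(hits[0]) - front) if hits.size else None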
As previously mentioned in section 6.3.1.1, the acoustical recordings from cylindrical
objects can include considerable complexity. The recording shown above in figure 6-11
is typical of this class. Through knowledge of the cylinder thickness and the speed of
sound in brass, it can be determined that the reflection located at approximately 305 µsec
is the first longitudinal wave reflection from the back surface of the cylinder. For all
cylinders used, the first relatively strong reflection is the first longitudinal wave reflection
from the back wall. Indeed, barring reflections from internal structure, the first relatively
strong reflection following the front wall reflection will always be the first longitudinal
wave reflection from the back wall, as longitudinal wave velocities are greater than both
shear and surface wave velocities. Several other features may be identified in the
recording shown in figure 6-11. The reflection at 310 µsec is the result of the first shear
wave reflection from the cylinder's back surface. The smaller reflection located just after
that reflection is the second longitudinal wave reflection from the back surface of the
cylinder. The larger reflection starting at approximately 317 µsec is due to surface waves.
Finally, the other features present are a complicated mixture of higher order reflections of
the types just mentioned, creeping waves, and reflections from the internal structure of
the brass.
6.3.3 Frequency domain-based classification features
Examination of the frequency content of the backscattered acoustical signals indicated
that physical processes manifested themselves identifiably in the frequency domain. For
example, resonant frequencies with respect to the pipe wall appeared as troughs in the
frequency content of the return. This phenomenon occurs because energy at the pipe's
resonant frequencies rings easily in the pipe walls. Therefore, this energy stays trapped
within the pipe, which results in a deficit in the amount of energy at the resonant
frequencies that is returned to the transducer. While this information could manually be
used for pipe wall thickness determination for some pipes, other frequency content effects
of unidentified physical origin often interfered with this process. Therefore, it was not
possible to exploit frequency content information for uses such as automatic wall
thickness determination for thin-shelled pipes.
Despite confounding attempts to use the frequency domain to determine pipe shell
thickness, the frequency-content effects of unknown source were consistent and showed a
low degree of variation, especially within regions of the frequency spectrum. Matching
the frequency content of specific regions of an unknown target against templates
individualized for each known object proved highly valuable in the identification task.
To make the object frequency content templates more general, the templates were based
upon the transfer function of the object. Recall that for an LTI system, the output is just
the convolution of the input and the system's impulse response. In the frequency domain,
this corresponds to the output being the multiplication of the input spectrum by the
system's frequency response. Luckily, it has been assumed that the AIS and the experimental
system are LTI. Further, the transmitted acoustical pulse is known and the backscattered
acoustical signal can be measured. Therefore, an estimate of the frequency response of
the system can be generated by dividing the received frequency spectrum by the
frequency spectrum of the transmitted acoustical pulse. While this estimated frequency
response includes effects of transmission through water, it also contains information
about the target. For the experimental set-up, the impact of the transmission through
water on the frequency content was essentially unchanged over all recordings. Therefore,
the impact of water on the transfer function could be ignored. However, if it is later
found that alterations in water temperature, pulse transmission distance, and water particulate
content substantially degrade the target transfer function estimate, corrections can be
made for these effects.
For frequency response calculations, 500 data points, or 16.667 µsec of recorded data,
were used. It is important to remember that by using such a large range of data, the
acoustical energy that is recorded may have passed through differing "systems". For
example, within 17 µsec for a thick-shelled object, reflections from the front edge of the
front pipe wall and reflections from the back edge of the front pipe wall will both be
recorded. The backscattered energy resulting from each of these interfaces will have
experienced different transfer functions. Therefore, references to a single transfer
function estimate are fast and loose with terminology, but avoid the complication of
having to continuously refer to the average frequency response encountered by the
acoustical pulse over the data recorded during some period of time.
In all cases, the frequency response estimate of a target was calculated in the same
manner. The power spectral density of the signal was calculated using Welch's averaged
periodogram approach. In this approach, the 500 data points were broken up into
twenty-one evenly spaced overlapping sections of length 100 (this corresponds to an
overlap of 80 data points). Each of the sections was then windowed with a Hanning
window to reduce the frequency content impact of the truncation process. Each section
was zero-padded to length 1024, and the magnitude-squared Discrete Fourier Transform
(DFT) of each section was computed using the Fast Fourier Transform (FFT) algorithm.
Following FFT computation, the twenty-one sections were averaged to produce the
estimate of the frequency content of the signal. The data was then converted into decibels
(dB) by taking the base-ten logarithm and multiplying that result by ten. Following this
conversion, the transfer function estimate in dB was calculated by subtracting the
acoustical pulse frequency content in dB from the received signal frequency content in
dB. Figure 6-12 shows a typical complete transfer function estimate for an object.
Marked within the figure are also the regions of interest that are used as the transfer
function template for the object.
Figure 6-12: A typical complete transfer function estimate. The data shown above is derived from data
taken for object #25, a brass plate with thickness 6.35 mm, height 225 mm, and width 125 mm. The
recordings were made at the center of the plate. Also shown above are the two regions of the transfer
function estimate that are used as templates during transfer function estimate similarity determination. They
are denoted by the numbers one and two in the stars and span the entire region between the dotted lines as
indicated by the two-headed arrows with which they are associated. These regions were chosen for
two reasons: 1) they were not generally present for all objects, and 2) they were very consistent in form
across all of the training data for object #25.
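The procedure above maps closely onto standard tools; a sketch using scipy (the 30 MHz sampling rate is an assumption, and the function names are hypothetical):

    import numpy as np
    from scipy.signal import welch

    FS_HZ = 30e6  # assumed sampling rate: 500 samples is then about 16.667 usec

    def psd_db(x):
        """Welch PSD in dB: 100-sample Hanning segments, 80-sample overlap, 1024-point FFT."""
        freqs, psd = welch(x, fs=FS_HZ, window="hann", nperseg=100,
                           noverlap=80, nfft=1024)
        return freqs, 10 * np.log10(psd)

    def transfer_function_estimate_db(received, pulse):
        """Received-signal spectrum minus transmitted-pulse spectrum, both in dB."""
        freqs, rx_db = psd_db(received)
        _, tx_db = psd_db(pulse)
        return freqs, rx_db - tx_db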
While the method used to compute the transfer function estimate of a target was the same
in all cases, the data used to calculate this estimate differed by object class. It was found
that the frequency content of the received signal contained varying amounts of
information at different times depending upon the class of the object. The following
sections describe the different ways in which frequency-content information was used for
each class of objects.
6.3.3.1 Thick-shelled objects
For thick-shelled objects, data in the range of 20 samples prior to the start of the first
reflection to 479 data points after the start of the first reflection were used to create the
frequency response estimate. This data includes the first reflection and ringing from the
back edge of the front wall of the pipe.
6.3.3.2 Thin-shelled objects
For thin-shelled objects, data in the range of 20 samples prior to the start of the first
reflection to 479 data points after the start of the first reflection were used to create the
frequency response estimate. This data includes the first reflection and ringing from the
back edge of the front wall of the pipe. For pipes with an inner diameter beneath
approximately 25 mm, this range will also generally include at least a bit of the acoustical
backscatter caused by the back wall of the pipe.
6.3.3.3 Cylindrical objects
For cylindrical objects, data in the range of 20 samples prior to the start of the first
reflection from the back edge of the cylinder to 479 data points after the start of this back
edge reflection were used to create the frequency response estimate. While this data
excludes the front wall reflection, it includes the first reflection from the back edge of the
cylinder. As mentioned previously, the backscattered acoustical energy for the solid
cylinders can include a great deal of complexity. Therefore, acoustical backscatter
originating from shear wave reflections, creeping waves, and surface waves may be
included. Finally, this range also generally includes the second reflection from the back
edge of the cylinder if the data recording has been carried out over a sufficiently long
period of time.
6.3.4 Summary of target classification criteria
Depending upon the class of an object, between two and four criteria are used in the
creation of a template to characterize the acoustical returns expected from that object.
Table 6-4 summarizes the classification criteria used for templates of the different object
classes. Further, an equation-based explanation of each of the nine criteria listed below is
included in appendix E.
Thick-shelled objects:
1) Material determination
2) Pipe shell thickness estimate
3) Pipe inner diameter estimate
4) "Transfer function" estimate from data around first reflection

Thin-shelled objects:
1) Signal strength ratios for time ranges shortly after the first reflection
2) Length of time from the first reflection to the peak strength of the pipe back wall reflections
3) "Transfer function" estimate from data around first reflection

Cylindrical objects:
1) Length of time from the first reflection to the first cylinder back edge reflection
2) "Transfer function" estimate from data around first cylinder back edge reflection

Table 6-4: Summary of the criteria used to determine the similarity between a target and an object of
each class
6.4 Object recognition prototype presentation
The object recognition prototype consists of two stages: 1) model-building from training
data and 2) classification from testing data. As stated previously, thirty recordings were
taken for each object.
These recording sets were broken up into training and testing
subsets, each containing fifteen recordings.
From the training data, a class-specific template for each object was created. In this
template is stored the average score of each member of the object's training set on each of
its criteria. The standard deviation over the training set of the scores for each criterion is
also recorded. Additionally, contained in each template is a listing of each frequency
region that was determined to be of classification interest in the transfer function
estimate. Finally, the average transfer function value as determined from the training data
is stored in the template. The transfer function region listings and average region values
are contained in the template so that a general function can be used to perform the
frequency domain template matching instead of requiring the development of a new
function for each object.
After template formation has been concluded for each of the objects, all of the targets that
are in each object's testing set are classified. As stated previously in section 6.3.1.1, the
thick-shelled pipe material determination algorithm was found to perform at essentially
100% accuracy for the training data. Additionally, it was found to classify 100% of the
cylindrical targets as having an unknown material for the training data. Over the training
data, the results of the algorithm were highly variable for the thin-shelled objects.
Therefore, to reduce the amount of computation required, the first step in the
classification algorithm is performance of the thick-shelled pipe material determination
algorithm. If the algorithm produces a material classification, the thick-shelled and
thin-shelled object templates are checked. If the algorithm decides that the target's material is
unknown, the thin-shelled and cylindrical object templates are checked.
To check a template against a target, the classification algorithm performs each of the
classification criteria for the object template's class on the target data. The result of each
of the tests is then compared with the object mean and standard deviation stored in the
template to compute the number of standard deviations by which the target data differs
from the template. The average number of standard deviations by which the target data's
test results differ from the template's means over all of the criteria is then that target's
score for the template. After the target has been compared against each template, a
classification is made based upon which template produced the lowest score for the
target. Shown below in figure 6-13 is a schematic illustration of the object recognition
prototype.
[Figure 6-13 flow diagram, summarized. Training: for a particular object, perform the appropriate classification criteria, as described in section 6.3, on every member of the object's training set; calculate the mean score and the standard deviation of the scores for each criterion for the object, and store these values in the object template; repeat until this has been completed for every object. Testing: for a particular object, select a particular element of its testing set; apply the thick-shelled object class's material determination algorithm to the recording; if the algorithm returns a material, perform all thin-shelled object and thick-shelled object tests on the recording, otherwise perform all thin-shelled object and cylindrical object tests; score the test results against each template by computing the average number of standard deviations the recording differed from the template mean for all criteria; classify the recording according to which template received the lowest score; repeat until each member of every object's testing set has been classified.]
Figure 6-13: Flow diagram of the object recognition prototype program.
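A sketch of the testing-phase scoring and classification loop in figure 6-13 (hypothetical data structures; each template is assumed to hold per-criterion means and standard deviations and a class label):

    import numpy as np

    def score_against_template(criterion_values, template):
        """Average number of standard deviations from the template means."""
        means, stds = template["means"], template["stds"]
        return float(np.mean(np.abs(criterion_values - means) / stds))

    def classify(recording, templates, run_criteria, material_of):
        """Return the name of the template with the lowest matching score."""
        material = material_of(recording)  # thick-shelled material determination
        classes = ("thick", "thin") if material is not None else ("thin", "cylinder")
        scores = {}
        for name, template in templates.items():
            if template["class"] in classes:
                values = run_criteria(recording, template)  # class-specific criteria
                scores[name] = score_against_template(values, template)
        return min(scores, key=scores.get)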
6.5 Object recognition prototype performance
The performance of the object recognition prototype is presented for four cases: 1) closed
problem space with highly controlled target/transducer orientation and no noise added, 2)
closed problem space with highly controlled target/transducer orientation and Gaussian
noise added, 3) closed problem space with loosely controlled target/transducer orientation
and no additive noise, and 4) open problem space with highly controlled target/transducer
orientation and no additive noise. The terms highly and loosely controlled
transducer/target orientations refer to the data sets discussed at the end of section 6.2. For
the highly controlled orientation data set, the vertical position of the transducer centerline
was varied within ± 1.5 mm of the centerline of the pipe. For the loosely controlled data
set, this variation reached ± 12 mm. In neither of the data sets was the horizontal position
of the transducer controlled beyond assuring that a specular reflection was received. For
each of these cases, the accuracy of the system will be presented. To further illustrate the
discriminatory power of the prototype, confusion matrices will be presented for some of
the cases.
6.5.1 OR performance: closed problem space, highly controlled orientation, no noise
added
When the orientation of the transducer and the target was closely controlled, the system's
performance was quite good. The prototype achieved 97.78% classification accuracy,
incorrectly classifying only 13 out of 585 targets. Shown below in table 6-5 is the
confusion matrix for this case.
[Table 6-5 data: 39 x 39 confusion matrix. The matrix is nearly diagonal, with most diagonal entries equal to 15 and a small number of off-diagonal misclassifications.]
Table 6-5: Confusion matrix for the object recognition prototype output. The row headings correspond to
the actual identity of a target. The column headings correspond to the classification that a target received. The
numbers that appear in a specific cell at row i, column j in the table correspond to how many targets of identity i
received classification j.
6.5.2 OR performance: closed problem space, highly controlled orientation, noise
added
The prototype behaved as expected when its data was subjected to Gaussian additive
noise. Performance degraded as more noise was added.
Note that even the lowest
amount of noise added to the data, Gaussian with zero mean and 0.01 standard deviation,
is greater than the noise inherent in the system. Table 6-6 below presents the accuracy of
the system under various levels of additive noise. All noise is Gaussian with zero mean.
To increase the noise level, the standard deviation of the Gaussian distribution was
increased.
Standard deviation of added noise | 0.00   | 0.01   | 0.02   | 0.05
Accuracy                          | 97.78% | 95.73% | 84.62% | 44.44%

Table 6-6: Performance statistics for object recognition prototype at varying levels of added
Gaussian noise.
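The noise experiment amounts to re-running the classifier on perturbed copies of the test data; a sketch (assuming a classify routine like the one sketched after figure 6-13):

    import numpy as np

    rng = np.random.default_rng(0)

    def accuracy_with_noise(test_set, noise_std, classify_fn):
        """Fraction of (label, recording) pairs classified correctly after adding noise."""
        correct = 0
        for true_label, recording in test_set:
            noisy = recording + rng.normal(0.0, noise_std, size=recording.shape)
            correct += (classify_fn(noisy) == true_label)
        return correct / len(test_set)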
Confusion matrices were also prepared for the two lowest noise levels; tables 6-7 and
6-8 present this data. When misclassifications did occur, the incorrect classification that
the prototype produced was skewed towards a small number of objects. Specifically,
most of the misclassifications erroneously identified the target as either object #11, 18,
23, or 24. This phenomenon is largely attributable to the large standard deviations that the
templates for these objects were assigned during the training process. The criteria scores
that a target received were often much closer to the mean values stored in the template for
the actual object than to the mean values stored in the template for the misclassification.
Because the standard deviations in the latter, however, were much greater than in the
former, the target was incorrectly identified. This source of mistakes could possibly be
lessened by weighting the scores of a template match by a factor related to the size of the
standard deviation.
[Table 6-7 data: 39 x 39 confusion matrix for the 0.01 noise level. The matrix remains predominantly diagonal, with more off-diagonal misclassifications than in table 6-5.]
Table 6-7: Confusion matrix for the object recognition prototype output in which Gaussian noise with
standard deviation of 0.01 has been added to the data. The row headings correspond to the actual identity of
a target. The column headings correspond to the classification that a target received. The numbers that appear
in a specific cell at row i, column j in the table correspond to how many targets of identity i received
classification j.
[Table 6-8 data: 39 x 39 confusion matrix for the 0.02 noise level, with substantially more off-diagonal misclassifications.]
Table 6-8: Confusion matrix for the object recognition prototype output in which Gaussian noise with
standard deviation of 0.02 has been added to the data. The row headings correspond to the actual identity of
a target. The column headings correspond to the classification that a target received. The numbers that appear
in a specific cell at row i, column j in the table correspond to how many targets of identity i received
classification j.
6.5.3 OR performance: closed problem space, loosely controlled orientation, no
noise added
The prototype system performed better than expected on the data set with the loosely
controlled orientation. The classification accuracy was 90.76%, a mere 7% below the
tightly controlled data set.
Shown in table 6-9 is the confusion matrix for the
classification results derived from the data set with the loosely controlled orientation.
This confusion matrix again exhibits a clustering of erroneously used classification labels,
with targets often inappropriately declared to be either object #13, 15, or 16. Further,
while 54 of 585 targets were mislabeled, 31 of these misclassifications occurred for data
originating from either object #29, 32, or 39.
[Table 6-9 data: 39 x 39 confusion matrix for the loosely controlled orientation data set. The matrix is still predominantly diagonal, with misclassifications concentrated on a few objects.]
Table 6-9: Confusion matrix for the object recognition prototype output when recognition is performed
on the loosely controlled orientation data. The row headings correspond to the actual identity of a target.
The column headings correspond to the classification that a target received. The numbers that appear in a
specific cell at row i, column j in the table correspond to how many targets of identity i received classification j.
6.5.4 OR performance: open problem space, highly controlled orientation, no noise
added
Up to this point, the object recognition task has only been dealt with in the context of a
closed problem space. That is to say that every target that is presented to the OR system
is a member of a finite set of objects. For each of these objects, a template has been
generated during the system's training phase. The classification task is thus reduced to
selecting the template that best fits the target data.
In the real world, however, object recognition must be performed in an open problem
space.
There is no finite set of which every encountered target must be a member.
Templates may not be built to represent every possible target. Therefore, it is essential to
divide the world up into two groups of objects: those objects that should be exactly
identifiable (known objects) and those objects that should be identified simply as not
being a member of the former group (unknown objects).
For a mine classification
application using the AIS, this division is straightforward.
The world of objects is
divided into mines and not mines. Once the problem space of targets has been divided
into known and unknown objects, data should be collected on all of the known objects
and classification templates formed.
At this stage, however, a complication in the classification process arises.
When
attempting to classify a target, it is no longer enough to simply determine which template
best matches the target data. Under an open problem space, it must also be decided
whether the best matching template matches the target data "closely enough", a
frustratingly subjective criterion. A main advantage to the scoring system used in the
prototype object recognition system is that it somewhat reduces the degree of this
subjectivity. The scoring system assigns each template match a score based upon the
average number of standard deviations the target data differs from the template's means
on class specific criteria.
Therefore, the same subjective "close enough" test can be
applied to the top scoring template match regardless of the object to which the template
corresponds. Further, because only one criterion must be adjusted, tuning the system to
achieve the desired false alarm/false negative ratio is greatly simplified.
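Stated as a sketch in Matlab (with mu and sigma denoting a template's stored criterion
means and standard deviations, and c the criterion values computed on the target; these
symbol names are illustrative, not the prototype's actual variables):

% Score for one template: the average number of standard deviations by
% which the target's criterion values c differ from the template's means.
Score = mean(abs(c - mu) ./ sigma);

Lower scores thus indicate closer matches, and a single threshold on Score plays the role
of the "close enough" test for every template.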
[Figure 6-14 image: two histograms, "Histogram of the top scoring template matches" (top) and "Partial histogram of the second best scoring template matches" (bottom), both plotted against the template matching score in standard deviations.]
Figure 6-14: Histograms of the best and second best scoring template matches from an execution of
the object recognition algorithm with a closed problem space, highly controlled transducer/target
orientation, and no additive noise. The data displayed above were produced from an execution of the
object recognition prototype in which 573 of the 585 classifications were correct. Therefore, the top
scoring template match data displayed in the top plot above is a good estimate of the template match scores
that each target's true identity received. The data in the bottom plot, representing the second best template
match score for each classification, is only a partial histogram because the scores actually stretch out to
approximately 50. To better showcase the degree of overlap between the two plots at low levels, however,
the higher-level data points were omitted.
Figure 6-14 above displays histograms of the top scoring template match and the second
best scoring template match for an execution of the closed problem space OR prototype
similar to that discussed in section 6.5.1.
In this execution, the transducer/target
orientation were highly controlled for the data collection, and no noise was added to the
data prior to its use by the prototype. This particular execution correctly identified 573
out of the 585 targets classified. Therefore, the top scoring template match data shown in
figure 6-14 is essentially equivalent to the scoring data for the template that corresponds
to each target's true identity. As can be seen, there is a small degree of overlap between
the top scoring and the second best scoring templates. For the most part, however, the
data indicates that even the best scoring template for a false match performs substantially
worse than the worst scoring template for a true match. This suggests that object
recognition can be performed in an open problem space without significantly degrading
performance.
To test the ability of the object recognition prototype to perform object recognition in an
open problem space, roughly a third of the thirty-nine objects that have been used in the
closed prototype OR assessment so far were randomly selected and removed from the list
of known objects. Two additional objects, a solid pine cylinder of diameter 37.31 mm
(#40) and a pine plank of thickness 19.32 mm and width 88.68 mm (#41), were added to
the third of the objects that were randomly selected to form the unknown object list.
Templates were formed for all of the known objects, and then classifications were
performed on all of the objects, both known and unknown. A decision was made to err in
favor of false alarms instead of false negatives. Therefore, if the top scoring template had
a score below the quite generous level of 4.0 standard deviations, a classification was
made. If the top score, however, exceeded this threshold, the object was labeled as
unknown.
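The decision rule just described can be sketched in a few lines of Matlab. The helper
ScoreTarget, which returns a target's match score against one template in standard
deviations, is hypothetical; this illustrates only the thresholding step, not the
prototype's actual code.

% Open problem space decision rule (sketch).
THRESH = 4.0;                          % generous threshold, in standard deviations
Scores = zeros(1, length(Templates));
for i = 1:length(Templates)
   Scores(i) = ScoreTarget(Target, Templates(i));
end
[BestScore, BestInd] = min(Scores);    % lower score = closer template match
if BestScore <= THRESH
   Classification = Templates(BestInd).Number;   % close enough: a known object
else
   Classification = 0;                 % label the target as unknown
end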
Following this procedure, a classification accuracy of 92.2% was achieved (567 out of
615).
Table 6-10 shows the confusion matrix for the open problem space object
recognition prototype execution. This confusion matrix provides a good deal of insight
into the overlap in the template matching scores between the known and the unknown
objects. As can be seen, accuracy on the known objects was 94.29% and accuracy on
the unknown objects was 87.69%. For the threshold of 4.0 standard deviations that was
used, the probability of a false alarm for an unknown object was 12.31%. Further, of the
twenty-four unknown targets that were assigned classifications, twenty-three of those
targets had a true identity of either object #24 or object #6. Finally, the probability of a
missed detection (a known object being classified as unknown) is 1.67%.
Table 6-10: Confusion matrix for the object recognition prototype output with an
open problem space. The row headings correspond to the actual identity of a target.
The column headings correspond to the classification that a target received. The numbers
that appear in a specific cell at row i, column j in the table correspond to how many
targets of identity i received classification j. Note that the heading Unk represents
unknown objects.
6.6 Conclusion
The object recognition prototype system presented in this chapter demonstrates the wealth
of identification information that can be extracted from acoustical backscatter in the low
megahertz range. For the data gathered with the experimental system, the prototype was
able to achieve near 100% accuracy. While the data presented in section 6.2 suggested
that at least the frequency content of the backscattered signal may be highly dependent
upon target/transducer orientation, the OR prototype produced highly accurate results
when using a data set in which this orientation was not tightly controlled. This indicates
that one of the main obstacles to implementing an object recognition system using the
AIS, ensuring consistency in the viewpoint from which data is gathered, may not be as
much of a hurdle as previously suspected. Further, as demonstrated in section 6.5.4, the
template scoring showed a good deal of separation between the template corresponding to
the target's identity and other templates.
Because of the relatively high degree of
separation amongst the objects in the identification scoring space, implementing a reliable
open problem space system for real world mine identification applications seems
plausible.
7 Object recognition implementation suggestions
7.1 Introduction
This chapter briefly presents suggestions for the implementation of an object recognition
system using the AIS. In section 7.2, sample images taken using a precursor of the AIS
are presented. Section 7.3 then discusses methods by which the presence of a target in an
image may be noted and how a region of interest for that target may be extracted. In
section 7.4, the use of an ensemble of time series originating from many detector
elements to perform more robust object recognition is explored. Incorporation of image-level classification features into an object recognition algorithm is then dealt with in
section 7.5.
Finally, section 7.6 investigates how classification features could be
provided with weights that are individualized for each template, thereby allowing those
features with higher discriminatory power for a particular object to be emphasized.
7.2 Sample images
As mentioned previously, technical difficulties precluded working with the AIS during
this thesis. Recently, however, it has become possible to acquire crude pictures using a
full array on a precursor of the AIS. While examining these images, it is important to
keep two points in mind. First, only the depth and magnitude of the strongest return
value at a particular array location can currently be recorded. Second, many system-level
imaging issues remain to be resolved such as how to reduce acoustical lens reflectivity
and how to determine the proper settings for the AIM. As these matters are resolved,
image quality will certainly improve. Despite these caveats, inspection of sample images
still provides clues as to what an image from the AIS will be like. Shown below in figure
7-1 are two such sample images.
[Figure 7-1 images: top, "Image of object #8: A copper pipe with outer diameter 53.98 mm, thickness 1.02 mm, and approx. length 225 mm"; bottom, "Image of object #33: An aluminum pipe with outer diameter 76.23 mm, thickness 6.03 mm, and approx. length 400 mm". Horizontal axis: array elements.]
Figure 7-1: Sample images taken from a precursor of the Acoustical Imaging System (AIS). To allow
the figure to be more compact while not distorting the aspect ratio of the images, both images show only the
data from the top half of the detector array since there was no data located in the bottom halves. The top
image is from object #8, a copper pipe with outer diameter 53.98 mm, thickness 1.02 mm, and length of
approximately 225 mm. The bottom image is from object #33, an aluminum pipe with outer diameter of
76.23 mm, thickness 6.03 mm, and length of approximately 400 mm. The images shown represent the
maximum reflection intensity encountered at each array element, regardless of the depth of that maximum.
Lighter shades of gray represent more intense reflections. In both cases, image data was acquired for
approximately one second (ten images at the data acquisition rate used). In each case, the image shown is the
first image recorded. There was a slight degree of background speckle that varied from image to
image; however, the target images were very consistent. Note that a median filter has been applied to the
images shown above to eliminate some of the speckle.
In both images, there is a central bright spot. Further, the targets appear to stretch further
horizontally than they do vertically. Also, the long axes of the targets appear to have a
slight clockwise rotation with respect to horizontal.
Finally, both images contain
elliptical scatter patterns that are centered about the bright spot. Because the central
bright spot and the elliptical scatter patterns are both likely results of the transducer beam
used, these features are consistent with the way that the pipes were presented to the AIS.
The apparent length of the pipes, however, is not consistent with the pipes used.
Both pipes were located approximately a meter from the AIS and were centered
horizontally and slightly offset vertically with respect to the imager. The field of view of
the AIS at this range is approximately 250 mm by 250 mm; thus each pixel represents
an approximately 2 mm by 2 mm area.
is such that it should just about fill the complete horizontal field of view, and the length
of the aluminum pipe is such that it should easily span the image horizontally. Neither of
the pipes, however, appears to stretch from edge to edge of the image. Instead, the copper
and aluminum pipes appear to stretch approximately 80 mm and 120 mm respectively.
Further, by examining the region outside the central bright spot, the aluminum pipe
appears to be approximately ten pixels (or 20 mm) in diameter and the copper pipe a bit
thinner. Because a surface must be roughly perpendicular to the imager's surface normal
to be observed, this diminished apparent diameter was expected. It is not known why the
apparent lengths of the pipes are shorter than the actual lengths.
The region of
insonification had a diameter of approximately 360 mm at the range of the pipes.
Therefore, the insonified region covered the entire field of view.
It is hypothesized,
however, that the apparent length of the aluminum pipe is greater than that of the copper pipe
due to the greater outer diameter of the aluminum pipe. It is believed that aluminum's
material properties may also be a factor. Further, it is hypothesized that the greater actual
length of the aluminum pipe is not a factor in this phenomenon.
7.3 Image segmentation
In real-world object recognition tasks, an important first step is referred to as image
segmentation.
The system must determine whether there is a target that should be
classified in the current image. If so, the spatial extent of that target must be identified.
In high signal-to-noise environments, this task is fairly straightforward: a specular
reflection can be identified by looking for data that exceed a threshold. To identify the
extent of the target, all of the contiguous pixels that contain a reflection with strength
greater than some other (likely lower) threshold could be grouped.
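A minimal sketch of this two-threshold grouping is shown below; Img, DETECT, EXTENT, and
SegmentTarget are illustrative names rather than parts of the AIS software, and the
thresholds are assumed to satisfy DETECT >= EXTENT.

% Grow a target region from a strong specular return (sketch).
function Mask = SegmentTarget(Img, DETECT, EXTENT)
Mask = zeros(size(Img));
[rr, cc] = find(Img > DETECT);       % pixels containing a strong return
if isempty(rr)
   return;                           % no target present in this image
end
Stack = [rr(1) cc(1)];               % seed the region from the first such pixel
while ~isempty(Stack)
   p = Stack(end,:);
   Stack(end,:) = [];
   if p(1) < 1 | p(2) < 1 | p(1) > size(Img,1) | p(2) > size(Img,2)
      continue;                      % off the edge of the array
   end
   if Mask(p(1),p(2)) | Img(p(1),p(2)) <= EXTENT
      continue;                      % already grouped, or below the lower threshold
   end
   Mask(p(1),p(2)) = 1;              % add the pixel to the target region
   Stack = [Stack; p+[1 0]; p-[1 0]; p+[0 1]; p-[0 1]];
end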
In lower signal-to-noise environments, the image segmentation task becomes more
complicated.
Because noise levels may rise to the level of even stronger specular
reflections, it is not sufficient to naively apply a threshold to all of the data to search for
the presence of a target. As can clearly be seen in figure 7-1, the image contains a great
deal of order.
It is suggested that a detection metric which takes into account the
connectivity of the high intensity pixels be used to determine whether or not a target is
present. Such an approach could be as simple as requiring that the signal strength in
some prescribed region be above a threshold, as done by Tuovila, Nelson, and Smith [13].
A more complicated detection method that incorporates knowledge of the noise
environment and the targets to be identified could also be developed. Czerwinski, Jones,
and O'Brien present an example line and boundary detection system for two-dimensional
speckle images and discuss the tradeoffs involved extensively in [14].
7.4 Use of an ensemble of time series to improve object recognition
Following target identification, the region of interest selected from the image (assuming
one was selected) will undergo classification. Due to limitations in
the nature of the data that could be acquired, the work in this thesis has focused on a
prototype classification algorithm that makes use of only a single acoustical time series.
Many such time series, however, make up an image. Using only a single time series,
therefore, neglects a great deal of the information available. Further, the prototype made
no use of such image-level characteristics as shapes and relative orientations.
This
section will suggest approaches that may allow multiple time series to be used together to
provide increased discriminatory power over a single time series. The next section will
briefly discuss the incorporation of image-level characteristics.
The first step in the use of multiple time series for target classification is to precisely
determine the degree to which the signals acquired by the AIS are viewpoint dependent.
This issue was first presented in section 6.2. In that section, it was shown that the data
acquired by the experimental system was dependent upon the viewpoint, particularly with
regard to vertical translations of the transducer with respect to the target. The reasons for
this viewpoint dependence, however, could not be precisely determined.
Neither
viewpoint dependent alterations in the transfer function encountered by the acoustical
signal nor spatial variations in the transmitted acoustical pulse could be ruled out.
Therefore, it will be necessary to perform a similar set of experiments with the AIS to
determine the degree to which the data it acquires is viewpoint dependent.
If it is found that the data from the AIS exhibits little or no viewpoint dependence, then
all pixels in the region of interest should be good candidates for use in time series object
recognition. If on the other hand it is found that the data from the AIS exhibits viewpoint
dependence similar to that shown by the experimental system, then only a subset of those
pixels in the region of interest should be used in time series object recognition.
The
subset of array elements whose time series data is used should exhibit two properties: 1)
the viewpoint of all of the elements is similar in nature and 2) that subset of pixels can be
reliably located and extracted each time a target is encountered.
For the pipes and
cylinders used in this thesis, such a subset could be easily extracted. As discussed in
section 6.2, due to the symmetry of the pipes and cylinders, the viewpoint dependence of
the received signal is also symmetric about the target's long axis. By first identifying the
directional orientation of the object's long axis and then searching for this point of
symmetry, a line of array elements could be identified from which to take the time series
data.
Once a set of array elements has been identified for use in time series object recognition,
the time series data from these elements must be combined in some way to produce a
classification. Two relatively simple approaches seem to merit exploration. In the first,
the data from each of the elements is combined to form an aggregate signal. This task
may be accomplished using an algorithm as simple as aligning the start of each signal and
then averaging the return at every point. The object recognition algorithm should then be
performed on the aggregate signal.
The second approach would first run the object
recognition algorithm on each of the elements' time series. The results of each of these
classifications would then be pooled to form a final classification. This pooling could
take place using a simple voting algorithm in which the known object receiving a
plurality is selected. Alternately, a more complicated algorithm that takes into account
the similarity between objects could be used. Under this more complicated scheme, a
similarity score between each time series and the templates would be computed. The
similarities for each template over all time series would then be summed, and the
template that was found to be most similar selected. The exact classification method used
should depend upon the amount of computation required and the performance achieved
by the various methods.
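Both pooling approaches can be sketched compactly. Here ClassifySeries is a hypothetical
helper that classifies one time series against the templates, and each row of Series is
assumed to be a start-aligned time series from one selected array element.

% Approach 1: align and average the signals, then classify the aggregate.
Aggregate = mean(Series, 1);
Label1 = ClassifySeries(Aggregate, Templates);

% Approach 2: classify each element's series, then pool by plurality vote.
Votes = zeros(1, size(Series,1));
for i = 1:size(Series,1)
   Votes(i) = ClassifySeries(Series(i,:), Templates);
end
Counts = hist(Votes, 1:max(Votes));   % tally the votes for each object number
[junk, Label2] = max(Counts);         % the object receiving a plurality wins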
7.5 Image-level object recognition
While to a human observer interpretation of the types of images shown in figure 7-1 is
very difficult, it should be possible to extract meaning from such images and to use this
meaning in the classification process. For simple objects like those shown in figure 7-1,
such features as apparent target diameter and length can be extracted. Target shape is
another important feature that should be exploited. For more complicated targets that are
composed of interconnected parts, it should also be possible to get a sense not only of the
shape of each part, but the relative orientation of the parts. A precise statement of the
types of features that can be extracted at the image level and the usefulness of those
features cannot be made until more experimentation has been performed.
While
difficulties in using this type of data, such as how to deal with the inherent viewpoint
dependence in an image, are certain to be encountered, addition of image-level
information to time series classification algorithms seems promising.
7.6 Weighting of classification features
While the performance of the classification algorithm developed in chapter 6 was shown
to be quite good in section 6.5, it is believed that this performance could be improved
even more by the addition of another step to the training process. As described in section
6.4, for each known object a template is created during the training process that contains
the mean score and standard deviation of scores for class specific criteria as computed on
the training data. At present, each of the template's criteria is given equal weight during
the classification process. The criteria, however, have individual discriminatory power
that differs from object to object. By applying template specific weights to each of the
criteria, the overall classification performance of the system should be able to be
improved. This weighting could be accomplished in one of two ways. First, a weighting
function could be developed that assigns weights based upon the discriminatory power of
each criterion.
Second, after the criteria have been computed, a multidimensional
optimization algorithm could be used to compute the weights that would result in the
fewest incorrect responses. Both approaches would require that the weights for each
criterion in a template be normalized so that the sum of the weights applied equals the
same value for all templates. Further, for both approaches, if sufficient data is available,
the training data should be split into two subsets. The first subset should then be used for
template creation and the second subset for criterion weight determination.
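A sketch of the weighted score follows; the field names Weights, Means, and Stds and the
vector Crit are illustrative rather than the prototype's actual variables.

% Weighted template score (sketch). Crit holds the target's criterion
% values; the weights are normalized so that they sum to the same value
% (here, one) for every template.
W = Template.Weights / sum(Template.Weights);        % normalize the weights
Dev = abs(Crit - Template.Means) ./ Template.Stds;   % deviations, in standard deviations
Score = sum(W .* Dev);                               % weighted counterpart of the average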
7.7 Acoustical Imaging System (AIS) design suggestions
Previously, the surface detection algorithm was carried out by specialized hardware in the
Image Interface Board. Only the plane and magnitude of the surface at each point in the
detector array were passed back to the DSP Image Processing Board. In chapter 4, an
implementation of the surface detection algorithm was developed that could perform the
algorithm under worst case conditions using under half of the computational and
communications power available from the DSP chips.
By showing that the surface
detection algorithm could be managed by the digital signal processing chips of the DSP
Image Processing Board from both a computation and communication standpoint, these
highly versatile programmable chips can now be given access to vast amounts of data.
Current plans for the AIS, however, intend to present only the magnitude of the acoustical
return at each of eighty planes to the DSP Image Processing Board. This approach would
severely limit the object recognition power that could be achieved by the AIS. Therefore,
it is suggested that the capability be added to the AIS for the DSP Image Processing
Board to request certain parts of data from the Image Interface Board. In this way, a
background task on a DSP could identify a region of interest in the magnitude data for an
acoustical image. It could then request to have all 320 of the samples that were used to
generate the 80 magnitude values for each detection element that is in the region of
interest shipped back to it. This approach would allow the communications and memory
requirements of an object recognition algorithm to be kept in check while at the same
time maintaining a high degree of discriminatory power.
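A back-of-envelope calculation suggests such requests would remain modest. The region size
below is made up purely for illustration; the 320 samples per element come from the text,
and 16-bit samples are an assumption.

% Data volume for one region of interest request (sketch).
RegionElems = 30 * 20;         % illustrative region of interest, in elements
SamplesPerElem = 320;          % raw samples behind the 80 magnitude values
BytesPerSample = 2;            % assuming 16-bit samples
RequestBytes = RegionElems * SamplesPerElem * BytesPerSample   % = 384,000 bytes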
Finally, it is also recommended that if object recognition is to be attempted using the AIS,
use of shorter duration acoustical pulses should be explored. Shorter pulses will allow for
finer-grained determination of structural features, as more objects will be able to be
considered thick-shelled. Further, because shorter pulse duration translates into higher
bandwidth transmission, more frequency spectrum-based information will be available.
7.8 Conclusion
This chapter attempted to briefly provide suggestions for the implementation of an object
recognition algorithm in the AIS. Crude images taken from a precursor to the AIS are
shown in section 7.2.
Section 7.3 then discusses how regions of interest that may
represent known objects can be located within such images.
Following this image
segmentation, detector array elements must be selected and their time series analyzed.
Time series classification using data from an ensemble of array elements is discussed in
section 7.4. Image-level data will also be available from the AIS; incorporation of this
type of data into a classification algorithm is the subject of section 7.5. Section 7.6
explores methods by which the classification criteria may be weighted to emphasize those
criteria with more discriminatory power. Because much work remains to be done on the
AIS, these suggestions could not always be concrete. This state of the AIS, however,
afforded the additional opportunity to suggest simple design changes that could increase
the object recognition power of the system. In section 7.7, it is suggested that the DSP
Image Processing Board should be given the ability to request additional data from the
Image Interface Board. This data would contain all of the samples within a specified
region of detector elements and could then be used by an object recognition algorithm.
Further, section 7.7 recommends exploration of the use of shorter acoustical pulses than
currently planned for the AIS. It is hoped that the suggestions in this chapter may serve
as a starting point for the development of a more complex object recognition system to be
used by the AIS that builds off of the work in this thesis.
8 Conclusion
This thesis deals with a subset of the data analysis tasks that could be attacked by the
Acoustical Imaging System. After presenting a brief overview of the system hardware
and software, two main tasks are addressed: surface detection and object recognition.
Chapters 2 and 3 briefly present the hardware and the software components of the
Acoustical Imaging System. These are intended to familiarize the reader with the system
being developed at Lockheed Martin, as well as serve as background for various
implementation issues that are discussed later.
It is shown in chapter 4 that the surface detection algorithm may be reliably performed by
software running on the DSP Image Processing Board. The chapter starts by presenting a
brief justification for the algorithm selected. It then presents the preprocessing necessary
to prepare the acoustical return data for the surface detection algorithm. The algorithm is
then presented in depth, and is followed by a proof of its correct function. Finally, the
computational and communications burden caused by the surface detection algorithm are
analyzed given a few weak assumptions and shown to consume less than half of the
processing power of the DSP Image Processing Board.
Chapters 5 and 6 showed that on an experimental system that gathered information
similar to that available to the AIS, highly reliable object recognition could be performed
on a set of test objects consisting of solid cylinders, pipes, and plates of varying materials.
These objects are relatively simple; however, their shapes were shown to be
representative of sea mines currently in use. While they lack the internal structure of
many mines, this omission is both a blessing and a curse. This internal structure will
make the acoustical returns from mines more complicated.
An intelligent object
recognition algorithm that makes use of this structure as well as its position in the image,
however, could extract a great deal of meaning from it. Further, the homogeneity of the
test objects used created a relatively complex object recognition environment, despite the
relative simplicity of the targets. Therefore, it is firmly believed that with time and effort,
an effective sea mine object recognition system could be developed for the Acoustical
Imaging System.
Further, it is believed that this system could be a fully automated
background task that would occur in the spare processing cycles left open by the surface
detection algorithm.
Finally, implementation suggestions for an object recognition system on the AIS are
presented in chapter 7. To begin the discussion, sample images from a precursor of the
AIS are presented. Segmentation of such images to extract regions of interest that will be
used in object recognition is then touched upon. Next, two simple ways in which the use
of many time series may improve object recognition over a single time series are
presented. Additionally, incorporation of image-level features into the object recognition
process and object-specific weighting of classification features are mentioned. Finally,
simple issues in the design of the AIS that may increase the object recognition power of
the system are addressed. The issues raised in chapter 7 are but a few of the ways in
which the work in this thesis could be built upon.
Appendix
A. List of Acronyms
2DUS      Two-dimensional Ultrasound
3D        Three-dimensional
3DUS      Three-dimensional Ultrasound
A/D       Analog-to-Digital Converter
AIS       Acoustical Imaging System
AIM       Acoustical Imaging Module
D/A       Digital-to-Analog Converter
DAQ DSP   Data Acquisition Digital Signal Processor
DFT       Discrete Fourier Transform
DIS DSP   Video Display Digital Signal Processor
DMA       Direct Memory Access
DSP       Digital Signal Processor
FFT       Fast Fourier Transform (an algorithm to perform the DFT)
FIFO      First-In First-Out Queue
IIB       Image Interface Board
LCD       Liquid Crystal Display
LMIRIS    Lockheed Martin IR Imaging Systems
LTI       Linear Time Invariant
MSB       Most Significant Bit
NOP       Assembly language instruction indicating that no operation is to be
          performed on the current clock cycle
OR        Object Recognition
PFP       Potential First Peak
PKD DSP   Peak Detection Digital Signal Processor
PSD       Power Spectral Density
PVC       Polyvinylchloride
ROC       Receiver Operating Curve
ROIC      Read-Out Integrated Circuit
SHARC     Super Harvard Architecture
SNR       Signal-to-Noise Ratio
THA       Transducer Hybrid Assembly
TRIC      Transmit-receive Integrated Circuit
B. Source code of prototype peak detection algorithm
The following is the source code for the prototype peak detection algorithm developed. It
was written using Microsoft Visual C++, version 1.5. Note that no image-level
coordination function is included. Because the computational burden of calling the
frame-level processing algorithm approximately ten times per image is insignificant
compared to the cost of running the frame-level processing algorithm itself, it is felt
that such an omission is justified.
//
// File Peak051.c - test version 0.5.1 of the peak detection algorithm.
//
// Author: Daniel C. Letzler
// Started: July 9, 1998
// Last Updated: July 21, 1998
//
// Sonoelectronics Project
// Lockheed Martin IR Imaging Systems - Lexington, MA
//

// The header files for the precompiled code that I will want to make use of.
#include <stdio.h>
#include <stdlib.h>

// Some definitions for use during the body of the code.
#define MAX 1
#define MIN -1
#define NEITHER 0
#define SIGTHRESH 400
#define NOISETHRESH 200
#define ARRAYSIZE 8    // How many pixels from the detection array will
                       // be handled by a DSP.
#define PLANES 5       // How many total planes of data do we have.
#define FRAMESIZE 5    // How many planes of US data are there to a frame.
#define FRAMES 1       // How many frames of data are there.
#define NORETURN 160   // Value used by system when there was no plane
                       // at a pixel.

// The "Potential First Peaks" data structure.
typedef struct{
    char AboveThresh;
    char PeakFound;
    unsigned char plane;
    int mag;
} PFP;

// Global variables - keep it simple, don't worry about elegance.
// The G's are appended on the beginning because these are global
// variables and I use their names elsewhere without the G's.
int GFrameArray[ARRAYSIZE]; // The current entries to pass in.
PFP GPFPArray[ARRAYSIZE];   // The Potential First Peaks Array

// This function will coordinate the processing of a block of frame data.
// Tested 07/23/98
void ProcessFrame(int FrameSize, int *FrameArray, int Frame,
                  int ArraySize, PFP *PFPArray)
{
    int i, ii;
    int *PixelForFrame;

    // Loop through each of the pixels that we have, grab the proper
    // information, and process that pixel's data for the current frame.
    for(i = 0; i < ArraySize; i++)
    {
        PixelForFrame = FrameArray;

        // Find out if the data at the pixel has been above the SigThresh
        // by the beginning of the frame.
        if(PFPArray->AboveThresh)
        {
            // Yes! Data has been above the threshold before the current
            // frame. Find out if we have already found a peak.
            if(PFPArray->PeakFound)
            {
                // Yes, so quit processing.
                FrameArray += FrameSize; // Increment the FrameArray pointer to
                                         // look at the next pixel in the
                                         // FrameArray.
                PFPArray++; // Move the PFPArray to the next pixel also.
                continue;
            }
            else
            {
                // No, so process the pixel from the start of the frame.
                for(ii = 0; ii < FrameSize; ii++)
                {
                    // Is Curr greater than our Potential Peak?
                    if(*PixelForFrame > PFPArray->mag)
                    {
                        // Yes, it meets all of the criteria to update the
                        // PotPeak. Update the PFPArray entry.
                        PFPArray->mag = *PixelForFrame;
                        PFPArray->plane = Frame*FrameSize + ii;
                    }
                    // Are we below the PotPeak by enough to call it a peak?
                    else if((PFPArray->mag - (*PixelForFrame & 4095))
                            > NOISETHRESH)
                    {
                        // Yes! The PotPeak should now be marked as a peak!
                        PFPArray->PeakFound = 1;
                        break; // Quit: our processing here is done.
                    }
                    PixelForFrame++; // Point to the next plane for the pixel.
                } // End of the for loop
            } // End of the else statement that meant we hadn't found a peak.
        } // End of "has data been above the SigThresh before the frame" if
        else
        {
            // We have not been above SigThresh by the beginning of the frame.
            // See if we go over by the end.
            if(!(PixelForFrame[FrameSize-1] & 0x8000))
            {
                // Yes, so process the frame, starting where we go over.
                // Also, mark that we went over.
                PFPArray->AboveThresh = 1;
                ii = 0;
                while(*PixelForFrame & 0x8000)
                {
                    PixelForFrame++;
                    ii++;
                }
                // Now we are located at the first location where the
                // data is above the signal threshold.
                // Note that we don't have to initialize ii for the for loop
                // because the while statement above put it in the proper place.
                for(; ii < FrameSize; ii++)
                {
                    // Is Curr greater than our Potential Peak?
                    if(*PixelForFrame > PFPArray->mag)
                    {
                        // Yes, it meets all of the criteria to update the
                        // PotPeak. Update the PFPArray entry.
                        PFPArray->mag = *PixelForFrame;
                        PFPArray->plane = Frame*FrameSize + ii;
                    } // End of the LocalMax statement
                    // Are we below the PotPeak by enough to call it a peak?
                    else if((PFPArray->mag - (*PixelForFrame & 4095))
                            > NOISETHRESH)
                    {
                        // Yes! The PotPeak should now be marked as a peak!
                        PFPArray->PeakFound = 1;
                        break;
                    } // End of the LocalMin statement
                    PixelForFrame++; // Make us point to the next plane.
                } // End of the for loop
            } // End of if statement that revealed we pass above thresh by end
            else
            {
                // Okay, we don't pass above SigThresh before the end of the
                // frame, so just quit processing.
                FrameArray += FrameSize; // Increment the FrameArray pointer.
                PFPArray++; // Move the PFPArray to the next pixel also.
                continue;
            }
        } // End of the else saying we are below SigThresh for all points in
          // the frame.
        FrameArray += FrameSize; // Increment FrameArray pointer to next pixel.
        PFPArray++; // Move the PFPArray to the next pixel also.
    }
}
C. Assembly code listing associated with the prototype and computational burden
calculations
The assembly code generated from the prototype peak detection algorithm listed in
appendix B is shown below. This assembly code was generated by the Analog Devices g21k
compiler. The compiler was provided with the command line switches -S and -O3, which
instruct the compiler to generate an assembly code listing only and to perform the
maximum amount of optimization, respectively.
! Analog Devices ADSP210x0
.file "peak051.c";
.gcc_compiled;
.segment /dm seg_dmda;
.endseg;
.segment /pm seg_pmco;
.global _ProcessFrame;
_ProcessFrame:
! FUNCTION PROLOGUE: ProcessFrame
! rtrts protocol, params in registers, DM stack, doubles are floats
modify(i7,-10);
! saving registers:
dm(-2,i6)=r3;
dm(-3,i6)=r5;
dm(-4,i6)=r7;
dm(-5,i6)=r9;
dm(-6,i6)=r11;
dm(-7,i6)=r13;
r2=i0;
dm(-8,i6)=r2;
r2=i1;
dm(-9,i6)=r2;
r2=i2;
dm(-10,i6)=r2;
r2=i3;
dm(-11,i6)=r2;
.def end_prologue; .val .; .scl 109; .endef;
i2=r12;
r11=dm(1,i6);
r11=pass r11;
if le jump (pc, _L$3) (DB);
i1=dm(2,i6);
r2=r8;
r7=4095;
r3=200;
i3=i1;
r0=1;
r13=r2*r4 (ssi), modify(i3,m6);
r5=32768;
_L$28:
lcntr=r11, do _L$3-1 until lce;
r2=dm(i1,m5);
r2=pass r2;
if eq jump (pc, _L$5) (DB);
i0=i2;
nop;
r2=dm(i3,m5);
r2=pass r2;
if ne jump (pc, _L$15) (DB);
comp(r2,r4);
nop;
if ge jump (pc, _L$15) (DB);
r9=r2;
nop;
r1=pass r13, i4=i1;
modify(i4,2);
_L$14:
r2=dm(i0,m5);
r8=dm(1,i4);
comp(r2,r8);
if gt jump (pc, _L$11) (DB);
r12=r2 and r7;
r12=r8-r12;
comp(r12,r3);
nop;
if le jump (pc, _L$12) (DB);
jump (pc, _L$31) (DB);
modify(i3,4);
m4=r4;
_L$11:
dm(i4,m5)=r1;
dm(1,i4)=r2;
_L$12:
r9=r9+1;
comp(r9,r4);
if lt jump (pc, _L$14) (DB);
modify(i0,m6);
r1=r1+1;
jump (pc, _L$32) (DB);
modify(i1,4);
m4=r4;
_L$5:
r2=r4-1;
nop;
m4=r2;
r2=dm(m4,i2);
r8=r2 and r5;
if ne jump (pc, _L$15) (DB); nop;
dm(i1,m5)=r0;
r2=dm(i2,m5);
r2=r2 and r5;
if eq jump (pc, _L$18) (DB);
r9=r8;
nop;
_L$19:
modify(i0,m6);
r2=dm(i0,m5);
r2=r2 and r5;
if ne jump (pc, _L$19) (DB);
r9=r9+1;
nop;
_L$18:
comp(r9,r4);
if ge jump (pc, _L$15) (DB); nop; nop;
r1=r9+r13, i4=i1;
modify(i4,2);
_L$26:
r2=dm(i0,m5);
r8=dm(1,i4);
comp(r2,r8);
if gt jump (pc, _L$23) (DB);
r12=r2 and r7;
r12=r8-r12;
comp(r12,r3);
if le jump (pc, _L$24) (DB);
m4=r4;
nop;
modify(i3,4);
_L$31:
modify(i1,4);
jump (pc, _L$15) (DB);
dm(-1,i4)=r0;
modify(i2,m4);
_L$23:
dm(i4,m5)=r1;
dm(1,i4)=r2;
_L$24:
r9=r9+1;
comp(r9,r4);
if lt jump (pc, _L$26) (DB);
modify(i0,m6);
r1=r1+1;
_L$15:
modify(i1,4);
m4=r4;
_L$32:
modify(i3,4);
modify(i2,m4);
nop;
_L$29:
_L$3:
! FUNCTION EPILOGUE:
i12=dm(-1,i6);
r3=dm(-2,i6);
r5=dm(-3,i6);
r7=dm(-4,i6);
r9=dm(-5,i6);
r11=dm(-6,i6);
r13=dm(-7,i6);
i0=dm(-8,i6);
i1=dm(-9,i6);
i2=dm(-10,i6);
i3=dm(-11,i6);
jump (m14,i12) (DB);
i7=i6;
i6=dm(0,i6);
.endseg;
.segment /dm seg_dmda;
.global _GFrameArray; .var _GFrameArray[7];
.global _GPFPArray;   .var _GPFPArray[32];
.endseg;
The computational burden calculations were made using the program Burd051.c. This is a
C program written for the express purpose of calculating the computational burden of the
peak detection algorithm with varying parameters. This program was shown to correctly
compute the computational burden of processing various situations by checking it against
manual calculations. The program groups the control strands of the peak detection
algorithm based upon the execution time of the strands. Each group is then assigned a
value based upon the most expensive strand in the group, as determined by examining the
assembly code listing above. Therefore, the computational estimates are upper bounds on
the computational burden that will be incurred.
The computational burden calculations performed are shown below.
OverheadNoProcessing = 18;
OverheadProcessing   = 30;
WhileLoopCost        = 6;
ExaminationCost      = 15;

UFC = OverheadNoProcessing;
PFC = OverheadProcessing;
WhC = WhileLoops * WhileLoopCost;
ExC = ExaminedPixels * ExaminationCost;

PixelFrameCost = UFC,             if pixel skipped;
                 PFC + WhC + ExC, if pixel examined;

PixelCost = sum of PixelFrameCost over all frames;
ImageCost = sum of PixelCost over all pixels;

ComputationalBurden = ImageCost * AcousticUpdateRate;
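Rendered as a sketch in Matlab (not the Burd051.c program itself), the bookkeeping above
amounts to the following, assuming the per-pixel, per-frame counts of while-loop
iterations and examined pixels are known:

% Upper-bound computational burden bookkeeping (sketch).
ImageCost = 0;
for p = 1:NumPixels
   for f = 1:NumFrames
      if Skipped(p,f)
         ImageCost = ImageCost + UFC;   % pixel skipped during this frame
      else
         ImageCost = ImageCost + PFC + WhileLoops(p,f)*WhileLoopCost ...
                     + ExaminedPixels(p,f)*ExaminationCost;
      end
   end
end
ComputationalBurden = ImageCost * AcousticUpdateRate;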
D. Listing of test objects used
Object #   Material      Shape      Dimensions
1          PVC           Pipe       114.25 mm diameter, 5.98 mm thick, approx. 380 mm long
2          PVC           Pipe       89.10 mm diameter, 5.62 mm thick, approx. 350 mm long
3          PVC           Pipe       60.42 mm diameter, 4.21 mm thick, approx. 350 mm long
4          PVC           Pipe       48.12 mm diameter, 3.77 mm thick, approx. 275 mm long
5          PVC           Pipe       42.17 mm diameter, 3.64 mm thick, approx. 300 mm long
6          PVC           Pipe       33.40 mm diameter, 3.44 mm thick, approx. 325 mm long
7          PVC           Pipe       26.93 mm diameter, 2.95 mm thick, approx. 350 mm long
8          Copper        Pipe       53.98 mm diameter, 1.02 mm thick, approx. 225 mm long
9          Copper        Pipe       41.10 mm diameter, 1.08 mm thick, approx. 275 mm long
10         Copper        Pipe       34.82 mm diameter, 0.95 mm thick, approx. 300 mm long
11         Copper        Pipe       28.43 mm diameter, 1.20 mm thick, approx. 275 mm long
12         Copper        Pipe       22.20 mm diameter, 0.73 mm thick, approx. 275 mm long
13         Copper        Pipe       16.04 mm diameter, 0.66 mm thick, approx. 325 mm long
14         Brass         Pipe       14.85 mm diameter, 0.50 mm thick, approx. 225 mm long
15         Brass         Pipe       15.96 mm diameter, 0.80 mm thick, approx. 225 mm long
16         Brass         Pipe       15.86 mm diameter, 1.62 mm thick, approx. 275 mm long
17         Brass         Pipe       9.51 mm diameter, 1.75 mm thick, approx. 325 mm long
18         Brass         Pipe       6.33 mm diameter, 2.12 mm thick, approx. 375 mm long
19         Brass         Pipe       35.06 mm diameter, 1.71 mm thick, approx. 210 mm long
20         Brass         Pipe       63.43 mm diameter, 3.18 mm thick, approx. 375 mm long
21         Brass         Cylinder   12.66 mm diameter, approx. 260 mm long
22         Brass         Cylinder   14.26 mm diameter, approx. 250 mm long
23         Brass         Cylinder   25.43 mm diameter, approx. 450 mm long
24         Brass         Cylinder   50.81 mm diameter, approx. 200 mm long
25         Brass         Plate      6.35 mm thick, approx. 125 mm wide, approx. 225 mm long
26         Aluminum      Pipe       22.16 mm diameter, 1.60 mm thick, approx. 275 mm long
27         Aluminum      Pipe       31.71 mm diameter, 2.05 mm thick, approx. 250 mm long
28         Aluminum      Pipe       34.86 mm diameter, 1.42 mm thick, approx. 325 mm long
29         Aluminum      Pipe       38.06 mm diameter, 6.43 mm thick, approx. 550 mm long
30         Aluminum      Pipe       41.26 mm diameter, 0.86 mm thick, approx. 125 mm long
31         Aluminum      Pipe       50.80 mm diameter, 6.40 mm thick, approx. 325 mm long
32         Aluminum      Pipe       76.19 mm diameter, 24.78 mm thick, approx. 160 mm long
33         Aluminum      Pipe       76.23 mm diameter, 6.03 mm thick, approx. 400 mm long
34         Aluminum      Pipe       76.31 mm diameter, 12.75 mm thick, approx. 475 mm long
35         Aluminum      Cylinder   50.77 mm diameter, approx. 270 mm long
36         Aluminum      Cylinder   76.11 mm diameter, approx. 250 mm long
37         Aluminum      Plate      6.50 mm thick, approx. 150 mm wide, approx. 250 mm long
38         Steel - 303   Cylinder   25.60 mm diameter, approx. 425 mm long
39         Steel - 316   Cylinder   25.51 mm diameter, approx. 240 mm long
40         Wood - Pine   Cylinder   37.31 mm diameter, approx. 390 mm long
41         Wood - Pine   Plank      19.32 mm thick, 88.68 mm wide, approx. 410 mm long
E. Equation-based statement of each of the discriminatory tests used
Listing of operators:
As a convention, series will appear in bold and single values will appear normal. Further, an operation that
is performed on a series, unless it is defined as operating on a series and returning a specific value, will
perform the operation on every member of the series. So, for example, x + 2 would add 2 to every element
in the series x.
conv - performs convolution (example: conv(x,y);)
max - finds maximum value in a series (example: max([1 5 3]) = 5)
abs - returns the absolute value of a value (example: abs(-1) = 1)
find - returns the index of a value in a list (example: find([1 5 3] = 5) = 2)
() - index into a series (example: y(2) = 2nd value in the series y)
: - indicates that the elements between the first and the last indices shown are selected from a series
(example: y(start:stop) selects every element of y with an index in the interval spanned by start
and stop, inclusive)
sum - sums the elements of a series (example: sum([1 5 3]) = 9)
^ - exponentiate (example: y^2 squares every element in y)
psd - calculates the power spectral density in the manner described in section 6.3.3 for the input series
(example: [Pxx, F] = psd(y(start:stop)) computes the power spectral density of the data in series y
with indices in the interval spanned by start and stop, inclusive; Pxx and F are series of the same
length - in F is stored every frequency for which a power spectral density value was computed
and in the same location in Pxx is stored the PSD value that corresponds to that frequency)
Background calculations:
USEC = the number of samples in 1 microsecond
Input acoustic recording = x;
Matched filter = m;
y = conv(m,x);
yabs = abs(y);
ymax = max(yabs);
yfirst = find(yabs = ymax)
BEFORE = 20;
AFTER = 479;
Thick-shelled object classification criteria:
Material determination
start = yfirst + 1.4*USEC;
stop = yfirst + 6*USEC;
g = yabs(start:stop);
gmax = max(g);
ysecond = yfirst + find(g = gmax);
start = ysecond + (ysecond - yfirst) - 1*USEC;
stop = ysecond + (ysecond - yfirst) + 1*USEC;
g = yabs(start:stop);
gmax = max(g);
ythird = ysecond + (ysecond - yfirst) + find(g = gmax);
secondval = yabs(ysecond);
thirdval = yabs(ythird);
ratio = thirdval/secondval;
Then just look up what material is indicated by that ratio!
Pipe shell thickness estimate
c = speed of sound in material which the material determination algorithm decided;
samples = ysecond - yfirst;
time = samples/USEC;
pipe shell thickness = time*c/2;
Pipe inner diameter estimate
start = yfirst + 15*USEC;
stop = the end of the data series;
g = yabs(start:stop);
gmax = max(g);
maxloc = start + find(g = gmax);
start = maxloc - samples - 1*USEC;
stop = maxloc - samples + 1*USEC;
g = yabs(start:stop);
gmax = max(g);
checkloc = start + find(g = gmax);
maxval = yabs(maxloc);
checkval = yabs(checkloc);
if (maxval/checkval is a valid ratio for the present material)
then we are in a plate and we should quit
else
we are not in a plate and we have found the front edge of the back wall - keep going
cwater = speed of sound in water;
innersamples = maxloc - ysecond;
time = innersamples/USEC;
inner diameter = time*cwater/2;
"Transfer function" estimate about the first reflection
APxx = PSD of the acoustic pulse;
start = yfirst - BEFORE;
stop = yfirst + AFTER;
h = x(start:stop);
[Pxx, F] = psd(h);
Pxx = Pxx - APxx;
Let TemPxx be a region of a PSD stored in a template for an object.
For every template, do the following:
TotSSQ = 0;
For every region of a PSD stored in the current template do the following:
startFreq = lowest frequency in this region for the template;
stopFreq = highest frequency in this region for the template;
start = find(F = startFreq);
stop = find(F = stopFreq);
maxTemp = max(TemPxx);
maxDat = max(Pxx(start:stop));
Pxx = Pxx + maxTemp - maxDat;
Error = Pxx - TemPxx;
SqError = Error^2;
SSQ = sum(SqError);
TotSSQ = TotSSQ + SSQ;
End of the tasks to be completed for every region of a PSD stored in a template.
"Transfer function" estimate criteria score for the current template = TotSSQ;
End of tasks to be completed for every template.
Thin-shelled object classification criteria:
Signal strength ratios
ysquared = y^2;
start = yfirst + 2*USEC;
stop = yfirst + 3*USEC;
SSQ1 = sum(ysquared(start:stop));
start = stop;
stop = yfirst + 4*USEC;
SSQ2 = sum(ysquared(start:stop));
start = stop;
stop = yfirst + 5*USEC;
SSQ3 = sum(ysquared(start:stop));
start = stop;
stop = yfirst + 6*USEC;
SSQ4 = sum(ysquared(start:stop));
Ratio1 = SSQ1/SSQ2;
Ratio2 = SSQ2/SSQ4;
Ratio3 = SSQ3/SSQ4;
Use these three ratios as the classification criteria!
Length of time from the first reflection to the strongest reflection from the back wall
start = yfirst + 3*USEC;
stop = end of data;
g = yabs(start:stop);
gmax = max(g);
maxloc = start + find(g = gmax);
time = (maxloc - yfirst)/USEC;
"Transfer function" estimate about the first reflection
This criterion is the same as for thick-shelled objects.
Cylindrical object classification criteria
Length of time from first reflection to reflection from the back edge of the cylinder
start = yfirst + 4*USEC;
stop = end of data;
g = yabs(start:stop);
gmax = max(g);
maxloc = start + find(g = gmax);
time = (maxloc - yfirst)/USEC;
"Transfer function" estimate about the first reflection from the back edge of the cylinder
This criterion is the same as for thick-shelled objects except the first six lines should be replaced with the
following:
APxx = PSD of the acoustic pulse;
start = yfirst + 4*USEC;
stop = end of data;
g = yabs(start:stop);
gmax = max(g);
maxloc = start + find(g = gmax);
start = maxloc - BEFORE;
stop = maxloc + AFTER;
h = x(start:stop);
[Pxx, F] = psd(h);
Pxx = Pxx - APxx;
F. Matlab code for the object recognition prototype
Below are the five highest-level functions used during the OR prototype construction and
evaluation. Lower-level functions have been omitted in the interest of saving space and
because mathematical and pseudocode explanations of all the classification criteria have
been provided in appendix E. The code is written in Matlab, version 5.
EvaluateOR: The function that coordinates the model building and the prototype evaluation.
function Templates = EvaluateOR()
CylinderObjectList = [21 22 23 24 35 36 38 39];
SmallShellObjectList = [8 9 10 11 12 13 14 15 16 17 18 19 26 27 28 30];
LargeShellObjectList = [1 2 3 4 5 6 7 20 25 29 31 32 33 34 37];
ObjectList = [CylinderObjectList SmallShellObjectList LargeShellObjectList];
% Randomly select the recordings to use for training and then assign
% those unselected to be used for testing.
TrainingArray = [];
TestingArray = [];
NumToTry = 15;
for j = 1:NumToTry
   Try = ceil(30*rand);
   ii = 1;
   while (ii <= length(TrainingArray))
      if TrainingArray(ii) == Try
         ii = 1;
         Try = ceil(30*rand);
      else
         ii = ii + 1;
      end
   end
   TrainingArray = [TrainingArray Try];
end
TrainingArray = sort(TrainingArray);
for i = 1:30
   loc = find(TrainingArray==i);
   if isempty(loc)
      TestingArray = [TestingArray i];
   end
end
disp(['Forming object templates from the training data...']);
TCyl = CreateCylinderTemplates(CylinderObjectList, TrainingArray);
TSma = CreateSmallShellTemplates(SmallShellObjectList, TrainingArray);
TLar = CreateLargeShellTemplates(LargeShellObjectList, TrainingArray);
Templates = ClassifyObjects(TCyl, TSma, TLar, TestingArray);
NumRight = 0;
NumTotal = 0;
for i = 1:length(Templates)
   for ii = 1:length(Templates(i).Classification)
      if Templates(i).Classification(ii) == Templates(i).Number
         NumRight = NumRight + 1;
      end
      NumTotal = NumTotal + 1;
   end
end
Accuracy = (NumRight/NumTotal)*100;
disp(' ');
disp(['OR classification accuracy = ' num2str(Accuracy) '%']);
disp(['NumRight = ' num2str(NumRight) ';  NumTotal = ' num2str(NumTotal)]);
disp(' ');
d = cd;
cd c:\Dan\ORProto;
FPrintResultsOfOR(Templates, 'Results.dat');
cd(d);
CreateSmallShellTemplates: The function that coordinates the building of the thin-shelled object
templates.
function Templates = CreateSmallShellTemplates(SmallShellObjectArray, TrainingArray);
NumObjs = length(SmallShellObjectArray);
DataPts = length(TrainingArray);
% What regions were found to be important for the PSD template matching?
% These regions were manually typed in here after finding the important
% regions with the program FindPSDTemplatePieceCombos. In future, could automate.
PSDpieces(1).ObjectNum = 8;
PSDpieces(1).Piece(1).p = [1.5e6 2.0e6];
PSDpieces(1).Piece(2).p = [4.0e6 4.6e6];
PSDpieces(2).ObjectNum = 9;
PSDpieces(2).Piece(1).p = [1.4e6 2.0e6];
PSDpieces(2).Piece(2).p = [4.0e6 4.6e6];
PSDpieces(2).Piece(3).p = [5.1e6 5.8e6];
PSDpieces(3).ObjectNum = 10;
PSDpieces(3).Piece(1).p = [1.5e6 2.0e6];
PSDpieces(3).Piece(2).p = [5.2e6 6.0e6];
PSDpieces(4).ObjectNum = 11;
PSDpieces(4).Piece(1).p = [0.1e6 0.5e6];
PSDpieces(4).Piece(2).p = [1.2e6 2.9e6];
PSDpieces(4).Piece(3).p = [4.0e6 5.0e6];
PSDpieces(4).Piece(4).p = [5.2e6 5.8e6];
PSDpieces(5).ObjectNum = 12;
PSDpieces(5).Piece(1).p = [2.2e6 3.0e6];
PSDpieces(5).Piece(2).p = [3.0e6 3.7e6];
PSDpieces(5).Piece(3).p = [5.0e6 6.0e6];
PSDpieces(6).ObjectNum = 13;
PSDpieces(6).Piece(1).p = [3.8e6 4.2e6];
PSDpieces(7).ObjectNum = 14;
PSDpieces(7).Piece(1).p = [3.5e6 4.2e6];
PSDpieces(8).ObjectNum = 15;
PSDpieces(8).Piece(1).p = [2.0e6 2.8e6];
PSDpieces(9).ObjectNum = 16;
PSDpieces(9).Piece(1).p = [1.0e6 2.6e6];
PSDpieces(9).Piece(2).p = [5.0e6 5.6e6];
PSDpieces(10).ObjectNum = 17;
PSDpieces(10).Piece(1).p = [0.5e6 1.1e6];
PSDpieces(10).Piece(2).p = [1.1e6 1.7e6];
PSDpieces(11).ObjectNum = 18;
PSDpieces(11).Piece(1).p = [0.5e6 2.0e6];
PSDpieces(11).Piece(2).p = [3.1e6 4.2e6];
PSDpieces(12).ObjectNum = 19;
PSDpieces(12).Piece(1).p = [0.7e6 1.2e6];
PSDpieces(13).ObjectNum = 26;
PSDpieces(13).Piece(1).p = [1.3e6 2.0e6];
PSDpieces(14).ObjectNum = 27;
PSDpieces(14).Piece(1).p = [0.7e6 1.5e6];
PSDpieces(15).ObjectNum = 28;
PSDpieces(15).Piece(1).p = [1.5e6 2.3e6];
PSDpieces(16).ObjectNum = 30;
PSDpieces(16).Piece(1).p = [2.5e6 3.6e6];
disp('Calculating the acoustic pulse PSD...');
[AcousticPulse, F] = AcousticPulsePSD;
d = cd;
cd c:\Dan\Data\052599;
for i = 1:NumObjs
   Num = SmallShellObjectArray(i);
   Templates(i).Number = Num;
   pind = 1;
   while PSDpieces(pind).ObjectNum ~= Num
      pind = pind + 1;
   end
   for j = 1:length(PSDpieces(pind).Piece)
      Templates(i).PSDTemplate(j).StartFreq = PSDpieces(pind).Piece(j).p(1);
      Templates(i).PSDTemplate(j).StopFreq = PSDpieces(pind).Piece(j).p(2);
   end
   disp(['Calculating the transfer function estimate for object #' num2str(Num)]);
   [FullPSDTemplate, F] = TrialFullPSDTemplateCreator(Templates(i), TrainingArray, AcousticPulse);
   Templates(i) = TrialPSDTemplateCreator(F, FullPSDTemplate, Templates(i));
   disp(['Creating the template for object #' num2str(Num)]);
   RatioMatrix = [];
   TimeArray = [];
   HssqArray = [];
   for ii = 1:DataPts
      DataNum = TrainingArray(ii);
      string = ['load ' num2str(Num) '_' num2str(DataNum) '.asc;'];
      eval(string);
      string = ['curr = X' num2str(Num) '_' num2str(DataNum) ';'];
      eval(string);
      Ratios = SmallShellRing4(curr);
      RatioMatrix = [RatioMatrix; Ratios];
      Time = NextPeakLocater(curr);
      TimeArray = [TimeArray Time];
      [H, F] = MakeEstimateOfH(curr, AcousticPulse);
      Hssq = PSDTemplateMatch(F, H, Templates(i).PSDTemplate);
      HssqArray = [HssqArray Hssq];
   end
   Templates(i).TimeMean = mean(TimeArray);
   Templates(i).TimeStd = std(TimeArray);
   Templates(i).RatioMeanArray = mean(RatioMatrix);
   Templates(i).RatioStdArray = std(RatioMatrix);
   Templates(i).HssqMean = mean(HssqArray);
   Templates(i).HssqStd = std(HssqArray);
end
cd(d);
CreateLargeShellTemplates: The function that coordinates the building of the thick-shelled object
templates.
function Templates = CreateLargeShellTemplates(LargeShellObjectArray, TrainingArray);
PVC = 1;
BRASS = 2;
ALUMINUM = 3;
UNKNOWN = 4;
IDTHRESH
= 2;
% ID deviation threshold so that the data may be
% median filtered prior to template formation.
NumObjs = length(LargeShellobjectArray);
DataPts = length(TrainingArray);
% What regions were found to be important for the PSD
% template matching? These regions were manually typed in here after
% finding the important regions with the program FindPSDTemplatePieceCombos.
132
% In future, could automate.
PSDpieces(1).ObjectNum = 1;
PSDpieces(l).Piece(l).p = [l.0e6 2.0e6];
PSDpieces(l).Piece(2).p = [3.0e6 4.0e6];
PSDpieces(l).Piece(3).p = [4.0e6 5.0e6];
PSDpieces(2).ObjectNum = 2;
PSDpieces(2).Piece(l).p = [0.7e6 2.1e6];
PSDpieces(2).Piece(2).p = [3.0e6 4.0e6];
PSDpieces(3)'.ObjectNum = 3;
PSDpieces(3).Piece(l).p = [0.5e6 2.2e6];
PSDpieces(3).Piece(2).p = [3.0e6 5.0e6];
PSDpieces(4).ObjectNum = 4;
PSDpieces(4).Piece(l).p = [2.4e6 4.0e6];
PSDpieces(4).Piece(2).p = [4.8e6 5.8e6];
PSDpieces(5).ObjectNum = 5;
PSDpieces(5).Piece(l).p =
[0.5e6 2.4e6];
PSDpieces(6).ObjectNum = 6;
PSDpieces(6).Piece(l).p = [0.5e6 2.3e6];
PSDpieces(7).ObjectNum = 7;
PSDpieces(7).Piece(l).p = [0.5e6 2.3e6];
PSDpieces(8).ObjectNum = 20;
PSDpieces(8).Piece(l).p =
PSDpieces(8).Piece(2).p =
[0.5e6 2.4e6);
[2.6e6 4.0e6];
PSDpieces(9).ObjectNum = 25;
PSDpieces(9).Piece(l).p = (0.4e6 1.5e6];
PSDpieces(9).Piece(2).p =
[1.5e6 3.0e6];
PSDpieces(10).ObjectNum = 29;
PSDpieces(10).Piece(l).p = [0.1e6 1.5e6];
PSDpieces(10).Piece(2).p = [2.5e6 3.5e6];
PSDpieces(10).Piece(3).p = [4.1e6 6.0e6];
PSDpieces(ll).ObjectNum = 31;
PSDpieces(ll).Piece(l).p = [0.4e6 1.8e6];
PSDpieces(ll).Piece(2).p = [1.8e6 2.4e6];
PSDpieces(ll).Piece(3).p = [2.4e6 3.5e6];
PSDpieces(ll).Piece(4).p =
[4.8e6 6.0e6];
PSDpieces(12).ObjectNum = 32;
PSDpieces(12).Piece(l).p = [3.8e6 4.7e6];
PSDpieces(12).Piece(2).p = [4.7e6 6.0e6];
PSDpieces(13).ObjectNum = 33;
PSDpieces(13).Piece(1).p = [0.5e6 2.2e6];
PSDpieces(14).ObjectNum = 34;
PSDpieces(14).Piece(1).p = [0.5e6 2.0e6];
PSDpieces(14).Piece(2).p = [2.0e6 4.Oe6];
PSDpieces(14).Piece(3).p = [4.0e6 6.0e6];
PSDpieces(15).ObjectNum = 37;
PSDpieces(15).Piece(1).p =
[0.5e6 1.6e6];
disp('Calculating the acoustic pulse PSD...');
[AcousticPulse, F] = AcousticPulsePSD;
d = cd;
cd c:\Dan\Data\052599;
for i = 1:NumObjs
Num
LargeShellObjectArray(i);
Templates(i).Number = Num;
pind = 1;
while PSDpieces(pind).ObjectNum
-=
Num
pind = pind + 1;
end
for j = 1:length(PSDpieces(pind).Piece)
133
Templates(i).PSDTemplate(j).StartFreq = PSDpieces(pind).Piece(j).p(l);
Templates(i).PSDTemplate(j).StopFreq = PSDpieces(pind).Piece(j).p(2);
end
disp(['Calculating the transfer function estimate for object #' num2str(Num)]);
[FullPSDTemplate, F] = TrialFullPSDTemplateCreator(Templates(i), TrainingArray,
AcousticPulse);
Templates(i) = TrialPSDTemplateCreator(F, FullPSDTemplate, Templates(i));
disp(['Creating the template for object #' num2str(Num)]);
MaterialArray =
WallArray = [];
IDArray = [;
HssqArray = [];
for ii = 1:DataPts
DataNum = TrainingArray(ii);
string = ['load ' num2str(Num) '_' num2str(DataNum) '.asc;'];
eval(string);
string = ['curr = X' num2str(Num) '_' num2str(DataNum) ';'];
eval(string);
Material = DetermineMaterial(curr);
MaterialArray = [MaterialArray Material];
Wall = LargeShellWallThickness(curr, Material);
ID = LargeShellID(curr, Material, Wall);
if Material ~= UNKNOWN
WallArray = [WallArray Wall];
IDArray = [IDArray ID];
end
[H, F] = MakeEstimateOfH(curr, AcousticPulse);
Hssq = PSDTemplateMatch(F, H, Templates(i).PSDTemplate);
HssqArray = [HssqArray Hssq];
end
Templates(i).Material = median(MaterialArray);
Templates(i).WallMean = mean(WallArray);
Templates(i).WallStd = std(WallArray);
% IDArray requires median filtering because very infrequently
% there is an outlier which greatly increases the IDStd!
IDArray = MedianFilter(IDArray, IDTHRESH);
Templates(i).IDMean = mean(IDArray);
Templates(i).IDStd = std(IDArray);
Templates(i).HssqMean = mean(HssqArray);
Templates(i).HssqStd = std(HssqArray);
end
cd(d);
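MedianFilter, called above to keep a rare outlier from inflating IDStd, is not listed in this section. One plausible reading of the IDTHRESH comment ("ID deviation threshold") is that samples lying more than IDTHRESH standard deviations from the median are discarded; the following is a minimal sketch under that assumption, not the thesis implementation.
function Filtered = MedianFilter(Data, Thresh)
% Assumed behavior: drop samples more than Thresh standard deviations
% from the median of the data, so that an infrequent outlier cannot
% inflate the spread statistics computed afterward.
m = median(Data);
s = std(Data);
Filtered = Data(abs(Data - m) <= Thresh * s);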
CreateCylinderTemplates: The function that coordinates the building of the cylindrical object templates.
function Templates = CreateCylinderTemplates(CylinderObjectArray, TrainingArray);
CYLDIATHRESH = 0.25;
% Threshold to use while determining time to
% back wall of a cylinder.
NumObjs = length(CylinderObjectArray);
DataPts = length(TrainingArray);
% What regions were found to be important for the PSD
% template matching? These regions were manually typed in here after
% finding the important regions with the program FindPSDTemplatePieceCombos.
% In future, could automate.
PSDpieces(1).ObjectNum = 21;
PSDpieces(1).Piece(1).p = [0.5e6 1.1e6];
PSDpieces(1).Piece(2).p = [1.1e6 2.1e6];
PSDpieces(1).Piece(3).p = [2.1e6 3.8e6];
PSDpieces(1).Piece(4).p = [3.8e6 5.0e6];
PSDpieces(2).ObjectNum = 22;
PSDpieces(2).Piece(1).p = [0.8e6 1.6e6];
PSDpieces(2).Piece(2).p = [1.6e6 2.4e6];
PSDpieces(3).ObjectNum = 23;
PSDpieces(3).Piece(1).p = [0.5e6 2.5e6];
PSDpieces(4).ObjectNum = 24;
PSDpieces(4).Piece(1).p = [0.3e6 1.0e6];
PSDpieces(4).Piece(2).p = [1.0e6 4.0e6];
PSDpieces(4).Piece(3).p = [4.0e6 5.0e6];
PSDpieces(5).ObjectNum = 35;
PSDpieces(5).Piece(1).p = [1.3e6 2.1e6];
PSDpieces(5).Piece(2).p = [2.1e6 3.2e6];
PSDpieces(5).Piece(3).p = [3.2e6 4.4e6];
PSDpieces(6).ObjectNum = 36;
PSDpieces(6).Piece(1).p = [0.3e6 1.0e6];
PSDpieces(6).Piece(2).p = [1.0e6 4.0e6];
PSDpieces(6).Piece(3).p = [4.0e6 5.0e6];
PSDpieces(7).ObjectNum = 38;
PSDpieces(7).Piece(1).p = [0.5e6 1.5e6];
PSDpieces(7).Piece(2).p = [2.2e6 3.2e6];
PSDpieces(8).ObjectNum = 39;
PSDpieces(8).Piece(1).p = [2.0e6 2.7e6];
PSDpieces(8).Piece(2).p = [2.8e6 4.0e6];
disp('Calculating the acoustic pulse PSD...');
[AcousticPulse, F] = AcousticPulsePSD;
d = cd;
cd c:\Dan\Data\052599;
for i = 1:NumObjs
Num = CylinderObjectArray(i);
Templates(i).Number = Num;
pind = 1;
while PSDpieces(pind).ObjectNum ~= Num
pind = pind + 1;
end
for j = 1:length(PSDpieces(pind).Piece)
Templates(i).PSDTemplate(j).StartFreq = PSDpieces(pind).Piece(j).p(1);
Templates(i).PSDTemplate(j).StopFreq = PSDpieces(pind).Piece(j).p(2);
end
disp(['Calculating the transfer function estimate for object #' num2str(Num)]);
[FullPSDTemplate, F] = CylinderFullPSDTemplateCreator(Templates(i), TrainingArray, AcousticPulse);
Templates(i) = TrialPSDTemplateCreator(F, FullPSDTemplate, Templates(i));
disp(['Creating the template for object #' num2str(Num)]);
DiaTimeArray = [];
HssqArray = [];
for ii = 1:DataPts
DataNum = TrainingArray(ii);
string = ['load ' num2str(Num) '_' num2str(DataNum) '.asc;'];
eval(string);
string = ['curr = X' num2str(Num) '_' num2str(DataNum) ';'];
eval(string);
DiaTime = CylinderThickness(curr, CYLDIATHRESH);
DiaTimeArray = [DiaTimeArray DiaTime];
[H, F] = CylMakeEstimateOfH(curr, AcousticPulse);
Hssq = PSDTemplateMatch(F, H, Templates(i).PSDTemplate);
HssqArray = [HssqArray Hssq];
end
Templates(i).DiaTimeMean = mean(DiaTimeArray);
Templates(i).DiaTimeStd = std(DiaTimeArray);
Templates(i).HssqMean = mean(HssqArray);
Templates(i).HssqStd = std(HssqArray);
end
cd(d);
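A note on the load/eval idiom that appears in all three template builders: each data file is named <object>_<data>.asc, and when MATLAB loads an ASCII file it names the resulting matrix after the file, prefixing it with 'X' because a variable name cannot begin with a digit; this is why the code assigns curr = X... after each load. Illustratively (the particular file name 21_3.asc is hypothetical):
eval('load 21_3.asc;');
% Creates the workspace variable X21_3, named after the file.
eval('curr = X21_3;');
% Copy the echo record into curr for the feature-extraction routines.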
ClassifyObjects: The function that applies the templates and selects the most similar object as the
classification.
function Templates = ClassifyObjects(TCyl, TSma, TLar, TestingArray);
PVC = 1;
BRASS = 2;
ALUMINUM = 3;
UNKNOWN = 4;
CylinderObjectList = [];
for i = 1:length(TCyl)
CylinderObjectList = [CylinderObjectList TCyl(i).Number];
end
SmallShellObjectList = [];
for i = 1:length(TSma)
SmallShellObjectList = [SmallShellObjectList TSma(i).Number];
end
LargeShellObjectList = [];
for i = 1:length(TLar)
LargeShellObjectList = [LargeShellObjectList TLar(i).Number];
end
ObjectList = [CylinderObjectList SmallShellObjectList LargeShellObjectList];
NumObjs = length(ObjectList);
DataPts = length(TestingArray);
%disp(ObjectList)
disp('Calculating the acoustic pulse PSD...');
[AcousticPulse, F] = AcousticPulsePSD;
d = cd;
cd c:\Dan\Data\052599;
for i = 1:NumObjs
Num = ObjectList(i);
disp(['Classifying the testing data for object #' num2str(Num)]);
Templates(i).Number = Num;
SearchSpace = [];
for ii = 1:DataPts
DataNum = TestingArray(ii);
string = ['load ' num2str(Num) '_' num2str(DataNum) '.asc;'];
eval(string);
string = ['curr = X' num2str(Num) '_' num2str(DataNum) ';'];
eval(string);
SmallScores = CalcSmallScores(TSma, curr, AcousticPulse);
Material = DetermineMaterial(curr);
if Material == UNKNOWN
SearchSpace = [SmallShellObjectList CylinderObjectList];
CylinderScores = CalcCylinderScores(TCyl, curr, AcousticPulse);
Scores = [SmallScores(1,:) CylinderScores(1,:)];
else
SearchSpace = [SmallShellObjectList LargeShellObjectList];
LargeScores = CalcLargeScores(TLar, curr, Material, AcousticPulse);
Scores = [SmallScores(1,:) LargeScores(1,:)];
end
% This is like golf - lowest score wins!  (score represents std
% from the template)
MinScore = min(Scores);
Index = find(Scores==MinScore);
Classification = SearchSpace(Index);
Templates(i).Classification(ii) = Classification;
Templates(i).Scores(ii).list = Scores;
Templates(i).SmallScores(ii).list = SmallScores(1:end,:);
if Material == UNKNOWN
Templates(i).CylinderScores(ii).list = CylinderScores(1:end,:);
Templates(i).LargeScores(ii).list = [];
else
Templates(i).CylinderScores(ii).list = [];
Templates(i).LargeScores(ii).list = LargeScores(1:end,:);
end
disp(['Data piece #' num2str(TestingArray(ii)) ': Classification = ' num2str(Classification)]);
disp(Scores);
end
end
cd(d);
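Taken together, the template builders and the classifier form a train-then-test pipeline. The following driver sketch shows how they might be invoked; the cylinder and large-shell object lists are taken from the PSDpieces tables above, while the train/test split and the small-shell object list are illustrative assumptions (the small-shell builder, presumably CreateSmallShellTemplates, is listed earlier in this appendix).
% Illustrative train/test split; the split actually used is not shown here.
TrainingArray = 1:2:19;
TestingArray = 2:2:20;
% Object numbers taken from the PSDpieces tables above.
CylinderObjectArray = [21 22 23 24 35 36 38 39];
LargeShellObjectArray = [1 2 3 4 5 6 7 20 25 29 31 32 33 34 37];
TCyl = CreateCylinderTemplates(CylinderObjectArray, TrainingArray);
TSma = CreateSmallShellTemplates(SmallShellObjectArray, TrainingArray); % small-shell list assumed defined
TLar = CreateLargeShellTemplates(LargeShellObjectArray, TrainingArray);
Results = ClassifyObjects(TCyl, TSma, TLar, TestingArray);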