Computerized Respiratory Sound Analysis: An Exploration

Computerized Respiratory Sound
Analysis: An Exploration of Methods
Thesis by Urvi Patel
CWRU School of Engineering, Department of Physics
Advisor: Dr. Ronald Cechner
Anesthesiology, University Hospitals
CWRU Biomedical Engineering Department
Wednesday, May 2, 2011
Abstract
The presence of abnormal breath sounds is an indicator of a variety of
lung diseases, including pneumonia and chronic bronchitis. Traditionally, abnormal
breath sounds are detected using stethoscopes and qualitative methods based on a
physician’s own hearing. Computerized methods of respiratory sound analysis will
provide a quantitative basis for abnormal respiratory sound detection. The overall goal
of this project is to progress towards building a non-obstructive device that can
continuously monitor respiratory signals in order to detect and accurately classify
wheezes, crackles, and normal breath sounds. As a first step in this process, we
analyzed pre-recorded, pre-classified breath sounds from the R.A.L.E. Repository
using two methods: (1) Fast Fourier Transforms and (2) Wavelet Transforms in
conjunction with Artificial Neural Network Classification. The goal of this analysis has
been to identify the key distinguishing features of the primary breath sounds (wheezes,
crackles, and normal sounds) and to determine the effectiveness of the two methods.
The future direction of this project depends on developing good signal analysis
techniques required to efficiently detect respiratory signals.
Contents
ABSTRACT
CONTENTS
LIST OF FIGURES
PREFACE
CHAPTER 1: INTRODUCTION TO RESPIRATORY SOUND ANALYSIS
CHAPTER 2: FOURIER TRANSFORMS IN THE ANALYSIS OF RESPIRATORY SIGNALS
CHAPTER 3: WAVELET TRANSFORM ANALYSIS
CHAPTER 4: CONCLUSIONS & FUTURE WORK
CHAPTER 5: DISCUSSION OF RESULTS & FUTURE WORK
REFERENCES
List of Figures
Figure 1: Fast Fourier Transform of a Normal Sound, Example 1.
Figure 2: Fast Fourier Transform of a Normal Sound, Example 2.
Figure 3: Fast Fourier Transform of Wheezing, Example 1.
Figure 4: Fast Fourier Transform of Wheezing, Example 2.
Figure 5: Fast Fourier Transform of Crackles, Example 1.
Figure 6: Fast Fourier Transform of Crackles, Example 2.
Figure 7: The Daubechies family of wavelets from the 2nd order to the 10th order.
Figure 8: Decomposition tree of the Discrete Wavelet Transform.
Figure 9: Illustration of an Artificial Neural Network Training Model.
Figure 10: Illustration of a Multi-level Artificial Neural Network.
Figure 11: Confusion Matrices for one Trial Run of the Artificial Neural Network.
Preface
The roots of this project dig back to the summer of 2009, after my sophomore year at
Case Western Reserve University. That summer, two other students, Joe Karas and
Stefan Maurer, and I built the first prototype of what we called an “Abnormal Respiratory
Sound Detector” as a part of a summer engineering design experience sponsored by
the CWRU Biomedical Engineering Department, the Case Alumni Association, and
Case’s Rising Engineers and Technological Entrepreneurs (C.R.E.A.T.E.).
Under the advisement of both Dr. Ronald Cechner and Dr. Dustin Tyler, we learned
indispensable engineering techniques and rudimentary signal analysis to build a crude
“proof-of-concept” device. We learned many lessons that year; most important, the
lesson of plowing through the “inexperience roadblock” in favor of progress. Although
we tried to overcome the challenges of our inexperience with signal analysis and
design, we did not succeed in our efforts to reliably detect and classify abnormal breath
sounds in real-time that summer.
Two years later, I chose to revisit my effort to detect abnormal breath sounds. By then,
I had a greater concept of digital systems design, but my understanding of signal
analysis techniques has always been shaky. I wanted to focus on developing the heart
of the prototype my team and I aimed to build in 2009: signal analysis, with an eye
towards accurately classifying signals, and algorithm development—as opposed to
prototype hardware design. Thus, my senior project was born. - Urvi Patel
Acknowledgements
This project would not have been possible without the support of and collaboration with
my original project team: Stefan Maurer and Joe Karas. The original project would not
have been possible without the generous support by CWRU Biomedical Engineering
Department, the Case Alumni Association, C.R.E.A.T.E. and Dr. Dustin Tyler.
That said, during the resurrection of the original project, I have received unrelenting
support from Dr. Rolfe Petschek and the CWRU Physics Department. Without their
support, I would still be stuck understanding how to apply Fourier Transforms properly.
Finally, I would like to thank Dr. Ronald Cechner, who has been my patient advisor and
mentor since the birth of the original project and throughout the project’s reincarnation
as my senior project. This project is his brainchild, and I hope that my work here will be
a useful reference for anyone who chooses to contribute to this study.
Chapter 1:
Introduction to Respiratory Sound
Analysis
History of Respiratory Signal Analysis
The importance of listening to and understanding respiratory sounds is evident from the
iconic and symbolic usage of the stethoscope in modern medicine. The stethoscope
was invented in 1816 by the French physician Laennec, upon the discovery that
respiratory sound analysis aids in the diagnosis of pulmonary infections and diseases,
such as acute bronchitis and pneumonia [2, 9]. Since then, stethoscopes have become
the most common diagnostic tool used by doctors in the twenty-first century [7, 9]. Despite its
widespread use, however, analysis of respiratory sounds using stethoscopes is
rudimentary at best and requires a degree of subjectivity from the physician [6, 7, 9].
Analysis of respiratory sounds using stethoscopes depends on the variable factors of
the diagnosing physician’s experiences, hearing, and ability to recognize and
differentiate patterns [9]. In addition, stethoscope data is not typically recordable,
making long-term correlation of data difficult [6, 9]. All of these factors reduce the value
stethoscopes bring to a world that increasingly demands quantitative measures of
disease.
Over the last four decades, researchers have made significant progress in fine-tuning
computerized signal processing techniques, and it is now possible to perform
respiratory sound analysis using many of these techniques [7]. Computerized analysis
has succeeded in producing graphical representations of respiratory signals, which
provide physicians additional methods of pulmonary diagnosis [7]. The use of
computational power to analyze pulmonary spectral data has advanced respiratory
sound analysis from a subjective skill to an objective one [8, 9]. In addition, the recent
availability of cheap computer memory has enabled permanent storage of recorded
respiratory sounds [8]. The next step in computerized analysis of respiratory signals is
to automate the classification of respiratory sounds based on real-time data.
Types of Respiratory Sounds
Respiratory signals can be classified into two major categories: normal lung sounds
(NLS) and abnormal lung sounds (ALS) [10]. Most abnormal lung sounds are both
adventitious and nonstationary. While many types of abnormal lung sounds exist, the
two major categories of abnormal lung sounds are wheezes and crackles [10].
A wheeze is a continuous adventitious sound that is characteristically “musical” in
nature [4]. Wheezing is usually caused by airway obstruction in the lungs [10]. The
presence of wheezes during breathing can indicate asthma, cystic fibrosis, and
bronchitis in a patient [4, 10]. Wheezes are high-pitched in relation to normal breath
sounds, and their frequency distribution usually falls in the 400 Hz to 600 Hz range
[10]. They typically last for longer than 100 ms [4]. Because wheezes have a defined
frequency range, frequency domain analysis of a respiratory signal can reveal a
wheeze.
A crackle is a discontinuous adventitious sound that is characterized by sharp bursts of
energy [4]. Their duration is typically shorter than 20 ms, and they are characterized by
a wide distribution of frequencies [4]. Because of this wide frequency distribution, it is
difficult to pinpoint crackles in the frequency domain. Crackles can be broken down into
two additional categories. Fine crackles are high-pitched crackles that occur repeatedly
over inspiration across multiple breathing cycles [10]. Coarse crackles are low-pitched
sounds that appear early during inspiration or sometimes during expiration as a result of
liquid filling small airways in the lungs [10]. The presence of crackles can indicate
cardiorespiratory diseases, pneumonia, and chronic bronchitis [4].
The Near Future of Respiratory Signal Analysis
Over the last thirty years, various methods of computerized respiratory sound analysis
have attempted to distinguish and classify abnormal lung sounds. These methods
include the usage of Fast Fourier Transforms, Short Time Fourier Transforms, Wavelet
Transforms, fuzzy logic classification, autoregressive modeling and neural network
classification—among others. The study of computerized detection methods for
abnormal lung sounds continues to grow as computer classification algorithms become
more sophisticated over time. This project explores the detection of abnormal lung
sounds in respiratory signals using the Fast Fourier Transform and the Wavelet
Transform (in conjunction with neural network classification).
Chapter 2:
Fourier Transforms in the Analysis of
Respiratory Signals
Introduction
The Fourier Transform is known to be one of the most widely-used techniques in signal
analysis. Applying a Fourier transform to an input signal yields a representation of the
major frequency components of the signal. In respiratory signal analysis, the Fourier
transform is particularly useful in revealing the presence of wheezes because wheezes
occur in a known frequency band between 400 Hz and 600 Hz [10]. The Fourier
transform is less useful for pinpointing crackles, because crackles have a wide
frequency distribution. This chapter illustrates the use of Fourier analysis in the
detection of both wheezes and crackles based on our experimental data.
Overview of Fourier Transforms
Understanding the Fourier Transform is an important part of understanding why it is
useful in signal analysis, especially in the analysis of sinusoid-like signals such as
wheezes that occur in a narrow band of frequencies.
The basis of the Fourier Transform is the sine wave [4]. Mathematically, the Continuous
Fourier Transform is described as [4]:
$$X(\omega) = \int_{-\infty}^{\infty} x(t)\, e^{-j\omega t}\, dt \qquad (1)$$
In Equation 1, $x(t)$ is the input signal [4]. In respiratory signal analysis, this input signal
is usually a time domain representation of the signal: amplitude of the sound as a
function of time [4]. A microphone recording of a respiratory signal is a time domain
representation.
Referring back to Equation 1, the input signal is then multiplied by a complex
exponential [4]. Recall that, according to Euler’s formula, the complex exponential can
be broken into real and imaginary sinusoidal components [4]:
$$e^{j\omega t} = \cos(\omega t) + j \sin(\omega t) \qquad (2)$$
In Fourier analysis, integrating the product of the input signal and the complex
exponential over all time is similar to finding an “inner product” of two vectors: the
equation calculates a set of coefficients that describe the “similarity” of the input signal
to the complex exponential [4]. In other words, the Fourier coefficients, $X(\omega)$, store
how similar the input signal is to a series of sinusoids with frequency $\omega$ [4].
Equation 1 describes the Continuous Fourier Transform [4]. Several other variations of
the Continuous Fourier Transform exist, including Short Time Fourier Transform and
Discrete Fourier Transforms [4]. The general concept of applying these variations to
input signals is the same as described above [4]. The Short Time Fourier Transform
adds a “windowing” mechanism that allows Fourier analysis over shorter segments of
time [4]. The Discrete Fourier Transform is useful for computerized calculations,
where the integrals must be replaced by numerical summations [4]. An extremely
efficient algorithm that calculates the Discrete Fourier Transform, known as the Fast
Fourier Transform, can be used for computerized Fourier analysis of respiratory signals
[4].
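To make the replacement of the integral by a summation concrete, the standard definition of the Discrete Fourier Transform for a block of $N$ samples $x[n]$ is given below. The symbols $N$, $n$, $k$, and $f_s$ (the sampling rate) are notation introduced here for illustration and do not come from the cited source.

$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}, \qquad k = 0, 1, \ldots, N-1$$

$$f_k = \frac{k\, f_s}{N}$$

Each coefficient $X[k]$ measures the similarity of the block to a sinusoid of frequency $f_k$. As a worked example, a 50-ms block sampled at 11025 samples/second contains $N = 551$ samples, so adjacent frequency bins are spaced $11025 / 551 \approx 20$ Hz apart.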
Methods
A set of pre-recorded, pre-classified respiratory signals were obtained from the R.A.L.E.
Repository [9]. The R.A.L.E. Repository hosts a set of well-documented, 10-second
respiratory signals [9]. The set of signals includes three normal sound files, two crackle
sound files, and two wheeze sound files. The signals are pre-filtered with an analog
low-pass filter at 2500 Hz and were sampled at 11025 samples/second.
Using MATLAB, a Fast Fourier Transform was applied to every 50-ms interval of the
input signal for each of the sample signals obtained from the R.A.L.E. Repository.
Time domain and frequency domain plots were generated for each of these 50-ms
segments to visualize the data.
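The windowing procedure described above can be sketched in a few lines. The analysis in this project was performed in MATLAB; the sketch below is a Python illustration only, not the code used to produce the results. It assumes a naive Discrete Fourier Transform in place of a true FFT (a library routine such as numpy.fft.rfft would normally be used), and the function names and the synthetic 500 Hz test tone are hypothetical.

```python
import cmath
import math

FS = 11025        # sampling rate stated in the Methods (samples/second)
WINDOW_S = 0.050  # 50-ms analysis window


def split_windows(signal, fs=FS, window_s=WINDOW_S):
    """Cut the signal into consecutive, non-overlapping 50-ms segments."""
    n = int(fs * window_s)
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]


def dft_magnitudes(segment):
    """Magnitudes of DFT bins 0..N/2 (naive O(N^2) sum, for illustration)."""
    n = len(segment)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(segment))
        mags.append(abs(acc))
    return mags


def dominant_frequency(segment, fs=FS):
    """Frequency (Hz) of the largest non-DC bin in the segment."""
    mags = dft_magnitudes(segment)
    k = max(range(1, len(mags)), key=mags.__getitem__)
    return k * fs / len(segment)


# Hypothetical test signal: 0.1 s of a pure 500 Hz tone (wheeze-like pitch).
tone = [math.sin(2 * math.pi * 500 * i / FS) for i in range(int(FS * 0.1))]
windows = split_windows(tone)
peaks = [dominant_frequency(w) for w in windows]
```

Each entry of `peaks` falls in the bin nearest 500 Hz, which is the kind of sharp spectral peak a wheeze would be expected to produce in a single 50-ms window.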
Results
Some example plots resulting from applying the Fast Fourier Transform to various types
of respiratory signals are shown in this section. The plots reveal both time domain and
frequency domain data across 50 ms of data. Approximately 200 plots were produced,
each representing the result of performing a Fast Fourier Transform over 50 ms of the 10
seconds of data available for each sound file.
Figure 1: Fast Fourier Transform of a Normal Sound, Example 1. The signal in the time domain is very
smooth. The frequency domain representation reveals that most frequencies are below 100 Hz.
Figure 2: Fast Fourier Transform of a Normal Sound, Example 2. As in the earlier example, the time
domain is smooth, and the frequency domain consists mostly of frequencies below 100 Hz.
Figure 3: Fast Fourier Transform of Wheezing, Example 1. Like the normal sound, the time domain of
the signal is devoid of discontinuities; however, the frequency of the signal has increased. The frequency
domain plot reveals major frequency components around 400 Hz. This pitch matches the description for
a wheeze sound, defined earlier.
Figure 4: Fast Fourier Transform of Wheezing, Example 2. As in Figure 3, the frequency domain plot
reveals a major frequency component around 400 Hz.
Figure 5: Fast Fourier Transform of Crackles, Example 1. Unlike wheezes and normal sounds, crackles reveal
sharp discontinuities in the time domain. The frequency domain plot reveals a wide range of major frequency
components. This description matches crackle characteristics.
Figure 6: Fast Fourier Transform of Crackles, Example 2. Again, crackles reveal sharp discontinuities in the time
domain. The frequency domain plot reveals a wide range of major frequency components. This description again
matches crackle characteristics.
Based on visual inspection of the plots in this section, crackles, wheezes, and normal
sounds were successfully identified. Wheezes were revealed in the frequency domain
as sharp peaks around 400 Hz. While Fourier analysis was unable to pinpoint crackles,
crackles were visually classified by noting the discontinuities in the time domain signal
and the wide frequency distribution in the frequency domain.
Chapter 3:
Wavelet Transforms in the Analysis of
Respiratory Signals & Neural Network
Classification
Introduction
An increasing number of studies are now exploring respiratory sound analysis using
Wavelet Transforms. Unlike Fourier Transforms, Wavelet Transforms are able to detect
sharp discontinuities in a signal, such as crackles. As an added benefit, Wavelet
Transforms can also detect gradual, sinusoid-like characteristics of a signal, such as
wheezes. Unlike the output of Fourier Transforms, however, the output data from
Wavelet Analysis is difficult to visualize.
Once a Wavelet transform is applied to a signal, the output of the wavelet transform
(namely, the wavelet coefficients) may be analyzed. This analysis can be combined
with various pattern recognition schemes, such as Artificial Neural Network
Classification, to classify the input signals. The following sections provide the details of
the application of Wavelet Transforms, in conjunction with Artificial Neural Network
classification, to analyze respiratory signals.
Overview of Wavelet Transforms
Like the Fourier Transform, the Wavelet Transform compares an analyzing function
against an input signal [4]. Continuing the earlier discussion, the mathematical
representation of a 1D Continuous Wavelet Transform is [4]:
$$CWT(s, \tau) = \frac{1}{\sqrt{|s|}} \int_{-\infty}^{\infty} x(t)\, \psi^{*}\!\left(\frac{t - \tau}{s}\right) dt \qquad (3)$$
As I noted earlier, the analyzing function for a Fourier transform is the sine wave [4].
During Fourier analysis, the input signal is broken into a series of sinusoids of various
frequencies [4]. Similarly, the Wavelet Transform’s analyzing function is a wavelet,
represented by $\psi$ in Equation 3 [4]. See Figure 7 later in this section for a set of
example wavelets [4]. The input signal, denoted by $x(t)$ in Equation 3, is compared to
versions of the analyzing wavelet scaled by $s$ and shifted by $\tau$ [4]. In respiratory signal analysis,
the input signal is typically a time domain representation of the signal [4].
As with Fourier analysis, performing Wavelet analysis is similar to calculating the “inner
product” of the input signal and scaled and shifted versions of the wavelet [4]. The
measure of “similarity” obtained from this calculation is stored in a set of Wavelet
Transform Coefficients (CWT) [4].
One of the main differences between the Fourier Transform and the Wavelet Transform
is that the Wavelet Transform takes into account the “scale” of the analyzing wavelet [4].
The scale is denoted by s in Equation 3 [4]. During Wavelet Analysis, Wavelet
Transform Coefficients are calculated by comparing the input signal to scaled and
shifted versions of the analyzing wavelet [4]. “Scaling” the wavelet involves stretching
or compressing the wavelet [4]. “Shifting” the wavelet involves moving the wavelet
along the time axis of the input signal [4]. By scaling and shifting the wavelet across the
input signal, the input signal is compared to a variety of wavelet characteristics [4].
Unlike sinusoids, wavelets are irregular and asymmetrical [4]. Many different types of
wavelets have been developed for Wavelet Analysis [4]. Figure 7 below illustrates just
one family of wavelets: the Daubechies family. Applications involving wavelet
transforms may even involve custom-designed wavelets [4].
Figure 7: The Daubechies family of wavelets from the 2nd order to the 10th order. In respiratory signal
analysis, wavelet analysis with 8th order Daubechies wavelets has produced good results (see the
“Review of Literature” section). [Source: 4]
The selection of which wavelet to use depends on the input signal [4]. If the input signal
contains many discontinuities, as in a crackle sound, the analyzing wavelet that will best
represent the input signal will be sharp [4]. If the input signal is smooth, such as a
wheeze, a smooth wavelet may be chosen containing few discontinuities [4].
A benefit of scaling the wavelet during the calculation is that scaling will account for the
frequency distribution of the input signal [4]. Stretched wavelets will compare better to
sinusoid-like, slowly-varying signals like wheezes, while compressed wavelets will
compare better to sharply-varying signals like crackles [4]. This implies that the
time-scale analysis of the input signal employed by the Wavelet Transform automatically
corrects for the frequency distribution [4].
Applying the Continuous Wavelet Transform progresses in the following manner [4]:
1. The analyzing wavelet is compared to the beginning of the input signal, x(t).
2. The Wavelet Transform Coefficients are calculated based on the similarity of the
analyzing wavelet to the input signal. A wavelet coefficient of value “0”
corresponds to “zero similarity.” If the signal and the wavelet are both normalized
to unit energy, the magnitude of the wavelet coefficient cannot exceed a value of 1.
3. Shift the wavelet along the time axis to the next section of the input signal.
Repeat Step 2. Continue shifting and calculating wavelet coefficients until the
end of the input signal.
4. Scale the original wavelet by compressing it or stretching it. Repeat Steps 1-3
until the wavelet has been scaled by all scale values.
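The four steps above can be sketched directly as two nested loops. This is a minimal Python illustration, not the implementation used in this project: the Mexican-hat (Ricker) mother wavelet, the unit-energy normalization of both the signal and the wavelet, and all function names are assumptions chosen for brevity.

```python
import math


def ricker(t):
    """Mexican-hat (Ricker) mother wavelet; an assumed stand-in wavelet."""
    return (1.0 - t * t) * math.exp(-t * t / 2.0)


def unit_energy(values):
    """Rescale a sequence so that the sum of its squares equals 1."""
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values] if norm > 0 else list(values)


def sampled_wavelet(scale, shift, n):
    """Wavelet stretched by `scale` and centred on sample index `shift`."""
    return unit_energy([ricker((i - shift) / scale) for i in range(n)])


def cwt(signal, scales):
    """Return {scale: [coefficient at every shift]} following Steps 1-4."""
    x = unit_energy(signal)
    n = len(x)
    coeffs = {}
    for s in scales:                  # Step 4: repeat for every scale
        row = []
        for tau in range(n):          # Steps 1 and 3: slide along the time axis
            w = sampled_wavelet(s, tau, n)
            row.append(sum(a * b for a, b in zip(x, w)))  # Step 2: similarity
        coeffs[s] = row
    return coeffs


# Hypothetical input: the wavelet itself at scale 4, centred on sample 30.
n = 64
signal = [ricker((i - 30) / 4) for i in range(n)]
coeffs = cwt(signal, [2, 4, 8])
best = max(((s, tau) for s in coeffs for tau in range(n)),
           key=lambda st: coeffs[st[0]][st[1]])
```

With this normalization the largest coefficient occurs at the scale and shift that match the input exactly (`best` is `(4, 30)`) and is approximately 1; every other scale and shift gives a smaller value.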
Equation 3 describes the 1D Continuous Wavelet Transform [4]. Given a maximum
scale value, s, the Continuous Wavelet Transform will calculate Wavelet Transform
Coefficients for every scale value between 1 and s [4]. Additionally, the Continuous
Wavelet Transform will shift smoothly over the entire time axis of the input signal [4].
This implies that calculating Wavelet Transform Coefficients using Continuous Wavelet
Transforms is a computationally-intensive process [4].
An alternative to the Continuous Wavelet Transform is the Discrete Wavelet Transform,
which uses a subset of scales and shifts to calculate the Wavelet Transform
Coefficients [4]. An efficient algorithm used to calculate wavelet coefficients from the
Discrete Wavelet Transform is the Mallat algorithm [4].
The Mallat algorithm shortcuts the need to scale the wavelet at every possible value
from 1 to s, as discussed previously [4]. This eliminates the need for Step 4 from the
earlier discussion. The algorithm does this by “decomposing” the input signal by
passing it through a high pass filter and a low pass filter [4]. The output from the high
pass filter reveals the high-frequency, low-scale Details of the original signal [4]. The
output from the low pass filter is the low-frequency, high-scale Approximation of the
original signal [4]. Because scaling accounts for the frequency distribution (and vice
versa), the algorithm bypasses “scaling” the wavelet in this manner. Wavelet Transform
Coefficients are calculated for both the Detail and Approximation signals [4]. The
decomposition process is then reapplied to the Approximation signal for a given number
of decomposition levels [4]. Figure 8 provides a general overview of the method.
Figure 8: Decomposition tree of the Discrete Wavelet Transform. The original signal, S, undergoes a
high-pass filter and a low-pass filter. The output of the low-pass filter is the Approximation of the signal,
producing wavelet coefficients cA1, cA2, and cA3. The next set of filters is applied to the Approximation
signals. The outputs of the high-pass filters are the Details of the signals, producing the wavelet
coefficients cD1, cD2, and cD3. [Source: 4]
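The decomposition described above can be illustrated with a short sketch. It is written in Python and, as a loudly stated assumption, uses the two-tap Haar filter pair rather than the 8th-order Daubechies wavelet discussed in this chapter, because the Haar filters fit in one line each. The function names and the example signal are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_step(signal):
    """One decomposition level with Haar filters (assumed for brevity).

    The low-pass branch gives the Approximation and the high-pass branch
    gives the Detail; each output is downsampled by a factor of two.
    """
    if len(signal) % 2:
        signal = signal + [signal[-1]]  # pad odd lengths by repeating the last sample
    approx = [(signal[i] + signal[i + 1]) / SQRT2
              for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2
              for i in range(0, len(signal), 2)]
    return approx, detail


def wavedec(signal, levels):
    """Return [cA_n, cD_n, ..., cD_1], re-filtering only the Approximation."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return [approx] + details[::-1]


# Hypothetical 8-sample signal decomposed over 3 levels.
samples = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
bands = wavedec(samples, 3)          # [cA3, cD3, cD2, cD1]
energy_in = sum(v * v for v in samples)
energy_out = sum(v * v for band in bands for v in band)
```

Because the Haar filters are orthonormal, the total energy of the coefficients equals the energy of the input, which is a convenient sanity check on any implementation of the decomposition tree.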
Overview of Artificial Neural Network Classification
Artificial Neural Networks can be employed to perform complex pattern recognition
tasks [3]. The neural network consists of a set of elements called neurons that operate
in parallel on a set of inputs to produce a set of outputs [3]. The outputs of the neural
network are compared to a set of “target” values. Once a comparison of the outputs
and the targets has been performed, the neural network adjusts a set of weights and
biases so that the neural network output will better predict the target data [3]. Figure 9
illustrates this process, which is known as the training stage of neural network design.
Figure 9: Illustration of Artificial Neural Network Training. An input signal is fed into the neural
network, which produces a set of outputs based on the calculation of transfer functions inside the neural
network model. The outputs from the neural network are compared against a set of “target” values. The
neural network is adjusted to better match these target values. [Source: 3]
After a neural network has been trained, it is validated and tested
using pre-classified inputs to evaluate the network’s final performance [3].
Figure 10 illustrates a multi-level neural network. The input data is fed into a “neuron”
that multiplies the input by a weight and adds it to a bias [3]. This value is passed into a
transfer function to produce a set of outputs [3]. In a multi-level neural network, the
outputs are then fed into another layer of neurons [3]. The final set of outputs will be
compared to the “target” dataset during training [3].
Figure 10: Illustration of a Multi-level Artificial Neural Network. The input signals p1-x are multiplied by a
set of weights, w, and added to a set of biases, b. These values are then sent into the transfer function.
The outputs of the transfer function are applied to the second layer of neurons. The outputs from the third
and final layer of neurons will be compared against the “target” values, as shown in Figure 9. [Source: 3]
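The weight, bias, and transfer-function calculation described above can be written out as a small forward pass. The sketch below is in Python and is illustrative only: the tan-sigmoid and log-sigmoid transfer functions are standard definitions, but the two-layer layout, the weight values, and the function names are hypothetical and are not the network trained in this project.

```python
import math


def tansig(x):
    """Tan-sigmoid transfer function (hyperbolic tangent)."""
    return math.tanh(x)


def logsig(x):
    """Log-sigmoid transfer function (logistic curve)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights, biases, transfer):
    """One layer: each neuron forms w . p + b and applies the transfer function."""
    return [transfer(sum(w * p for w, p in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(inputs, layers):
    """Feed the outputs of each layer into the next layer of neurons."""
    out = list(inputs)
    for weights, biases, transfer in layers:
        out = layer(out, weights, biases, transfer)
    return out


# Hypothetical network: 2 inputs -> 2 hidden neurons (tansig) -> 1 output (logsig).
network = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, 0.0], tansig),
    ([[1.0, 0.0]], [0.0], logsig),
]
hidden = layer([1.0, 1.0], *network[0])
output = forward([1.0, 1.0], network)
```

Training consists of adjusting the numbers stored in `network` so that `output` moves closer to the target values; that adjustment step is not shown here.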
In summary, neural network design consists of six stages [3]:
1. Collect Data: “Input” data is gathered and “Target” data is generated for a set of
sample inputs.
2. Create and configure the initial network.
3. Initialize the weights and biases.
4. Train and validate the network using the sample “input” data and “targets.” The
neural network algorithm will adjust the weights and biases to produce outputs
that will better match the “target” data. Various neural network training
algorithms exist, but an analysis of the various methods is beyond the scope of
this project.
5. Validate and test the network. Evaluate its performance. Retrain the network if
the evaluation is poor, although retraining may not fix the issue.
6. Use the network on new, unclassified data.
Tools such as MATLAB’s Neural Network Toolbox can automatically perform Steps 2, 3,
and 4 above based on a set of “input” and “target” datasets.
Review of Literature
Kandaswamy et al. applied a 7-level Discrete Wavelet Transform to a set of pre-recorded
respiratory sounds, broken into inspiration/expiration cycles [2]. They
proceeded to extract a set of four statistical features from the wavelet coefficients [2]:
1. Mean value of the wavelet coefficients
2. Average power of the wavelet coefficients
3. Standard deviation of the wavelet coefficients
4. Ratio of the mean values of the wavelet coefficients in adjacent decomposition
levels
The first two of these statistical features represent the frequency distribution of the
signals [2]. The last two statistical features represent the frequency variation of the
signals [2].
Kandaswamy fed these statistical features into a multi-layer artificial neural network to
successfully classify the respiratory signals into six categories: normal, wheeze,
crackle, squawk, stridor, or rhonchus [2].
By experimenting with various analyzing wavelets and neural network training algorithms,
they compiled the following set of recommendations for respiratory signal analysis
using wavelet analysis in conjunction with neural network classification. They
determined that the 8th-order Daubechies wavelet yielded the best results during
wavelet analysis. They also determined that a multi-level neural network with 40 hidden
neurons using a tan-sigmoid transfer function for the first layer and a log-sigmoid
transfer function for the second layer was the optimal neural network architecture to use
[2]. They trained their neural network using the resilient backpropagation training
algorithm [2].
Various other groups have used Kandaswamy’s methods and results to train neural
networks to classify respiratory signals. Hashemi, et al. used similar methods to
successfully classify wheeze sounds with 89.28% accuracy. Hashemi’s team
extracted two additional statistical features from the wavelet coefficients: (1) the
skewness of each wavelet decomposition and (2) the kurtosis, or the “degree of
peakedness,” of the wavelet decompositions.
Methods
Once again, the analysis was performed on a set of five pre-recorded, pre-classified
respiratory sounds. This time, each of the 10-second sound samples was broken into
segments of inspiration/expiration breathing cycles. This breakdown yielded a total of
22 breathing cycles, including 8 normal breathing cycles, 5 wheeze breathing cycles,
and 9 crackle breathing cycles.
A 1D Discrete Wavelet Transform was applied to each of the 22 breathing cycle
samples using MATLAB’s Wavelet Toolbox. Following the recommendations from the
literature, the 8th-order Daubechies wavelet with 7 levels of decomposition
was chosen for analysis [2].
The transform yielded a set of wavelet coefficients for each of the seven decompositions of
the input breathing cycle. These wavelet coefficients were saved in a set of 22
MATLAB vectors that corresponded to each of the 22 breathing cycles analyzed. Three
statistical characteristics were calculated from these 22 vectors:
1. The mean of the wavelet coefficients for each vector
2. The average power of the wavelet coefficients for each vector
3. The standard deviation of the wavelet coefficients for each vector
The mean of the wavelet coefficients (feature 1) and the average power of the wavelet
coefficients (feature 2) correspond to the frequency distribution of the breathing cycle
[2]. The standard deviation of the wavelet coefficients (feature 3) measures frequency
variation of the breathing cycle [2].
These statistical features for each breathing cycle sample were combined in a 22x3
matrix. The 22 rows of this matrix correspond to the 22 breathing cycles. The 3
columns correspond to each of the three statistical features extracted from the wavelet
coefficients.
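The construction of the feature matrix described above can be sketched as follows. This Python illustration is not the MATLAB code used in the project. It assumes that "average power" means the mean of the squared coefficients and that the standard deviation is the sample (N-1) form; the function names and the two short coefficient vectors are hypothetical.

```python
import statistics


def wavelet_features(coeffs):
    """Return [mean, average power, standard deviation] for one coefficient vector.

    Assumptions: average power is the mean of the squared coefficients,
    and the standard deviation is the sample (N-1) form.
    """
    mean = statistics.fmean(coeffs)
    avg_power = statistics.fmean([c * c for c in coeffs])
    std = statistics.stdev(coeffs)
    return [mean, avg_power, std]


def build_input_matrix(cycles):
    """One row of three features per breathing cycle (22x3 for 22 cycles)."""
    return [wavelet_features(c) for c in cycles]


# Hypothetical coefficient vectors standing in for two breathing cycles.
example_cycles = [
    [1.0, 2.0, 3.0, 4.0],
    [0.5, -0.5, 0.5, -0.5],
]
input_matrix = build_input_matrix(example_cycles)
```

The second example vector has a mean of zero but non-zero power and spread, which shows why the three features carry different information about a cycle.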
This 22x3 matrix serves as the “input” matrix for an Artificial Neural Network optimized
for pattern recognition. The first 8 rows of this matrix correspond to statistical data from
the normal breathing cycles. The next 5 rows of this matrix correspond to statistical
data from the wheeze breathing cycles. The final 9 rows of this matrix correspond to
the statistical data from the crackle breathing cycles. The structure of this “input” matrix
is important because it must match up to the classification in the “target” matrix, as
explained in the next few paragraphs.
Sample 1 Mean of CWT | Sample 1 Average Power of CWT | Sample 1 Standard Deviation of CWT
Sample 2 Mean of CWT | Sample 2 Average Power of CWT | Sample 2 Standard Deviation of CWT
… | … | …
Sample 22 Mean of CWT | Sample 22 Average Power of CWT | Sample 22 Standard Deviation of CWT
Table 1: Organization of the 22x3 “input” matrix into the artificial neural network. The matrix consists of 22
rows, one row for each breathing cycle sample. The matrix consists of three columns that store
the mean of the Wavelet Transform Coefficients (CWT), the Average Power of the Wavelet Transform
Coefficients, and the Standard Deviation of the Wavelet Transform Coefficients.
The “target” matrix is also a 22x3 matrix. The matrix contains 22 rows to store data on
the 22 breathing cycles analyzed. Unlike the “input” matrix, however, the columns of
the “target” matrix refer to the classification of the breathing cycle. A “1” in the first
column indicates that the input signal is a normal breathing cycle. A “1” in the second
column indicates that the input signal is a wheeze breathing cycle. Finally, a “1” in the
third column indicates that the input is a crackle sound.
1 | 0 | 0
0 | 0 | 1
0 | 1 | 0
0 | 0 | 1
Table 2: 4x3 example target matrix
For example, the 4x3 target matrix, target = [1, 0, 0; 0, 0, 1; 0, 1, 0; 0, 0, 1] (see Table 2
above) means that the first input is classified as a normal breathing cycle, the
second and fourth inputs are classified as crackle breathing cycles, and the third input is
classified as a wheeze breathing cycle.
A 22x3 “target” matrix was constructed to correspond to the “input” matrix described
earlier. The first 8 rows of the matrix were classified as “normal” with a “1” in the first
column of the matrix. The next 5 rows of the matrix were classified as “wheezing” with a
“1” in the second column of the matrix. The remaining 9 rows of the matrix were
classified as “crackles” with a “1” in the last column of the matrix.
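The row ordering described above can be generated programmatically. The sketch below is illustrative only; the helper name `one_hot_targets` is hypothetical, and the class order (normal, wheeze, crackle) follows the column convention stated earlier.

```python
def one_hot_targets(class_counts):
    """Build a target matrix with one row per sample and one column per class.

    class_counts[i] is the number of consecutive rows belonging to class i.
    """
    n_classes = len(class_counts)
    rows = []
    for class_index, count in enumerate(class_counts):
        row = [0] * n_classes
        row[class_index] = 1  # a "1" marks the class of this sample
        rows.extend(list(row) for _ in range(count))
    return rows

# 8 normal, 5 wheeze, and 9 crackle breathing cycles, in that order.
target = one_hot_targets([8, 5, 9])
```

Because the rows are produced in the same order as the rows of the “input” matrix, row i of `target` labels row i of the input.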
The “input” and “target” matrices serve as inputs into an Artificial Neural Network, which
uses these matrices for training, validation, and testing.
The MATLAB Neural Network Toolbox GUI was used to configure an Artificial Neural
Network optimized for pattern recognition. Pattern recognition was performed using a
gradient backpropagation algorithm and a multilayer Artificial Neural Network containing
21 hidden neurons. The neural network was trained using roughly 70% of the 22 breathing
cycle samples (16 samples total). Validation of the neural network was performed using
roughly 15% of the samples (3 samples total). Another 15% of the samples (3 samples total)
were used to test the neural network accuracy.
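A random division of the 22 samples into training, validation, and test sets can be sketched as below. This is not the toolbox's own routine: the function name, the fixed seed, and the rounding rule (validation and test counts rounded to whole samples, with the remainder assigned to training) are assumptions made for illustration.

```python
import random

def split_indices(n_samples, val_frac=0.15, test_frac=0.15, seed=0):
    """Randomly divide sample indices into training, validation, and test sets."""
    # Assumption: validation and test counts are rounded to whole samples,
    # and every remaining sample is used for training.
    n_val = round(val_frac * n_samples)
    n_test = round(test_frac * n_samples)
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    val = indices[:n_val]
    test = indices[n_val:n_val + n_test]
    train = indices[n_val + n_test:]
    return train, val, test

train, val, test = split_indices(22)
```

With 22 samples this rounding gives 3 validation and 3 test samples, and the remaining samples are used for training.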
After each stage (training, validation, and testing), a “Confusion matrix” was generated
to evaluate the performance of the neural network (see the Results section for an
example). Misclassification statistics are marked with the color red on the matrix [3].
Accurate classifications are marked with the color green on the matrix [3]. The overall
performance of the neural network is marked in a blue box at the bottom right corner of the
matrix [3].
The same neural network configuration was trained several times. For each run, a
Confusion matrix was generated. Accuracy and Misclassification statistics for each run
were compared against each other to determine the overall accuracy of classification
using the neural network scheme.
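The accuracy figure read from a confusion matrix can be reproduced by hand. The following sketch uses made-up labels (0 = normal, 1 = wheeze, 2 = crackle), not results from this work, and places true classes on the rows and predicted classes on the columns; that orientation is an assumption and may differ from the toolbox's plot.

```python
def confusion_matrix(true_labels, predicted_labels, n_classes=3):
    """Count samples by (true class, predicted class)."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true_class, predicted_class in zip(true_labels, predicted_labels):
        matrix[true_class][predicted_class] += 1
    return matrix

def overall_accuracy(matrix):
    """Fraction of samples on the diagonal (correct classifications)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical labels for six samples, for illustration only.
true_labels = [0, 0, 1, 1, 2, 2]
predicted_labels = [0, 2, 1, 2, 2, 2]
cm = confusion_matrix(true_labels, predicted_labels)
accuracy = overall_accuracy(cm)
```

The diagonal entries correspond to the green boxes, the off-diagonal entries to the red boxes, and `accuracy` to the value in the blue box.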
Results
An example of the generated “Confusion matrix” evaluating the neural network
classification scheme is shown below. None of the training runs produced a
classification accuracy greater than 50%; accuracy consistently fell between
38% and 45.5%.
Figure 7: Confusion Matrices for one Trial Run of the Artificial Neural Network. A matrix is generated for
each of the three stages of neural network training: training, validation, and testing. The red boxes
indicate percentage and number of samples that were misclassified at each stage. The green boxes
indicate the percentage and number of samples that were classified accurately at each stage. The blue
box evaluates the neural network’s overall performance at each stage. The overall accuracy of the neural
network classification scheme designed was 45.5% for all three stages.
Based on these results, the neural network configured in the previous section was
unable to reliably classify the input signals. One possible issue with the artificial neural
network classification scheme is that too few samples were
used for neural network training. Only 22 samples (8 normal breathing cycles, 5
wheeze breathing cycles, and 9 crackle breathing cycles) were available for neural
network training, validation, and testing. Of these, only 16 samples were used for
training, while the remaining 6 samples were used for validation and testing. This
means that if, for example, all five wheezing samples were drawn for validation and
testing, the neural network would not have been trained on any wheezing sounds at all. Future
work with wavelet analysis in conjunction with neural network classification will require
additional respiratory signal files.
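The effect of the small sample size on the wheeze class can be quantified with a short calculation. The sketch assumes the 6 held-out samples are drawn uniformly at random from the 22, which may not match how the toolbox divides the data; the variable names are illustrative.

```python
from math import comb

n_total = 22     # all breathing cycles
n_wheeze = 5     # wheeze breathing cycles
n_held_out = 6   # samples used for validation and testing

# Probability that all five wheeze samples land in the held-out group,
# leaving none for training (hypergeometric count).
p_no_wheeze_in_training = (
    comb(n_total - n_wheeze, n_held_out - n_wheeze) / comb(n_total, n_held_out)
)

# Expected number of wheeze samples that remain in the training set.
expected_wheeze_in_training = n_wheeze * (n_total - n_held_out) / n_total
```

Under this assumption the extreme case is rare, but the training set still contains fewer than four wheeze samples on average, which supports the need for more data.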
Chapter 4:
Conclusions & Future Work
Discussion of Results
Two methods were used to classify various types of respiratory
signals, both normal and abnormal. Fourier analysis was used to visually inspect
normal sounds, wheezing sounds, and crackles. Application of the Fast Fourier
Transform over 50 ms time segments revealed the presence of wheezes in the
frequency domain, which have a major frequency component between 400 Hz and 600
Hz. Because crackles are characterized as discontinuities in the time domain with a
wide range of frequency components, the Fourier analysis method was less useful for
pinpointing crackles.
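The idea of locating a wheeze by its dominant frequency within a 50 ms segment can be illustrated as follows. This sketch substitutes a direct discrete Fourier transform for the Fast Fourier Transform (both give the same spectrum; the FFT is only faster), uses a synthetic 500 Hz tone rather than a recorded wheeze, and assumes a sampling rate of 8000 Hz, which is hypothetical.

```python
import cmath
import math

def dominant_frequency(segment, sample_rate):
    """Return the frequency (Hz) of the largest non-DC spectral peak."""
    n = len(segment)
    best_bin, best_magnitude = 0, 0.0
    for k in range(1, n // 2):  # skip the DC bin, stay below Nyquist
        coeff = sum(
            segment[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        if abs(coeff) > best_magnitude:
            best_bin, best_magnitude = k, abs(coeff)
    return best_bin * sample_rate / n

sample_rate = 8000                      # hypothetical sampling rate in Hz
n_samples = sample_rate * 50 // 1000    # samples in one 50 ms segment
tone = [math.sin(2 * math.pi * 500 * t / sample_rate) for t in range(n_samples)]
peak_hz = dominant_frequency(tone, sample_rate)
```

A 50 ms window gives a frequency resolution of 20 Hz, so a peak falling between 400 Hz and 600 Hz can be resolved, whereas a crackle spreads its energy over many bins and produces no single dominant peak.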
Wavelet Analysis in conjunction with Artificial Neural Network Classification promised to
detect both wheezes and crackles simultaneously. These techniques were applied to
inspiration/expiration segments of the various sound files obtained from the R.A.L.E.
Repository. The initial results from this method have been disappointing, however, with
a classification accuracy ranging between 38% and 45.5%. One possible issue with the
neural network classification scheme is that too few samples were used for neural
network training.
Future Work
It may be possible to build on the success of the Fourier analysis method described
earlier to improve the neural network classification scheme. Instead of breaking the
R.A.L.E. repository sounds into inspiration/expiration segments, the sound files could be
broken into 50 ms intervals. This would yield approximately 200 segments of data for each
of the six 10-second R.A.L.E. repository sound files used.
The application of Fast Fourier transforms over 50 ms segments has already been shown to
distinguish wheezes, crackles, and normal sounds when the data are visually inspected
in both the time and frequency domains. This fact can be exploited to pre-classify the ~1200 50-ms segments of data, which can then be used to train the neural
network as before.
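The segment counts quoted above follow from simple arithmetic, sketched here with a placeholder signal. The sampling rate of 8000 Hz and the function name are assumptions; the segment count per file does not depend on the sampling rate as long as each file is 10 seconds long.

```python
def split_into_segments(signal, sample_rate, segment_ms=50):
    """Cut a signal into consecutive, non-overlapping fixed-length segments."""
    segment_length = sample_rate * segment_ms // 1000
    n_segments = len(signal) // segment_length  # drop any partial final segment
    return [
        signal[i * segment_length:(i + 1) * segment_length]
        for i in range(n_segments)
    ]

sample_rate = 8000                        # hypothetical sampling rate in Hz
recording = [0.0] * (10 * sample_rate)    # placeholder for one 10-second file
segments = split_into_segments(recording, sample_rate)
segments_per_file = len(segments)
total_segments = 6 * segments_per_file    # six sound files
```

Each element of `segments` would then be labeled by inspection and passed through the same feature extraction used for the full breathing cycles.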
Another possible benefit of breaking the signals into 50-ms segments is that the scale is
better suited for wavelet analysis. This is suggested by the fact that the time domain
plots for 50 ms intervals show smooth normal sound signals, high frequency wheezes,
and clearly discontinuous crackles. At a larger scale, these differences are
indistinguishable. The improvement in scale may improve the wavelet coefficient
calculation. One drawback to this method is that it will be a time-intensive, repetitive
process, because each of the ~1200 segments must be visually inspected individually. The
improvements in the classification scheme should, however, be worth the effort.
References
1. Earis, J.E. (2000). Current methods used for computerized respiratory sound
analysis. European Respiratory Review, 10, 586-590.
2. Kandaswamy, C. (2004). Neural Classification of Lung Sounds Using Wavelet
Coefficients. Computers in Biology and Medicine, 34, 523-537.
3. Beale, M.H., Hagan, M.T., & Demuth, H.B. (2011). MATLAB Neural Network
Toolbox User’s Guide.
http://www.mathworks.com/help/pdf_doc/nnet/nnet_ug.pdf
4. Misiti, M., Misiti, Y., Oppenheim, G., & Poggi, J.M. (2012). MATLAB Wavelet
Toolbox User’s Guide.
http://www.mathworks.com/help/pdf_doc/wavelet/wavelet_ug.pdf
5. Moussavi, Zahra. (2007). Respiratory sound analysis: introduction for the
special issue. IEEE Engineering in Medicine and Biology Magazine, 0739, 15.
6. Pasterkamp, H., Kraman, S.S., & Wodicka, G. (1997). Respiratory sounds:
advances beyond the stethoscope. American Journal of Respiratory and Critical
Care Medicine, 156, 975-987.
7. Reichert, S., Gass, R., Brandt, C., & Andres, E. (2008). Analysis of respiratory
sounds: state of the art. Clinical Medicine: Circulatory, Respiratory, and
Pulmonary Medicine, 2, 45-58.
8. Sovijarvi, A.R.A., Vanderschoot, J., & Earis, J.E. (2000). Standardization of
computerized respiratory sound analysis. European Respiratory Review, 10, 585.
9. R.A.L.E. Repository of Respiratory Sounds. (2008). http://www.rale.ca/
10. Hadjileontiadis, Leontios. (2009). Lung Sounds: An Advanced Signal
Processing Perspective. Synthesis Lectures on Biomedical Engineering.