Computerized Respiratory Sound Analysis: An Exploration of Methods

Thesis by Urvi Patel
CWRU School of Engineering, Department of Physics

Advisor: Dr. Ronald Cechner
Anesthesiology, University Hospitals
CWRU Biomedical Engineering Department

Wednesday, May 2, 2011

Abstract

The presence of abnormal breath sounds is an indicator of a variety of lung diseases, including pneumonia and chronic bronchitis. Traditionally, abnormal breath sounds are detected with stethoscopes and qualitative methods based on a physician's own hearing. Computerized methods of respiratory sound analysis provide a quantitative basis for abnormal respiratory sound detection. The overall goal of this project is to progress towards building a non-obstructive device that can continuously monitor respiratory signals in order to detect and accurately classify wheezes, crackles, and normal breath sounds. As a first step in this process, we analyzed pre-recorded, pre-classified breath sounds from the R.A.L.E. Repository using two methods: (1) Fast Fourier Transforms and (2) Wavelet Transforms in conjunction with Artificial Neural Network classification. The goal of this analysis has been to identify the key distinguishing features of the primary breath sounds (wheezes, crackles, and normal sounds) and to determine the effectiveness of the two methods. The future direction of this project depends on developing the signal analysis techniques required to detect respiratory signals efficiently.

Contents

Abstract .......... 2
Contents .......... 3
List of Figures .......... 4
Preface .......... 5
Chapter 1: Introduction to Respiratory Sound Analysis .......... 7
Chapter 2: Fourier Transforms in the Analysis of Respiratory Signals .......... 11
Chapter 3: Wavelet Transforms in the Analysis of Respiratory Signals & Neural Network Classification .......... 20
Chapter 4: Conclusions & Future Work .......... 37
References .......... 39

List of Figures

Figure 1: Fast Fourier Transform of a Normal Sound, Example 1 ..........
15
Figure 2: Fast Fourier Transform of a Normal Sound, Example 2 .......... 16
Figure 3: Fast Fourier Transform of Wheezing, Example 1 .......... 16
Figure 4: Fast Fourier Transform of Wheezing, Example 2 .......... 16
Figure 5: Fast Fourier Transform of Crackles, Example 1 .......... 16
Figure 6: Fast Fourier Transform of Crackles, Example 2 .......... 16
Figure 7: The Daubechies family of wavelets from the 2nd order to the 10th order .......... 22
Figure 8: Decomposition tree of the Discrete Wavelet Transform .......... 25
Figure 9: Illustration of an Artificial Neural Network Training Model .......... 26
Figure 10: Illustration of a Multi-level Artificial Neural Network .......... 27
Figure 11: Confusion Matrices for one Trial Run of the Artificial Neural Network .......... 35

Preface

The roots of this project dig back to the summer of 2009, after my sophomore year at Case Western Reserve University. That summer, two other students, Joe Karas and Stefan Maurer, and I built the first prototype of what we called an "Abnormal Respiratory Sound Detector" as part of a summer engineering design experience sponsored by the CWRU Biomedical Engineering Department, the Case Alumni Association, and Case's Rising Engineers and Technological Entrepreneurs (C.R.E.A.T.E.). Under the advisement of both Dr. Ronald Cechner and Dr. Dustin Tyler, we learned indispensable engineering techniques and rudimentary signal analysis to build a crude "proof-of-concept" device. We learned many lessons that year; most importantly, the lesson of plowing through the "inexperience roadblock" in favor of progress.
Although we tried to overcome the challenges of our inexperience with signal analysis and design, we did not succeed that summer in our efforts to reliably detect and classify abnormal breath sounds in real time. Two years later, I chose to revisit my effort to detect abnormal breath sounds. By then, I had a stronger grasp of digital systems design, but my understanding of signal analysis techniques was still shaky. I wanted to focus on developing the heart of the prototype my team and I had aimed to build in 2009: signal analysis, with an eye towards accurately classifying signals, and algorithm development, as opposed to prototype hardware design. Thus, my senior project was born.

- Urvi Patel

Acknowledgements

This project would not have been possible without the support of and collaboration with my original project team: Stefan Maurer and Joe Karas. The original project would not have been possible without the generous support of the CWRU Biomedical Engineering Department, the Case Alumni Association, C.R.E.A.T.E., and Dr. Dustin Tyler. During the resurrection of the original project, I received unrelenting support from Dr. Rolfe Petschek and the CWRU Physics Department; without their support, I would still be struggling to understand how to apply Fourier Transforms properly. Finally, I would like to thank Dr. Ronald Cechner, who has been my patient advisor and mentor since the birth of the original project and throughout its reincarnation as my senior project. This project is his brainchild, and I hope that my work here will be a useful reference for anyone who chooses to contribute to this study.

Chapter 1: Introduction to Respiratory Sound Analysis

History of Respiratory Signal Analysis

The importance of listening to and understanding respiratory sounds is evident from the iconic and symbolic usage of the stethoscope in modern medicine.
The stethoscope was invented in 1821 by the French physician Laennec, following his discovery that respiratory sound analysis aids in the diagnosis of pulmonary infections and diseases, such as acute bronchitis and pneumonia [2, 9]. Since then, the stethoscope has become the most common diagnostic tool used by doctors [7, 9]. Despite its widespread use, however, analysis of respiratory sounds using stethoscopes is rudimentary at best and requires a degree of subjectivity from the physician [6, 7, 9]. Such analysis depends on variable factors: the diagnosing physician's experience, hearing, and ability to recognize and differentiate patterns [9]. In addition, stethoscope data is not typically recorded, making long-term correlation of data difficult [6, 9]. All of these factors reduce the value stethoscopes bring to a world that increasingly demands quantitative measures of disease.

Over the last four decades, researchers have made significant progress in fine-tuning computerized signal processing techniques, and it is now possible to perform respiratory sound analysis using many of these techniques [7]. Computerized analysis has succeeded in producing graphical representations of respiratory signals, which give physicians additional methods of pulmonary diagnosis [7]. The use of computational power to analyze pulmonary spectral data has advanced respiratory sound analysis from a subjective skill to an objective one [8, 9]. In addition, the recent availability of cheap computer memory has enabled permanent storage of recorded respiratory sounds [8]. The next step in computerized analysis of respiratory signals is to automate the classification of respiratory sounds based on real-time data.

Types of Respiratory Sounds

Respiratory signals can be classified into two major categories: normal lung sounds (NLS) and abnormal lung sounds (ALS) [10].
Most abnormal lung sounds are both adventitious and nonstationary. While many types of abnormal lung sounds exist, the two major categories are wheezes and crackles [10].

A wheeze is a continuous adventitious sound that is characteristically "musical" in nature [4]. Wheezing is usually caused by airway obstruction in the lungs [10]. The presence of wheezes during breathing can indicate asthma, cystic fibrosis, and bronchitis in a patient [4, 10]. Wheezes are high-pitched in relation to normal breath sounds, and their frequency distribution usually lies in the 400 Hz to 600 Hz range [10]. They typically last for longer than 100 ms [4]. Because wheezes have a defined frequency range, frequency domain analysis of a respiratory signal can reveal a wheeze.

A crackle is a discontinuous adventitious sound that is characterized by sharp bursts of energy [4]. Crackles typically last less than 20 ms each and are characterized by a wide distribution of frequencies [4]. Because of this wide frequency distribution, it is difficult to pinpoint crackles in the frequency domain. Crackles can be broken down into two additional categories. Fine crackles are high-pitched crackles that occur repeatedly over inspiration across multiple breathing cycles [10]. Coarse crackles are low-pitched sounds that appear early during inspiration or sometimes during expiration as a result of liquid filling small airways in the lungs [10]. The presence of crackles can indicate cardiorespiratory diseases, pneumonia, and chronic bronchitis [4].

The Near Future of Respiratory Signal Analysis

Over the last thirty years, various methods of computerized respiratory sound analysis have attempted to distinguish and classify abnormal lung sounds. These methods include Fast Fourier Transforms, Short Time Fourier Transforms, Wavelet Transforms, fuzzy logic classification, autoregressive modeling, and neural network classification, among others.
The study of computerized detection methods for abnormal lung sounds continues to grow as computer classification algorithms become more sophisticated over time. This project explores the detection of abnormal lung sounds in respiratory signals using the Fast Fourier Transform and the Wavelet Transform (in conjunction with neural network classification).

Chapter 2: Fourier Transforms in the Analysis of Respiratory Signals

Introduction

The Fourier Transform is one of the most widely used techniques in signal analysis. Applying a Fourier transform to an input signal yields a representation of the major frequency components of the signal. In respiratory signal analysis, the Fourier transform is particularly useful in revealing the presence of wheezes, because wheezes occur in a known frequency band between 400 Hz and 600 Hz [10]. The Fourier transform is less useful for pinpointing crackles, because crackles have a wide frequency distribution. This chapter illustrates the use of Fourier analysis in the detection of both wheezes and crackles based on our experimental data.

Overview of Fourier Transforms

Understanding the Fourier Transform is an important part of understanding why it is useful in signal analysis, especially in the analysis of sinusoid-like signals, such as wheezes, that occur in a narrow band of frequencies. The basis of the Fourier Transform is the sine wave [4]. Mathematically, the Continuous Fourier Transform is described as [4]:

X(f) = ∫_{-∞}^{+∞} x(t) e^{-j 2π f t} dt    (1)

In Equation 1, x(t) is the input signal [4]. In respiratory signal analysis, this input signal is usually a time domain representation of the signal: the amplitude of the sound as a function of time [4]. A microphone recording of a respiratory signal is a time domain representation. Referring back to Equation 1, the input signal is multiplied by a complex exponential [4].
Recall that, according to Euler's formula, the complex exponential can be broken into real and imaginary sinusoidal components [4]:

e^{jθ} = cos(θ) + j sin(θ)    (2)

In Fourier analysis, integrating the product of the input signal and the complex exponential over all time is similar to finding an "inner product" of two vectors: the equation calculates a set of coefficients that describe the "similarity" of the input signal to the complex exponential [4]. In other words, the Fourier coefficients, X(f), store how similar the input signal is to a sinusoid of frequency f [4].

Equation 1 describes the Continuous Fourier Transform [4]. Several variations of the Continuous Fourier Transform exist, including the Short Time Fourier Transform and the Discrete Fourier Transform [4]. The general concept of applying these variations to input signals is the same as described above [4]. The Short Time Fourier Transform adds a "windowing" mechanism that allows Fourier analysis over shorter segments of time [4]. The Discrete Fourier Transform is useful for computerized calculations, where the integrals must be replaced by numerical summations [4]. An extremely efficient algorithm that calculates the Discrete Fourier Transform, known as the Fast Fourier Transform, can be used for computerized Fourier analysis of respiratory signals [4].

Methods

A set of pre-recorded, pre-classified respiratory signals was obtained from the R.A.L.E. Repository [9]. The R.A.L.E. Repository hosts a set of well-documented, 10-second respiratory signals [9]. The set of signals includes three normal sound files, two crackle sound files, and two wheeze sound files. The signals are pre-filtered with an analog low-pass filter at 2500 Hz and were sampled at 11025 samples/second. Using MATLAB, a Fast Fourier Transform was applied to every 50-ms interval of the input signal for each of the sample signals obtained from the R.A.L.E. Repository.
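The analysis itself was done in MATLAB; as an illustration only, the windowing-and-FFT step just described can be sketched in Python with NumPy. The 400 Hz test tone below is a hypothetical stand-in for a wheeze recording, not R.A.L.E. data:

```python
import numpy as np

fs = 11025                       # sampling rate of the R.A.L.E. recordings (samples/s)
seg_len = int(0.050 * fs)        # one 50-ms analysis window

# Hypothetical test signal: 1 s of a 400 Hz sine, standing in for a wheeze.
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * 400 * t)

# Apply the FFT to each successive 50-ms segment and record the
# dominant frequency of each segment.
freqs = np.fft.rfftfreq(seg_len, d=1 / fs)
peaks = []
for start in range(0, len(signal) - seg_len + 1, seg_len):
    segment = signal[start:start + seg_len]
    spectrum = np.abs(np.fft.rfft(segment))
    peaks.append(freqs[np.argmax(spectrum)])
# Every window's dominant frequency lands near 400 Hz, inside the wheeze band.
```

For a real recording, each segment's spectrum would then be plotted alongside its time-domain trace, which is exactly what the approximately 200 plots described below show.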
Time domain and frequency domain plots were generated for each of these 50-ms segments to visualize the data.

Results

Some example plots resulting from applying the Fast Fourier Transform to various types of respiratory signals are shown in this section. The plots reveal both time domain and frequency domain data across 50 ms of data. Approximately 200 plots were produced, each resulting from a Fast Fourier Transform over 50 ms of the 10 seconds of data available for each sound file.

Figure 1: Fast Fourier Transform of a Normal Sound, Example 1. The signal in the time domain is very smooth. The frequency domain representation reveals that most frequency components are below 100 Hz.

Figure 2: Fast Fourier Transform of a Normal Sound, Example 2. As in the earlier example, the time domain is smooth, and the frequency domain consists mostly of frequencies below 100 Hz.

Figure 3: Fast Fourier Transform of Wheezing, Example 1. Like the normal sound, the time domain of the signal is devoid of discontinuities; however, the frequency of the signal has increased. The frequency domain plot reveals major frequency components around 400 Hz. This pitch matches the description of a wheeze sound, defined earlier.

Figure 4: Fast Fourier Transform of Wheezing, Example 2. As in Figure 3, the frequency domain plot reveals a major frequency component around 400 Hz.

Figure 5: Fast Fourier Transform of Crackles, Example 1. Unlike wheezes and normal sounds, crackles reveal sharp discontinuities in the time domain. The frequency domain plot reveals a wide range of major frequency components. This description matches crackle characteristics.

Figure 6: Fast Fourier Transform of Crackles, Example 2. Again, the crackles reveal sharp discontinuities in the time domain, and the frequency domain plot reveals a wide range of major frequency components, again matching crackle characteristics.
Based on visual inspection of the plots in this section, crackles, wheezes, and normal sounds were successfully identified. Wheezes were revealed in the frequency domain as sharp peaks around 400 Hz. While Fourier analysis was unable to pinpoint crackles, crackles were visually classified by noting the discontinuities in the time domain signal and the wide frequency distribution in the frequency domain.

Chapter 3: Wavelet Transforms in the Analysis of Respiratory Signals & Neural Network Classification

Introduction

An increasing number of studies are now exploring respiratory sound analysis using Wavelet Transforms. Unlike Fourier Transforms, Wavelet Transforms are able to detect sharp discontinuities in a signal, such as crackles. As an added benefit, Wavelet Transforms can also detect gradual, sinusoid-like characteristics of a signal, such as wheezes. Unlike the output of Fourier Transforms, however, the output data from Wavelet analysis is difficult to visualize. Once a Wavelet transform is applied to a signal, the output of the transform (namely, the wavelet coefficients) may be analyzed. This analysis can be combined with various pattern recognition schemes, such as Artificial Neural Network classification, to classify the input signals. The following sections provide the details of the application of Wavelet Transforms, in conjunction with Artificial Neural Network classification, to the analysis of respiratory signals.

Overview of Wavelet Transforms

Like the Fourier Transform, the Wavelet Transform compares an analyzing function against an input signal [4]. Continuing the earlier discussion, the mathematical representation of the 1D Continuous Wavelet Transform is [4]:

C(s, τ) = (1 / √|s|) ∫_{-∞}^{+∞} x(t) ψ*((t − τ) / s) dt    (3)

As noted earlier, the analyzing function for a Fourier transform is the sine wave [4]. During Fourier analysis, the input signal is broken into a series of sinusoids of various frequencies [4].
Similarly, the Wavelet Transform's analyzing function is a wavelet, represented by ψ in Equation 3 [4]. See Figure 7 later in this section for a set of example wavelets [4]. The input signal, denoted by x(t) in Equation 3, is compared to scaled and shifted versions of the analyzing wavelet [4]. In respiratory signal analysis, the input signal is typically a time domain representation of the signal [4]. As with Fourier analysis, performing Wavelet analysis is similar to calculating the "inner product" of the input signal and scaled and shifted versions of the wavelet [4]. The measure of "similarity" obtained from this calculation is stored in a set of Wavelet Transform Coefficients (CWT) [4].

One of the main differences between the Fourier Transform and the Wavelet Transform is that the Wavelet Transform takes into account the "scale" of the analyzing wavelet [4]. The scale is denoted by s in Equation 3 [4]. During Wavelet analysis, Wavelet Transform Coefficients are calculated by comparing the input signal to scaled and shifted versions of the analyzing wavelet [4]. "Scaling" the wavelet involves stretching or compressing the wavelet [4]. "Shifting" the wavelet involves moving the wavelet along the time axis of the input signal [4]. By scaling and shifting the wavelet across the input signal, the input signal is compared to a variety of wavelet characteristics [4].

Unlike sinusoids, wavelets are irregular and asymmetrical [4]. Many different types of wavelets have been developed for Wavelet analysis [4]. Figure 7 below illustrates just one family of wavelets: the Daubechies family. Applications involving wavelet transforms may even involve custom-designed wavelets [4].

Figure 7: The Daubechies family of wavelets from the 2nd order to the 10th order. In respiratory signal analysis, wavelet analysis with 8th-order Daubechies wavelets has produced good results (see the "Review of Literature" section).
[Source: 4]

The selection of which wavelet to use depends on the input signal [4]. If the input signal contains many discontinuities, as in a crackle sound, the analyzing wavelet that will best represent the input signal will be sharp [4]. If the input signal is smooth, such as a wheeze, a smooth wavelet containing few discontinuities may be chosen [4].

A benefit of scaling the wavelet during the calculation is that scaling accounts for the frequency distribution of the input signal [4]. Stretched wavelets will compare better to sinusoid-like, slowly-varying signals like wheezes, while compressed wavelets will compare better to sharply-varying signals like crackles [4]. This implies that the time-scale analysis of the input signal employed by the Wavelet Transform automatically accounts for the frequency distribution [4].

Applying the Continuous Wavelet Transform progresses in the following manner [4]:

1. The analyzing wavelet is compared to the beginning of the input signal, x(t).
2. The Wavelet Transform Coefficients are calculated based on the similarity of the analyzing wavelet to the input signal. A wavelet coefficient near zero corresponds to little similarity; a coefficient of larger magnitude corresponds to greater similarity.
3. Shift the wavelet along the time axis to the next section of the input signal. Repeat Step 2. Continue shifting and calculating wavelet coefficients until the end of the input signal.
4. Scale the original wavelet by compressing it or stretching it. Repeat Steps 1-3 until the wavelet has been scaled by all scale values.

Equation 3 describes the 1D Continuous Wavelet Transform [4]. Given a maximum scale value, s, the Continuous Wavelet Transform will calculate Wavelet Transform Coefficients for every scale value between 1 and s [4]. Additionally, the Continuous Wavelet Transform will shift smoothly over the entire time axis of the input signal [4].
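The shift-and-scale procedure of Steps 1-4 can be written out literally as a double loop over scales and shifts. This Python sketch uses a Ricker ("Mexican hat") wavelet purely because it has a simple closed form; it is a stand-in for the Daubechies wavelets discussed in this chapter, and the normalization follows the 1/√|s| factor of Equation 3:

```python
import numpy as np

def ricker(n, s):
    # Ricker ("Mexican hat") wavelet sampled at n points with scale s.
    t = np.arange(n) - (n - 1) / 2.0
    a = t / s
    return (1 - a**2) * np.exp(-a**2 / 2)

def naive_cwt(x, scales):
    # Steps 1-4 literally: for every scale, slide the wavelet along the
    # signal and record the inner product ("similarity") at every shift.
    coeffs = np.zeros((len(scales), len(x)))
    for i, s in enumerate(scales):
        w = ricker(len(x), s) / np.sqrt(abs(s))      # 1/sqrt(|s|) of Eq. 3
        for shift in range(len(x)):
            shifted = np.roll(w, shift - len(x) // 2)  # move along the time axis
            coeffs[i, shift] = np.dot(x, shifted)      # the inner product
    return coeffs

x = np.zeros(64)
x[32] = 1.0                                   # an impulse: an idealized "crackle"
coeffs = naive_cwt(x, scales=[2.0, 4.0, 8.0])
# The coefficient magnitudes peak at the shift where the wavelet lines
# up with the impulse, and the small-scale (compressed) row responds
# most sharply to the discontinuity.
```

The nested loops make the cost of the fully continuous version plain, which motivates the Discrete Wavelet Transform discussed next.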
This implies that calculating Wavelet Transform Coefficients using the Continuous Wavelet Transform is a computationally intensive process [4]. An alternative to the Continuous Wavelet Transform is the Discrete Wavelet Transform, which uses a subset of scales and shifts to calculate the Wavelet Transform Coefficients [4]. An efficient algorithm used to calculate wavelet coefficients from the Discrete Wavelet Transform is the Mallat algorithm [4]. The Mallat algorithm shortcuts the need to scale the wavelet at every possible value from 1 to s, as discussed previously, eliminating Step 4 from the earlier discussion [4]. The algorithm does this by "decomposing" the input signal: it passes the signal through a high-pass filter and a low-pass filter [4]. The output from the high-pass filter reveals the high-frequency, low-scale Details of the original signal [4]. The output from the low-pass filter is the low-frequency, high-scale Approximation of the original signal [4]. Because scaling accounts for the frequency distribution (and vice versa), the algorithm bypasses "scaling" the wavelet in this manner. Wavelet Transform Coefficients are calculated for both the Detail and Approximation signals [4]. The decomposition process is then reapplied to the Approximation signal for a given number of decomposition levels [4]. Figure 8 provides a general overview of the method.

Figure 8: Decomposition tree of the Discrete Wavelet Transform. The original signal, S, is passed through a high-pass filter and a low-pass filter. The outputs of the low-pass filters are the Approximations of the signal, producing the wavelet coefficients cA1, cA2, and cA3; each successive pair of filters is applied to the previous Approximation signal. The outputs of the high-pass filters are the Details of the signal, producing the wavelet coefficients cD1, cD2, and cD3.
[Source: 4]

Overview of Artificial Neural Network Classification

Artificial Neural Networks can be employed to perform complex pattern recognition tasks [3]. A neural network consists of a set of elements called neurons that operate in parallel on a set of inputs to produce a set of outputs [3]. The outputs of the neural network are compared to a set of "target" values. Once a comparison of the outputs and the targets has been performed, the neural network adjusts a set of weights and biases so that the network's output will better predict the target data [3]. Figure 9 illustrates this process, which is known as the training stage of neural network design.

Figure 9: Illustration of an Artificial Neural Network Training Model. An input signal is fed into the neural network, which produces a set of outputs based on the calculation of transfer functions inside the neural network model. The outputs from the neural network are compared against a set of "target" values. The neural network is adjusted to better match these target values. [Source: 3]

After a neural network has been trained, it can be validated and tested using pre-classified inputs to evaluate its final performance [3].

Figure 10 illustrates a multi-level neural network. The input data is fed into a "neuron" that multiplies the input by a weight and adds a bias [3]. This value is passed into a transfer function to produce a set of outputs [3]. In a multi-level neural network, these outputs are then fed into another layer of neurons [3]. The final set of outputs is compared to the "target" dataset during training [3].

Figure 10: Illustration of a Multi-level Artificial Neural Network. The input signals p1, p2, … are multiplied by a set of weights, w, and added to a set of biases, b. These values are then sent into the transfer function. The outputs of the transfer function are applied to the second layer of neurons.
The outputs from the third and final layer of neurons will be compared against the "target" values, as shown in Figure 9. [Source: 3]

In summary, neural network design consists of six stages [3]:

1. Collect data: "input" data is gathered and "target" data is generated for a set of sample inputs.
2. Create and configure the initial network.
3. Initialize the weights and biases.
4. Train and validate the network using the sample "input" data and "targets." The neural network algorithm adjusts the weights and biases to produce outputs that better match the "target" data. Various neural network training algorithms exist, but an analysis of the various methods is beyond the scope of this project.
5. Validate and test the network, and evaluate its performance. Retrain the network if the evaluation is poor, though retraining alone may not fix the issue.
6. Use the network on new, unclassified data.

Tools such as MATLAB's Neural Network Toolbox can automatically perform Steps 2, 3, and 4 above based on a set of "input" and "target" datasets.

Review of Literature

Kandaswamy, et al. applied a 7-level Discrete Wavelet Transform to a set of pre-recorded respiratory sounds, broken into inspiration/expiration cycles [2]. They proceeded to extract a set of four statistical features from the wavelet coefficients [2]:

1. Mean value of the wavelet coefficients
2. Average power of the wavelet coefficients
3. Standard deviation of the wavelet coefficients
4. Ratio of the mean values of the wavelet coefficients in adjacent decomposition levels

The first two of these statistical features represent the frequency distribution of the signals [2]. The last two statistical features represent the frequency variation of the signals [2]. Kandaswamy fed these statistical features into a multi-layer artificial neural network to successfully classify the respiratory signals into six categories: normal, wheeze, crackle, squawk, stridor, or rhonchus [2].
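The layered computation of Figure 10 (multiply by weights, add biases, apply a transfer function, feed the next layer) can be sketched as a two-layer forward pass. The tan-sigmoid and log-sigmoid transfer functions here follow the recommendations discussed in this literature review; the weights below are random placeholders, since in practice training adjusts them to match the targets:

```python
import numpy as np

def tansig(n):
    # MATLAB's tan-sigmoid transfer function is equivalent to tanh.
    return np.tanh(n)

def logsig(n):
    # MATLAB's log-sigmoid transfer function.
    return 1.0 / (1.0 + np.exp(-n))

def two_layer_forward(p, W1, b1, W2, b2):
    # Each layer multiplies its input by a weight matrix, adds a bias,
    # and passes the result through its transfer function (Figure 10).
    a1 = tansig(W1 @ p + b1)     # hidden layer
    a2 = logsig(W2 @ a1 + b2)    # output layer
    return a2

# Illustrative random weights only; training would adjust these.
rng = np.random.default_rng(0)
p = np.array([0.5, -0.2, 0.1])               # one 3-feature input sample
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(3, 4)), np.zeros(3)
out = two_layer_forward(p, W1, b1, W2, b2)   # three outputs, one per class
```

Because the output layer is log-sigmoid, each of the three outputs lies strictly between 0 and 1 and can be compared directly against a one-hot "target" row.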
By experimenting with various analyzing wavelets and neural network training algorithms, they compiled a set of recommendations for respiratory signal analysis using wavelet analysis in conjunction with neural network classification. They determined that the 8th-order Daubechies wavelet yielded the best results during wavelet analysis. They also determined that a multi-level neural network with 40 hidden neurons, using a tan-sigmoid transfer function for the first layer and a log-sigmoid transfer function for the second layer, was the optimal neural network architecture to use [2]. They trained their neural network using the resilient backpropagation training algorithm [2].

Various other groups have used Kandaswamy's methods and results to train neural networks to classify respiratory signals. Hashemi, et al. used similar methods to successfully classify wheeze sounds with 89.28% accuracy. Hashemi's team extracted two additional statistical features from the wavelet coefficients: (1) the skewness of each wavelet decomposition and (2) the kurtosis, or "degree of peakedness," of each wavelet decomposition.

Methods

Once again, the analysis was performed on a set of five pre-recorded, pre-classified respiratory sounds. This time, each of the 10-second sound samples was broken into segments of inspiration/expiration breathing cycles. This breakdown yielded a total of 22 breathing cycles: 8 normal breathing cycles, 5 wheeze breathing cycles, and 9 crackle breathing cycles. A 1D Discrete Wavelet Transform was applied to each of the 22 breathing cycle samples using MATLAB's Wavelet Toolbox. Following the recommendations from the literature, the 8th-order Daubechies wavelet with 7 levels of decomposition was chosen for the analysis [2]. Applying the transform yielded a set of wavelet coefficients for each of the seven decomposition levels of each input breathing cycle.
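The Mallat-style decomposition performed here by MATLAB's Wavelet Toolbox can be illustrated with the simplest possible filter pair. This Python sketch substitutes the Haar wavelet for the 8th-order Daubechies wavelet purely to keep the filters short; the decomposition tree has the same shape as Figure 8:

```python
import numpy as np

def haar_step(x):
    # One level of the Mallat algorithm with Haar filters:
    # low-pass + downsample -> Approximation; high-pass + downsample -> Detail.
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)
    return approx, detail

def haar_dwt(x, levels):
    # Reapply the decomposition to the Approximation at every level,
    # collecting cD1, cD2, ... and the final Approximation (Figure 8).
    details = []
    approx = np.asarray(x, dtype=float)
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details

signal = np.arange(8, dtype=float)     # toy 8-sample "signal"
cA, cDs = haar_dwt(signal, levels=3)   # cDs[0] is cD1, and so on
```

Because the Haar filters are orthonormal, the total energy of the signal is preserved across the coefficients, which is a convenient sanity check on any decomposition of this kind.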
These wavelet coefficients were saved in a set of 22 MATLAB vectors corresponding to the 22 breathing cycles analyzed. Three statistical features were calculated from these 22 vectors:

1. The mean of the wavelet coefficients for each vector
2. The average power of the wavelet coefficients for each vector
3. The standard deviation of the wavelet coefficients for each vector

The mean of the wavelet coefficients (feature 1) and the average power of the wavelet coefficients (feature 2) correspond to the frequency distribution of the breathing cycle [2]. The standard deviation of the wavelet coefficients (feature 3) measures the frequency variation of the breathing cycle [2].

These statistical features for each breathing cycle sample were combined in a 22x3 matrix. The 22 rows of this matrix correspond to the 22 breathing cycles. The 3 columns correspond to the three statistical features extracted from the wavelet coefficients. This 22x3 matrix serves as the "input" matrix for an Artificial Neural Network optimized for pattern recognition. The first 8 rows of this matrix correspond to statistical data from the normal breathing cycles. The next 5 rows correspond to statistical data from the wheeze breathing cycles. The final 9 rows correspond to statistical data from the crackle breathing cycles. The structure of this "input" matrix is important because it must match up to the classification in the "target" matrix, as explained in the next few paragraphs.

Sample 1  | Mean of CWT | Average Power of CWT | Standard Deviation of CWT
Sample 2  | Mean of CWT | Average Power of CWT | Standard Deviation of CWT
…         | …           | …                    | …
Sample 22 | Mean of CWT | Average Power of CWT | Standard Deviation of CWT

Table 1: Organization of the 22x3 "input" matrix for the artificial neural network. The matrix consists of 22 rows, one row for each breathing cycle sample.
The matrix consists of three columns that store the mean of the Wavelet Transform Coefficients (CWT), the average power of the Wavelet Transform Coefficients, and the standard deviation of the Wavelet Transform Coefficients.

The "target" matrix is also a 22x3 matrix. The matrix contains 22 rows to store data on the 22 breathing cycles analyzed. Unlike the "input" matrix, however, the columns of the "target" matrix refer to the classification of the breathing cycle. A "1" in the first column indicates that the input signal is a normal breathing cycle. A "1" in the second column indicates that the input signal is a wheeze breathing cycle. Finally, a "1" in the third column indicates that the input is a crackle sound.

1 0 0
0 0 1
0 1 0
0 0 1

Table 2: A 4x3 example target matrix.

For example, the 4x3 target matrix target = [1, 0, 0; 0, 0, 1; 0, 1, 0; 0, 0, 1] (see Table 2 above) means that the first input is classified as a normal breathing cycle, the second and fourth inputs are classified as crackle breathing cycles, and the third input is classified as a wheeze breathing cycle.

A 22x3 "target" matrix was constructed to correspond to the "input" matrix described earlier. The first 8 rows of the matrix were classified as "normal" with a "1" in the first column of the matrix. The next 5 rows of the matrix were classified as "wheezing" with a "1" in the second column of the matrix. The remaining 9 rows of the matrix were classified as "crackles" with a "1" in the last column of the matrix.

The "input" and "target" matrices serve as inputs into an Artificial Neural Network, which uses these matrices for training, validation, and testing. The MATLAB Neural Network Toolbox GUI was used to configure an Artificial Neural Network optimized for pattern recognition. Pattern recognition was performed using a gradient backpropagation algorithm and a multilayer Artificial Neural Network containing 21 hidden neurons.
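To make the matrix construction concrete, the following Python sketch computes the three statistical features for a few hypothetical coefficient vectors (random stand-ins, not the R.A.L.E. data) and builds a one-hot "target" matrix matching the Table 2 example:

```python
import numpy as np

def wavelet_features(coeff_vector):
    # The three statistical features extracted from one sample's
    # wavelet coefficients: mean, average power, standard deviation.
    c = np.asarray(coeff_vector, dtype=float)
    return np.array([c.mean(), np.mean(c**2), c.std()])

def one_hot_targets(labels):
    # Build the "target" matrix: column 0 = normal, 1 = wheeze, 2 = crackle.
    t = np.zeros((len(labels), 3))
    t[np.arange(len(labels)), labels] = 1.0
    return t

# Hypothetical coefficient vectors for 4 samples.
rng = np.random.default_rng(1)
coeffs = [rng.normal(size=100) for _ in range(4)]

inputs = np.vstack([wavelet_features(c) for c in coeffs])  # a 4x3 "input" matrix
targets = one_hot_targets([0, 2, 1, 2])                    # the Table 2 example
```

For the full experiment the same construction would simply run over all 22 coefficient vectors, yielding the 22x3 "input" and "target" matrices described above.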
The neural network was trained using 70% of the 22 breathing cycle samples (16 samples). Validation of the neural network was performed using 15% of the samples (3 samples), and the remaining 15% of the samples (3 samples) were used to test the neural network's accuracy.

After each stage (training, validation, and testing), a “Confusion matrix” was generated to evaluate the performance of the neural network (see the Results section for an example). Misclassifications are marked in red on the matrix [3]. Accurate classifications are marked in green [3]. The overall performance of the network is marked in a blue box at the bottom right corner of the matrix [3].

The same neural network configuration was trained several times. For each run, a Confusion matrix was generated. Accuracy and misclassification statistics for each run were compared against each other to determine the overall accuracy of classification using the neural network scheme.

Results

An example of the generated “Confusion matrix” evaluating the neural network classification scheme is shown below. None of the classification runs achieved a classification accuracy greater than 50%; the accuracy was consistently between 38% and 45%.

Figure 11: Confusion Matrices for one Trial Run of the Artificial Neural Network. A matrix is generated for each of the three stages of neural network training: training, validation, and testing. The red boxes indicate the percentage and number of samples that were misclassified at each stage. The green boxes indicate the percentage and number of samples that were classified accurately at each stage. The blue box evaluates the neural network's overall performance at each stage.

The overall accuracy of the neural network classification scheme designed was 45.5% across all three stages.
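A confusion matrix of the kind MATLAB's toolbox plots can be computed directly. This minimal Python sketch uses made-up true/predicted labels (chosen to mirror the ~45.5% overall accuracy reported above, not taken from the actual runs): rows tally the true class, columns the predicted class, and correct classifications accumulate on the diagonal.

```python
import numpy as np

def confusion_matrix(true, pred, n_classes=3):
    # cm[t, p]: number of samples of true class t predicted as class p.
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(true, pred):
        cm[t, p] += 1
    return cm

# Hypothetical labels for the 22 cycles (0 = normal, 1 = wheeze, 2 = crackle).
true = [0] * 8 + [1] * 5 + [2] * 9
pred = [0] * 4 + [2] * 4 + [0] * 3 + [1] * 2 + [1] * 5 + [2] * 4

cm = confusion_matrix(true, pred)
accuracy = np.trace(cm) / cm.sum()   # diagonal = correct classifications
print(cm)
print(f"overall accuracy: {accuracy:.1%}")   # 10 of 22 correct -> 45.5%
```

The off-diagonal entries correspond to the red (misclassification) boxes in the toolbox plot, and the overall accuracy to the blue box.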
Based on these results, the neural network configured in the previous section was unable to reliably classify the input signals. One possible issue with the artificial neural network classification scheme is that an insufficient number of samples were available for neural network training. Only 22 samples (8 normal breathing cycles, 5 wheeze breathing cycles, and 9 crackle breathing cycles) were available for neural network training, validation, and testing. Of these, only 16 samples were used for training, while the remaining 6 samples were used for validation and testing. This means that if, for example, all five wheezing samples were drawn for validation and testing, the neural network would not have trained on the wheezing sounds at all! Future work with wavelet analysis in conjunction with neural network classification will require additional respiratory signal files.

Chapter 4: Conclusions & Future Work

Discussion of Results

Two methods were used to classify various types of respiratory signals, both normal and abnormal. Fourier analysis was used to visually inspect normal sounds, wheezing sounds, and crackles. Application of the Fast Fourier Transform over 50 ms time segments revealed the presence of wheezes in the frequency domain, where they have a major frequency component between 400 Hz and 600 Hz. Because crackles are characterized as discontinuities in the time domain with a wide range of frequency components, the Fourier analysis method was less useful for pinpointing crackles.

Wavelet analysis in conjunction with Artificial Neural Network classification promised to detect both wheezes and crackles simultaneously. These techniques were applied to inspiration/expiration segments of the various sound files obtained from the R.A.L.E. Repository. The initial results from this method have been disappointing, however, with a classification accuracy ranging between 38% and 45.5%.
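The 50 ms FFT screening for wheezes can be sketched as follows. This is a Python sketch rather than the original MATLAB, and the 10-second signal is synthetic: a 500 Hz tone with added noise stands in for a wheeze, and the 11025 Hz sampling rate is an assumption, not taken from the R.A.L.E. files.

```python
import numpy as np

fs = 11025                      # assumed sampling rate (Hz)
seg_len = int(0.05 * fs)        # 50 ms segment = 551 samples

# Synthetic "wheeze": a 500 Hz tone with noise, 10 s long.
rng = np.random.default_rng(0)
t = np.arange(10 * fs) / fs
signal = np.sin(2 * np.pi * 500 * t) + 0.1 * rng.standard_normal(t.size)

def dominant_freq(segment, fs):
    # Frequency bin with the largest FFT magnitude (DC bin excluded).
    spectrum = np.abs(np.fft.rfft(segment))
    freqs = np.fft.rfftfreq(segment.size, d=1 / fs)
    return freqs[1:][np.argmax(spectrum[1:])]

# Flag segments whose dominant frequency falls in the 400-600 Hz wheeze band.
n_segs = t.size // seg_len
flags = [400 <= dominant_freq(signal[i * seg_len:(i + 1) * seg_len], fs) <= 600
         for i in range(n_segs)]
print(f"{sum(flags)} of {n_segs} segments flagged as possible wheeze")
```

A 10-second file at this rate yields 200 such segments, consistent with the segment counts discussed in the future-work section.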
One possible issue with the neural network classification scheme is that too few samples were used for neural network training.

Future Work

It may be possible to build on the success of the Fourier analysis method described earlier to improve the neural network classification scheme. Instead of breaking the R.A.L.E. Repository sounds into inspiration/expiration segments, the sound files could be broken into 50 ms intervals. This would yield approximately 200 segments of data for each of the six 10-second R.A.L.E. Repository sound files used. The application of Fast Fourier Transforms over 50 ms segments has already proven able to distinguish wheezes, crackles, and normal sounds when the data are visually inspected in both the time and frequency domains. This fact can be exploited to preclassify the ~1200 50-ms segments of data, which can then be used to train the neural network as before.

Another possible benefit of breaking the signals into 50 ms segments is that this scale is better suited for wavelet analysis. The time-domain plots of 50 ms intervals reveal smooth normal sound signals, high-frequency wheezes, and clearly discontinuous crackles; at a larger scale, these differences are indistinguishable. The improvement in scale may improve the wavelet coefficient calculation.

One drawback to this method is that it will be a time-intensive, repetitive process, because each of the ~1200 segments must be individually visually inspected. The improvements in the classification scheme will, however, be worth the effort.

References

1. Earis, J.E. (2000). Current methods used for computerized respiratory sound analysis. European Respiratory Review, 10, 586-590.
2. Kandaswamy, A. (2004). Neural classification of lung sounds using wavelet coefficients. Computers in Biology and Medicine, 34, 523-537.
3. Beale, M.H., Hagan, M.T., & Demuth, H.B. (2011). MATLAB Neural Network Toolbox User's Guide.
http://www.mathworks.com/help/pdf_doc/nnet/nnet_ug.pdf
4. Misiti, M., Misiti, Y., Oppenheim, G., & Poggi, J.M. (2012). MATLAB Wavelet Toolbox User's Guide. http://www.mathworks.com/help/pdf_doc/wavelet/wavelet_ug.pdf
5. Moussavi, Z. (2007). Respiratory sound analysis: introduction for the special issue. IEEE Engineering in Medicine and Biology Magazine, 26, 15.
6. Pasterkamp, H., Kraman, S.S., & Wodicka, G. (1997). Respiratory sounds: advances beyond the stethoscope. American Journal of Respiratory and Critical Care Medicine, 156, 975-987.
7. Reichert, S., Gass, R., Brandt, C., & Andres, E. (2008). Analysis of respiratory sounds: state of the art. Clinical Medicine: Circulatory, Respiratory, and Pulmonary Medicine, 2, 45-58.
8. Sovijarvi, A.R.A., Vanderschoot, J., & Earis, J.E. (2000). Standardization of computerized respiratory sound analysis. European Respiratory Review, 10, 585.
9. R.A.L.E. Repository of Respiratory Sounds. (2008). http://www.rale.ca/
10. Hadjileontiadis, L. (2009). Lung Sounds: An Advanced Signal Processing Perspective. Synthesis Lectures on Biomedical Engineering.