Automated adaptive heart beat interval extraction from long-term noisy and variable ECG signals

Jorge Torres

Abstract— There are several situations in which a biophysical measurement can become contaminated with noise. When this noise is severe, algorithms are needed to extract the important patterns from the signal in question. In this work we discuss the development of an automated adaptive method to obtain the time between heart beats from a noisy and highly variable electrocardiography (ECG) signal. The proposed method gave good results when compared with manual extraction (86.6% coincidence for beat detection, 7.76% difference between detected means), and can be used if the limitations of the algorithm are taken into account. Further research should be conducted to test and improve the robustness of the algorithm.

Index Terms— Adaptive estimation, electrocardiography, nonlinear dynamics, signal processing.

I. INTRODUCTION

THERE are many cases where biophysical measurements become contaminated with noise, and these measurements often carry important information about an experiment or a patient. In recent years, the advent of portable biomedical devices, such as Holter heart monitors, has increased our capacity to obtain long-term measurements of biophysical parameters away from the hospital. In these long-term recordings, the device transducers may move away from their original positions, or exhibit conductivity problems as the patient walks away from the hospital, thereby introducing noise into the measurements. These measurements are important, as they can contain valuable information for patient diagnosis or for research in the measurement and data processing fields ([1], [2]). It would be extremely time consuming to repeat measurements every time we encountered such noise contamination.
In these situations, the data can be manually categorized and analyzed by an expert in the field, but this is again time consuming, and for large data sets it could represent an insurmountable quantity of work. Therefore, we propose an automated pattern extraction algorithm, capable of extracting important features from noisy ECG data. Previous work has shown good results on digital, two-state signals [1]. This motivated us to create an algorithm to extract quasi-periodic features from analog signals. In this case, we were interested in extracting the time between heart beats from the noisy signal. The extraction of this parameter is important to fields like nonlinear dynamics [2], [3], where measurements of quasi-periodic signals (as the heart beat is) show long-range correlations, or long-term memory, for young and healthy individuals.

II. PROBLEM STATEMENT

A. The nature of the data analyzed

The data being analyzed come from electrocardiography measurements, and for the aims of this paper we are trying to extract the quasi-periodic temporal content of these signals, so that we can study the nonlinear dynamics of the data. The signal is said to be quasi-periodic because there are long-term natural fluctuations between heartbeats [3], and the frequency of the heart beats can change with the activity of the subject. In this particular case, a Holter heart monitor was used to obtain the data [4]. This is a device intended for ambulatory electrocardiography, which records heart data during normal activity for a long period of time (normally a full day). This is accomplished by attaching electrodes to the chest of the patient, who carries the recording device along. The specific device used to obtain the data had a sampling frequency of 128 samples per second. The noise in the signal has several sources.
One of the most important sources of noise is poor electrical conduction due to movement of the electrodes. However, there is also intrinsic electrical noise from the device components, as well as noise induced by the body, first by its own electrical currents, and also by the body acting as an antenna for electrical radiation (electric power frequency and, to a minor extent, radio signals). The noise captured by the device, especially when the conductivity of the signal is poor, is thus a combination of different sources. Therefore, the noise added to the signal is considered to have a Gaussian distribution, as it is known that when different sources of noise are added, the resulting noise tends to Gaussian as the number of noise sources increases [5]. Some parts of the algorithm created in this work are based on this assumption.

The ideal signal expected from a Holter is then a sequence of pulses in time, each corresponding to a heart beat. Graphs from clean and corrupted Holter measurements are shown in figure 1. In the case of the clean signal, the event of a heart beat can easily be distinguished from the peaks of the signal, which can be detected with a rejection threshold; in the noisy case, however, this technique would lead to many errors because of the distorted nature of the signal. It can also be seen that low-frequency components of the noise drive the DC component of the signal up and down randomly. Even though the pulses can still be distinguished by careful observation, this would be a long task, as it implies several hours of data to characterize.

Fig. 1. Examples of clean timing signal (top) and noisy signal (bottom).

III. PROPOSED METHOD

Simple filtering would not be appropriate to solve this problem, for several reasons. First, as this is a long-term signal obtained from electrocardiography, the heart beat period (and therefore the frequency) inherently changes with the activity of the individual, so if we filtered the signal with fixed filters, the filtered signal could lose part of the information, including arrhythmia events which could be important for the study in course. Also, as shown above, the noise in this case has low-frequency components, and the signal we are trying to extract is itself of low frequency, so a filter would probably capture a good amount of noise. A different approach therefore had to be used, one adaptive enough to account for changes in the signal. As the peaks are still noticeable in most parts of the noisy signal, we decided to work in the time domain, trying to simulate what a human operator would do to find the pulses.

First, we know that heart beat time intervals normally have characteristic values: as stated by the Merck manual of medical information, the normal heart rate for an adult at rest ranges from 60 to 100 beats per minute, with slightly lower rates for physically fit persons [6]. We also know that in normal situations the heart beat rate is quasi-periodic, so we can predict the next heart beat from previous ones. There could be situations that cause localized fluctuations (arrhythmias), but normally these fluctuations will not appear often unless the subject is experiencing some heart disease. A human operator takes this knowledge into consideration when looking at the graph: he will look for periodic patterns of peaks rising from the distorted signal, with time separations around the common values for normal heart rates.

From this, we know the lowest expected time separation between one pulse and the next, so the first step was to extract windows 'X' from the data collection 'D' in which at least one heart beat should appear. We used a window of size N samples (in this work N = 170, corresponding to a minimum of approximately 45 heart beats per minute; most of the parameters are easily adjustable in the code, written in Matlab). The window is defined as:

X(i) = D(p + i),   i = 1, 2, ..., N

where p is the index of the data sample where the last beat was found. Then, differences 'Y' around the window mean X̄ were computed, in order to find the maximum difference:

Y(i) = X(i) − X̄

Finally, to counteract the effects of rising or falling trends in the signal, we calculated the corresponding linear regression values of these differences, denoted 'YR', and obtained a set of detrended differences 'YN' by subtracting the regression values from the trended data:

YN(i) = Y(i) − YR(i)

Then, for the purpose of finding a pulse within a window, we decided to assign weights to the elements in the window, giving more weight to elements that could potentially contain the next expected pulse. For example, samples separated by only milliseconds from the previous beat are not likely to be another beat, but rather spurious repetitions of the previous one, while samples very far away are more likely the second next beat, so low weight was given to these zones; samples separated by about a second from the previous beat have good chances of being the next beat, so more weight was given to this zone. We designed three weighting factors of the same size N as the selected window, reflecting probabilistic models of the signal being studied. The first one is called the discovery weighting factor, and the other two are called beat-seeking weighting factors.
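The window extraction, differencing, and detrending steps described above can be sketched as follows. This is a minimal Python/NumPy illustration; the original implementation was written in Matlab, and the function name is ours:

```python
import numpy as np

def detrended_differences(D, p, N=170):
    """Extract a window of N samples after index p, take differences
    around the window mean, and remove linear trends by subtracting a
    least-squares regression line fitted to those differences."""
    X = np.asarray(D[p + 1 : p + 1 + N], dtype=float)  # X(i) = D(p + i)
    Y = X - X.mean()                                   # Y(i) = X(i) - mean(X)
    i = np.arange(1, N + 1)
    slope, intercept = np.polyfit(i, Y, 1)             # linear regression on Y
    YR = slope * i + intercept                         # trend values
    YN = Y - YR                                        # detrended differences
    return YN
```

The weighting of these detrended differences by the discovery and beat-seeking kernels is described next.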
The discovery weighting factor is used in stages where the beats appear to have been lost, so a possible beat must be rediscovered to start the next window; the beat-seeking factors are meant for finding probable beats based on a previous discovery stage. The first (discovery) weighting factor gives decreasing weights in a negative-slope shape, so more weight is given to elements nearer to the beginning of the window; this is because, in the discovery stage, we want to find the first available pulse. The second (truncated trapezoid) and third (Gaussian) weighting factor shapes are selected so that no weight is given to elements of the window immediately after the previous pulse, in order to completely reject spurious repetitions of it. More weight is given to the region where a normal heart beat should appear, and the weight is decreased in zones where, by experience, another beat is less likely to appear. The Gaussian weighting factor gives more weight to the values of the window around its mean. Its standard deviation is chosen so that the points where the distribution has half of its maximum value span the 40 beats-per-minute width of the normal range (100 − 60); this technique is similar to the 3 dB rejection zone in filter design. The mean is variable, depending on a learning adaptive algorithm. The shapes of the weighting factors are shown in figure 2.

Then we calculated the weighted differences 'YW' from the detrended values obtained before, multiplying each value of 'YN' by the corresponding weight of the chosen weighting factor 'WX' (the process of choosing a weighting factor is explained later). Figure 3 illustrates the effect of weighting.

YW(i) = YN(i) · WX(i)

Fig. 2. First (top), second (middle), and third (bottom) weighting kernels.

The location of the selected beat in the window was given by the maximum value of the weighted result.
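Under the stated assumptions (128 Hz sampling, N = 170), the three weighting kernels and the weighted selection might be sketched as below. The trapezoid's exact break points and ramp widths are illustrative guesses, not the paper's values; only the Gaussian's half-maximum rule is taken from the text:

```python
import numpy as np

FS = 128  # Holter sampling frequency (samples per second)
N = 170   # window length in samples

def discovery_kernel(n=N):
    # Decreasing ramp: favors the first available peak in the window.
    return np.linspace(1.0, 0.0, n)

def trapezoid_kernel(n=N, fs=FS):
    # Zero weight right after the previous pulse (rejects spurious
    # repetitions), full weight over the normal 60-100 bpm region,
    # short decaying ramps elsewhere. Break points are illustrative.
    w = np.zeros(n)
    lo, hi = int(60.0 / 100 * fs), int(60.0 / 60 * fs)  # 100 bpm .. 60 bpm
    w[lo:hi] = 1.0
    ramp = int(0.1 * fs)
    w[lo - ramp:lo] = np.linspace(0.0, 1.0, ramp)
    w[hi:hi + ramp] = np.linspace(1.0, 0.0, ramp)
    return w

def gaussian_kernel(mean_samples, n=N, fs=FS):
    # Std chosen so the half-maximum points span the 40 bpm-wide normal
    # range (100 - 60 bpm), analogous to a 3 dB band edge.
    half_width = (60.0 / 60 - 60.0 / 100) * fs / 2.0  # samples at half max
    sigma = half_width / np.sqrt(2.0 * np.log(2.0))
    i = np.arange(n)
    return np.exp(-((i - mean_samples) ** 2) / (2.0 * sigma ** 2))

def select_beat(YN, WX):
    YW = YN * WX  # YW(i) = YN(i) * WX(i)
    return int(np.argmax(YW))
```

Note how the trapezoid suppresses a large spurious peak right after the previous beat while preserving peaks in the plausible interval range.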
As we can see in figure 3, the weighting process gives pulses around the expected values a higher probability of being chosen. The adaptive process is one of the most important parts of the algorithm, as it is the core on which the efficiency of the pulse selection is based. For this purpose, a finite stack was created to store past heart beat interval values; this is the memory used for the decision-making process. When a new element is added to this stack, the least recent entry is discarded. The stack is initialized with zeros in all locations, and a zero is accounted an "invalid entry," meaning that the interval detected at that location was not reliable and was therefore rejected by some filter. This stack is the base of a "learning" process: it accepts a wide range of inputs when it contains many zeros (low knowledge), but as it starts learning (capturing nonzero entries), it becomes more selective about the new values accepted into it. When there are many nonzero values in the stack, the knowledge of the system is considered reliable. On the other hand, when candidate values are continuously rejected (letting a certain amount of zeros into the stack, which triggers a predefined state), the stack knows that the data analyzed have probably changed, and switches to a "relearning" process, in which it becomes less selective about the values allowed in, in order to gain new knowledge of the information that appears to have changed. The learning state of the stack is defined by the number of zeros in it, and so is the weighting window selected at each evaluation stage. At the beginning, the stack is filled with zeros, because the algorithm has no valid knowledge of the data to be evaluated. Then the search for pulses begins; the first (discovery) kernel has a shape intended for capturing high peaks, giving preference to peaks near the origin.
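A minimal sketch of the interval stack follows, assuming an illustrative size and knowledge thresholds; the paper does not specify these exact values:

```python
from collections import deque

class IntervalStack:
    """Finite memory of past beat intervals. A zero marks an invalid
    (rejected) entry; the proportion of nonzero entries defines the
    confidence ("knowledge") level of the algorithm. Size and state
    thresholds here are illustrative, not the original paper's values."""

    def __init__(self, size=10):
        # Initialized with zeros: the algorithm starts with no knowledge.
        self.buf = deque([0.0] * size, maxlen=size)

    def push(self, interval):
        # Appending discards the least recent entry automatically.
        self.buf.append(float(interval))

    def knowledge(self):
        nonzero = sum(1 for v in self.buf if v != 0.0)
        return nonzero / len(self.buf)

    def state(self):
        k = self.knowledge()
        if k < 0.3:
            return "discovery"  # little valid knowledge: be permissive
        elif k < 0.8:
            return "learning"
        return "confident"      # high knowledge: be selective
```

Continuous rejections push zeros into the stack, which lowers the knowledge level and automatically triggers the relearning behavior described above.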
When a probable beat is found, a hypothesis test is performed, taking the 30 samples on each side of the pulse, plus the pulse itself, as a population. The hypothesis is that the pulse value is part of this population (the surrounding noise), which is modeled by a Gaussian distribution because of the assumption explained in the problem statement. If the hypothesis is rejected, the beat found is accepted (if the pulse has low probability of being part of the surrounding noise, it is an outstanding peak); we record that the pulse was accepted, or continue the search until we find an acceptable pulse. We then look for more pulses in this fashion: the second pulse is evaluated, and this time, if it is accepted, we check whether it falls within a reasonable separation from the previously supposed heart beat (basically, not very near). If so, we store a nonzero value in the stack (the distance between the beats found); if not, we continue the search in this fashion until we start finding a decent pulse succession (indicated by the number of nonzero values in the stack). Once a relatively small number of nonzero values appear among the most recent values of the stack, and the variance calculated from the stack (not taking the zeros into account) is below a threshold, we start using the second weighting kernel. Any beat found is recorded in a vector as the output of the algorithm. A flag system is also used: for each beat recorded, a flag is stored. A flag of 0 indicates that the beat found was not rejected by the discrimination process of the specific knowledge level; a nonzero flag indicates that the pulse was rejected somehow, and the value of the flag indicates the criterion of rejection.

Fig. 3. Example of weighting effect.
Top: detrended differences; middle: weighting kernel; bottom: weighted result.

In the second weighting kernel, as can be seen in figure 2, we give more weight to pulses that fall within the expected values for normal heart beats. In this stage we also check the integrity of the pulse using the expected value obtained from our stack (explained later). To see whether the pulse fell on expected values for normal heart beats, we perform a hypothesis test, checking whether the heart beat interval found in this window falls within a normal distribution whose mean is the expected value obtained from the elements in the stack; in this fashion we accept or reject beat intervals in the stack. Again, if the number of nonzero elements in the stack rises above a certain threshold, and the variance of the elements in the stack is below a limit, we start using the third (Gaussian-shaped) weighting kernel, its mean being the expected value calculated from the stack. This is done because a large number of nonzero values indicates good confidence for the algorithm, as the rejection method used at each level ensures that we refine the search toward more "intelligent" extraction as the stack learns from the data being processed. On the other hand, this specificity gives rise to more selectivity for the values admitted into the stack, so if values are being continuously rejected (zeros in the stack), it can be an indication that hard noise is invading that particular region of the signal, leading to a reduction of the confidence level of the stack and a switch to a less specific weighting kernel. In cases where the signal becomes very distorted this is desirable, as the Gaussian kernel rejects many values through the weighting process, so this relearning stage ensures that we are not "finding what we expect to find" by only using the Gaussian kernel.
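The decision machinery of the last few paragraphs can be sketched as follows. This is an illustrative Python sketch (the original code was in Matlab); the function names and all thresholds (z_crit, count_hi, count_mid, var_limit, and the reliability cutoffs) are our own assumptions, not the paper's exact values:

```python
import numpy as np

def is_outstanding_peak(signal, idx, half=30, z_crit=2.58):
    """Test whether the sample at idx stands out from its surroundings.
    The 30 samples on each side, plus the sample itself, form the noise
    population, modeled as Gaussian; if the candidate's z-score exceeds
    the critical value, the hypothesis that it belongs to the noise is
    rejected and the peak is accepted."""
    lo, hi = max(idx - half, 0), min(idx + half + 1, len(signal))
    pop = np.asarray(signal[lo:hi], dtype=float)
    mu, sd = pop.mean(), pop.std()
    if sd == 0:
        return False
    return abs(signal[idx] - mu) / sd > z_crit

def choose_kernel(intervals, count_hi=7, count_mid=4, var_limit=0.04):
    """Select a weighting kernel from the stack contents (zeros ignored):
    few valid intervals -> discovery kernel; a moderate consistent run ->
    trapezoid kernel; many consistent low-variance intervals -> Gaussian
    kernel centered on the predicted next interval."""
    vals = [v for v in intervals if v != 0.0]
    if len(vals) >= count_hi and np.var(vals) < var_limit:
        return "gaussian"
    if len(vals) >= count_mid and np.var(vals) < var_limit * 4:
        return "trapezoid"
    return "discovery"

def expected_interval(intervals, min_reliable=4, min_predict=8):
    """Estimate the next interval from the stack (zeros ignored): too few
    valid entries -> 0 (unreliable); moderate knowledge -> recency-weighted
    average; high knowledge -> linear-regression prediction of the next
    value."""
    vals = np.array([v for v in intervals if v != 0.0])
    if len(vals) < min_reliable:
        return 0.0
    if len(vals) < min_predict:
        w = np.arange(1, len(vals) + 1)  # more weight to recent values
        return float(np.average(vals, weights=w))
    x = np.arange(len(vals))
    slope, intercept = np.polyfit(x, vals, 1)
    return float(slope * len(vals) + intercept)  # predict the next point
```

With this structure, heavy rejection (zeros) automatically demotes the kernel toward discovery, while a consistent run of accepted intervals promotes it toward the Gaussian kernel centered on the predicted next beat.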
Now, for estimating the expected value from the memory in the stack, we had three cases, based again on the learning level of the stack. If the stack did not contain reliable knowledge (too many zeros in it, or zeros among the most recent elements), a 0 was returned, indicating an unreliable estimate. With a middle reliability level (some zeros in the stack, specifically some of them among the most recent values), a weighted integration was performed over the values, similar to an average, but with the more recent values weighted more heavily (zeros not taken into account). Finally, if the stack contained highly reliable knowledge (almost no zeros, specifically none among the most recent values), a prediction for the next possible value was calculated using linear regression over the stack values, again ignoring zeros. This last case is especially important for the third weighting kernel, as its mean should be as near as possible to the next value, so a prediction is indispensable. On the other hand, with many gaps in the linear model the prediction is not reliable, so the weighted integration is preferred in those cases. The whole process just described is presented in an oversimplified scheme in figure 4.

Fig. 4. Adaptive scheme: oversimplified block diagram.

As we can see, the proposed method accounts for different situations and states of the data being processed, creating an adaptive framework in order to achieve optimized extraction of relevant data from the signal.

IV. RESULTS

We conducted tests over 32768 selected samples presenting noise contamination. We ran different stages of the algorithm to assess the performance of the methods used. For the final adaptive method, 387 beats out of 435 detected (89.0%) were not flagged; the non-flagged status indicates that a pulse has a good chance of having been accurately detected. We then conducted tests where only one of the two last weighting kernels was used (no kernel adaptation). For the case in which only the second weighting kernel was used, 378 beats out of 435 counted (86.9%) were not flagged; as expected, the performance was reduced using only one of the kernels. Surprisingly, for the case where only the third (Gaussian) weighting kernel was used, we obtained 351 non-flagged beats out of 414 counted (84.8%), presenting a discrepancy in the number of beats counted. We analyzed some of the discovered beats, and some of them indeed differed from the visually detected beats; this is because the Gaussian shape tends to find what it expects to find, following the trend of the mean in certain cases. The adaptation of the algorithm thus appears to give good results.

For assessment of the algorithm, a test was run over the selected data comparing the algorithm's results against manual (human) extraction, as suggested in previous work [1]. The human operator extracted 440 pulses from the sequence (vs. 435 for automated extraction). We also compared the sets of positions for manual and automated extraction: 381 of the 440 pulses were coincident (86.6%), and the absolute difference between the means of the two extractions was only 7.76%. A final note is that the error-flag marking system could be a good indicator for a human operator of points of interest (possible errors) after automated classification.

V. CONCLUSIONS

The results obtained show that the signal extraction algorithm performs well (86.6% coincidence with human extraction, and 7.76% difference between means), and although it does not work perfectly, it can be used if the limitations of the algorithm are taken into account. The flagging system also proves to be a good indicator of failure for the algorithm. We further note that the adaptive nature of the algorithm led to improved performance. Future work should include tests on variations of the adjustable parameters of the algorithm and of the shapes of the weighting kernels, and further study can be conducted on the adaptive algorithms proposed. Finally, work in progress includes the combination of signals from multiple channels obtained from the Holter device, which can be combined by statistical means to obtain a cleaner signal to analyze.

REFERENCES

[1] T. Chau and S. Rizvi, "Automatic interval extraction from long, highly variable and noisy gait timing signals," Human Movement Science, vol. 21, 2002, Elsevier Science B.V., pp. 495-513.
[2] P. Ch. Ivanov, Z. Chen, K. Hu, and H. E. Stanley, "Multiscale aspects of cardiac control," Physica A, vol. 344, 2004, Elsevier B.V., pp. 685-704.
[3] A. L. Goldberger, L. A. N. Amaral, J. M. Hausdorff, P. Ch. Ivanov, C.-K. Peng, and H. E. Stanley, "Fractal dynamics in physiology: Alterations with disease and aging," PNAS, Feb. 2002, vol. 99, suppl. 1, pp. 2466-2472.
[4] K. Payne, "Ambulatory Electrocardiogram," WebMD Health. [Online]. Last update January 26, 2004. Available: http://www.webmd.com/hw/heart_disease/aa10253.asp
[5] J. G. Proakis, Digital Communications, 3rd ed., McGraw-Hill, pp. 11, 61-62.
[6] R. Berkow (ed.), The Merck Manual of Medical Information, second home edition, Merck, section 3, chapter 27, "Abnormal Heart Rhythms."