
Automated adaptive heart beat interval extraction
from long term noisy and variable ECG signals.
Jorge Torres
Abstract— There are several situations in which a biophysical
measurement can become contaminated with noise. When this
noise is severe, there is a need for algorithms to extract the
important patterns from the signal in question. In this work we
discuss the development of an automated adaptive method to
obtain the time between heart beats from a noisy and highly
variable electrocardiography signal. The proposed method gave
good results when compared with manual extraction (86.6%
coincidence in beat detection, 7.76% difference between
detected means), and can be used if the limitations of the
algorithm are taken into account. Further research should be
conducted to test and improve the robustness of the algorithm.
Index Terms— Adaptive estimation, electrocardiography,
nonlinear dynamics, signal processing.
There are many cases where biophysical measurements
become contaminated with noise, and sometimes these
measurements carry important information about an
experiment or a patient. In recent years, the advent of portable
biomedical devices, such as Holter heart monitors, has
increased our capacity to obtain long-term measurements of
biophysical parameters away from the hospital. In these
long-term recordings, the device transducers may move away
from their original positions, or exhibit conductivity problems
as the patient moves around outside the hospital, thereby
introducing noise into the measurements. These measurements
are valuable because they can contain important information
for patient diagnosis or for research in the measurement and
data processing fields ([1], [2]). It would be extremely time
consuming to repeat the measurements every time such noise
contamination is encountered.
In these situations, the data can be manually categorized and
analyzed by an expert in the field, but it is again time
consuming, and for large sets of data this could represent an
insurmountable quantity of work. Therefore, we propose to
create an automated pattern extraction algorithm, capable of
extracting important features from noisy ECG data.
Previous work has shown good results on digital, two-state
signals [1]. This motivated us to create an algorithm to extract
quasi-periodic features from analog signals. In this case, we
were interested in extracting the time between heart beats
from the noisy signal. The extraction of this parameter is
important to fields like nonlinear dynamics [2], [3], where
measurements of quasi-periodic signals (such as the heart
beat) show long-range correlations, or long-term memory, for
young and healthy individuals.
A. The nature of the data analyzed
The data analyzed come from electrocardiography
measurements. For the aims of this paper, we want to extract
the quasi-periodic temporal content of these signals, so that
we can study the nonlinear dynamics of the data. The signal is
said to be quasi-periodic because there are long-term natural
fluctuations between heart beats [3], and the frequency of the
heart beats can also change with the activity of the subject. In
this particular case, a Holter heart monitor was used to obtain
the data [4]. This is a device intended for ambulatory
electrocardiography, which records heart data during normal
activity over a long period of time (normally a full day). This
is accomplished by attaching electrodes to the chest of the
patient, who carries the recording device along. The specific
device used to obtain the data had a sampling frequency of
128 samples per second.
The noise in the signal has several sources. One of the most
important is poor electrical conduction due to movement of
the electrodes. There is also intrinsic electrical noise from the
device components, and noise induced by the body, first by
its own electrical currents, and also by acting as an antenna
for electromagnetic radiation (such as the electric power
frequency and, to a minor extent, radio signals). The noise
captured by the device, especially when the electrode
conduction is poor, is therefore a combination of different
sources. For this reason, the noise added to the signal is
considered to have a Gaussian distribution, since the sum of
many different noise sources tends to a Gaussian distribution
as the number of sources increases [5]. Some parts of the
algorithm created in this work are based on this assumption.
The ideal signal expected from a Holter monitor is then a
sequence of pulses in time, each corresponding to one heart
beat. Graphs of clean and corrupted Holter measurements are
shown in figure 1. In the clean signal, each heart beat can be
easily distinguished from the peaks of the signal, which can be
detected with a rejection threshold. In the noisy case this
technique would produce many errors because of the distorted
nature of the signal. It can also be seen that low frequency
components of the noise drive the DC component of the
signal up and down randomly. Even though the pulses can
still be distinguished by careful observation, this would be a
long task, as there are several hours of data to characterize.
Here p is the index in the data ‘D’ at which the last beat was
found. Then the differences ‘Y’ around the mean of the
window ‘X’ were computed, in order to find the maximum
difference.
$$Y(i) = X(i) - \bar{X}$$
Finally, to counteract the effects of rising or falling trends in
the signal, we calculated the corresponding linear regression
values of these differences, denoted by ‘YR’, and we obtained
a set of “non trended” differences ‘YN’ by subtracting the
regression values from the trended data.
$$Y_N(i) = Y(i) - Y_R(i)$$
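The two steps just described (differences around the window mean, then removal of a linear trend) can be sketched as follows. This is a minimal illustration in Python rather than the original Matlab code; the function name `detrend_window` is hypothetical, and an ordinary least-squares line over the sample index is assumed for the regression.

```python
def detrend_window(x):
    """Return (Y, YN) for a window x with at least two samples.

    Y(i)  = X(i) - mean(X)
    YN(i) = Y(i) - YR(i), where YR is a least-squares line fitted to Y.
    """
    n = len(x)
    mean_x = sum(x) / n
    y = [v - mean_x for v in x]

    idx = range(1, n + 1)                 # sample index i = 1..N
    mean_i = (n + 1) / 2
    mean_y = sum(y) / n                   # zero up to rounding error

    # Ordinary least-squares slope of Y against the sample index.
    sxx = sum((i - mean_i) ** 2 for i in idx)
    sxy = sum((i - mean_i) * (yi - mean_y) for i, yi in zip(idx, y))
    slope = sxy / sxx

    yr = [mean_y + slope * (i - mean_i) for i in idx]   # regression values YR
    yn = [yi - yri for yi, yri in zip(y, yr)]           # non trended differences
    return y, yn
```

For a window that is a pure ramp, every value of YN is zero up to rounding error, so a peak riding on a drifting baseline is compared against the local trend rather than against the window mean alone.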
Fig. 1 Examples of clean timing signal (top) and noisy signal (bottom)
Simple filtering would not be appropriate to solve this
problem, for several reasons. First, as this is a long-term
signal obtained from electrocardiography, the heart beat period
(and therefore the frequency) inherently changes with the
activity of the individual. If we filtered the signal with fixed
filters, the filtered signal could lose part of the information,
including arrhythmia events that could be important for the
study in progress. Also, as noted above, the noise in this case
has low frequency components, and the signal we are trying to
extract is also of low frequency, so a filter would probably
pass a good amount of noise. A different approach was
therefore needed, and it had to be adaptive to account for
changes in the signal.
As the peaks are still noticeable in most cases in the noisy
signal, we chose to work in the time domain, trying to
simulate what a human operator would do to find the pulses.
First, we know that heart beat intervals normally have
characteristic values: as stated by the Merck manual of
medical information, the normal heart rate for an adult at rest
ranges from 60 to 100 beats per minute, with slightly lower
rates for physically fit persons [6]. We also know that in
normal situations the heart rate is quasi-periodic, so we can
predict the next heart beat from previous ones. Some
situations can cause localized fluctuations (arrhythmias), but
these normally do not appear often unless the subject has
some heart disease. A human takes this knowledge into
consideration when looking at the graph: he or she looks for
periodic patterns of peaks rising from the distorted signal,
with time separations around the common values for normal
heart rates.
From here, we know the longest time to expect before finding
another pulse after the previous one. The first thing we did
was therefore to extract windows ‘X’ from the data collection
‘D’ in which at least one heart beat should appear. We used a
window of N samples (for this work N=170, corresponding to
a minimum of approximately 45 heart beats per minute; most
of the parameters are easily adjustable in the code, written in
Matlab), so the window is defined as:
$$X(i) = D(p + i), \quad i = 1, 2, \ldots, N$$
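As a quick check of the numbers above, the sketch below extracts a window and converts its length into the slowest heart rate it can accommodate. It is an illustration in Python, not the original Matlab code; the function names are hypothetical, and `p` is assumed here to be a 0-based index into the data.

```python
FS = 128   # sampling frequency in samples per second (from the device description)
N = 170    # window length in samples (value used in this work)

def extract_window(d, p, n=N):
    """Return X(i) = D(p + i) for i = 1..n, with p the 0-based index of the last beat."""
    return d[p + 1 : p + 1 + n]

def window_to_bpm(n=N, fs=FS):
    """Heart rate whose beat-to-beat interval is exactly one window long."""
    return 60.0 * fs / n
```

With these values, `window_to_bpm()` gives 60 × 128 / 170 ≈ 45.2 beats per minute, which matches the minimum rate quoted in the text.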
Then, for the purpose of finding a pulse within a window, we
assigned weights to the elements of the window, giving more
weight to elements that could potentially contain the next
expected pulse. Samples only milliseconds away from the
previous beat, or very far from it, are unlikely to be the next
beat; they are more likely spurious repetitions of the previous
beat or the second next beat, so low weight was given to these
zones. Samples about a second away from the previous beat
have a good chance of being the next beat, so more weight
was given to this zone. In fact, we designed three weighting
factors of the same size N as the window, reflecting
probabilistic models of the signal being studied. The first one
was called the discovery weighting factor, and the other two
were called beat seeking weighting factors. The first one is
used in stages where we seem to have lost the beats from
which to start the next window (so we need to discover a
possible beat), and the others are meant for finding probable
beats based on a previous discovery stage.
The first (discovery) weighting factor gives decreasing
weights in a negative slope shape, so more weight is given to
elements of the factor window that are nearer to the beginning.
This is because we want to find the first available pulse (as
we are in the discovery stage).
The second (truncated trapezoid) and third (Gaussian)
weighting factor shapes are chosen so that no weight is given
to elements of the window right next to the previous pulse, in
order to completely reject spurious repetitions of it. More
weight is given to the region where a normal heart beat should
appear, and the weight decreases in zones where, by
experience, another beat is less likely to appear. For the
Gaussian weighting factor, more weight is given to the values
of the window around its mean. The standard deviation is
chosen so that the points where the distribution has half of its
maximum value correspond to the range of 40 heart beats per
minute (100 - 60). This technique is similar to the 3 dB
rejection zone in filter design. The mean is variable, and
depends on a learning adaptive algorithm. The shapes of the
weighting factors are shown in figure 2.
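A rough sketch of the three kernel shapes is given below in Python. The exact breakpoints are not stated in the text, so the dead zone (`dead`), the plateau limits (`lo`, `hi`, taken here as the intervals of 100 and 60 beats per minute at 128 samples per second) and the end values of the ramps are assumptions for illustration only. The Gaussian width assumes that the full width at half maximum equals the difference between the 60 and 100 beats per minute intervals.

```python
import math

FS = 128   # samples per second
N = 170    # kernel length, same as the window

def discovery_kernel(n=N):
    """Negative slope: weight 1 at the start of the window, 0 at the end."""
    return [1.0 - i / (n - 1) for i in range(n)]

def trapezoid_kernel(n=N, dead=30, lo=77, hi=128):
    """Truncated trapezoid: zero right after the previous beat, flat top
    between the assumed 100 bpm (lo) and 60 bpm (hi) intervals."""
    w = []
    for i in range(n):
        if i < dead:
            w.append(0.0)                              # reject spurious repetitions
        elif i < lo:
            w.append((i - dead) / (lo - dead))         # rising edge
        elif i <= hi:
            w.append(1.0)                              # expected region
        else:
            w.append(1.0 - (i - hi) / (n - hi))        # falling edge, cut off by the window
    return w

def gaussian_kernel(mean, n=N, fs=FS):
    """Gaussian centred on the expected interval `mean` (in samples)."""
    fwhm = fs * (60.0 / 60.0 - 60.0 / 100.0)           # 51.2 samples (assumed reading)
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return [math.exp(-0.5 * ((i - mean) / sigma) ** 2) for i in range(n)]
```

In this sketch the Gaussian mean is passed in by the caller, which reflects the statement that the mean is supplied by the adaptive part of the algorithm.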
Then we calculated the weighted differences ‘YW’ from the
non trended values obtained before, multiplying each value of
‘YN’ by the corresponding weight of the chosen weighting
factor ‘WX’. The process of choosing a weighting factor will
be explained later. Figure 3 illustrates the effect of weighting.
$$Y_W(i) = Y_N(i) \, W_X(i)$$
Fig 2. First (top), second (middle) and third (bottom) weighting kernels
The location of the selected beat in the window was given
by the maximum value of the weighted result. As we can see
in figure 3, the weighting process makes pulses around the
expected values more likely to be chosen.
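The weighting and selection step amounts to an element-wise product followed by an arg-max. A minimal Python sketch is shown here; the function name `select_beat` is hypothetical.

```python
def select_beat(yn, weights):
    """Return (index, YW): the position of the largest weighted difference
    and the weighted differences YW(i) = YN(i) * WX(i)."""
    if len(yn) != len(weights):
        raise ValueError("window and kernel must have the same length")
    yw = [a * b for a, b in zip(yn, weights)]
    best = max(range(len(yw)), key=yw.__getitem__)
    return best, yw
```

For example, with `yn = [0, 5, 0, 0, 4, 0]` and `weights = [0, 0.1, 0.5, 1, 1, 0.5]`, the larger raw peak at index 1 is suppressed by its low weight and index 4 is selected instead.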
The adaptive process is one of the most important parts of
the algorithm, as the efficiency of the pulse selection is based
on it. For this purpose a finite stack was created to store past
heart beat interval values; this is the memory used for the
decision making process. When a new element is added to
this stack, the least recent entry is discarded. The stack is
initialized with zeros in all locations, and a zero stands for an
“invalid entry”, meaning that the interval detected for that
location was not reliable and was therefore rejected by some
filter. This stack is the basis of a “learning” process: it accepts
a wide range of inputs when it holds many zeros (low
knowledge), but as it starts learning (capturing nonzero
entries), it becomes more selective about the new values it
accepts. When there are many nonzero values in the stack, the
knowledge of the system is said to be reliable. On the other
hand, when the values that try to enter the stack are
continuously rejected (entering a certain number of zeros,
which triggers a predefined state), the stack infers that the
data analyzed have probably changed. It then switches to a
“relearning” process, in which it becomes less selective about
the values allowed into it, in order to gain new knowledge of
the information that appears to have changed. The learning
state of the stack is defined by the number of zeros in it, and
so is the weighting window selected at each specific
evaluation stage.
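The memory described above behaves like a fixed-length buffer in which zeros mark rejected intervals. The Python sketch below illustrates the idea; the class name, the buffer size, the state names and the thresholds `few` and `many` are all hypothetical, and only the zero count is used to decide the state (the variance and recency checks mentioned elsewhere in the text are left out).

```python
from collections import deque

class IntervalStack:
    """Fixed-length memory of beat intervals; 0 marks an invalid entry."""

    def __init__(self, size=10):
        # Starts with no knowledge: every slot is an invalid entry.
        self.values = deque([0] * size, maxlen=size)

    def push(self, interval, accepted=True):
        # Appending to a full deque discards the least recent entry.
        self.values.append(interval if accepted else 0)

    def zeros(self):
        return sum(1 for v in self.values if v == 0)

    def state(self, few=2, many=7):
        """Map the number of zeros to a learning state (assumed thresholds)."""
        z = self.zeros()
        if z >= many:
            return "discovery"   # low knowledge: use the first kernel
        if z > few:
            return "seeking"     # partial knowledge: use the second kernel
        return "refined"         # reliable knowledge: use the third kernel
```

A run of rejected intervals pushes zeros into the buffer, so the state falls back toward a less selective kernel, which is the relearning behaviour described in the text.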
At the beginning, the stack is filled with zeros, because the
algorithm has no valid knowledge of the data to be evaluated.
Then the search for pulses begins. The first (discovery) kernel
has a shape intended for capturing high peaks, and gives
preference to peaks near the origin. When a probable beat is
found, a hypothesis test is done, taking 30 samples from each
side of the pulse, and the pulse itself, as a population. The
hypothesis is that the pulse value is part of the population (the
surrounding noise), which is modeled by a Gaussian
distribution, because of the assumption explained in the
problem statement. If the hypothesis is rejected, the beat
found is accepted (if the pulse has a low probability of being
part of the surrounding noise, then it is an outstanding peak).
We record that the pulse was accepted (or we continue the
search until we find an acceptable pulse), and then we look
for more pulses in the same fashion. The second pulse is
evaluated and, if it is accepted, we check whether it falls
within a reasonable separation from the previous supposed
heart beat (basically, not very near). If so, we store a nonzero
value in the stack (the distance between the beats found). If
not, we continue the search in this fashion until we start
finding a decent succession of pulses (indicated by the number
of nonzero values in the stack). If a relatively small number of
nonzero values are found among the most recent values of the
stack, and the variance calculated from the stack (not taking
the zeros into account) is below a threshold, we start using the
second weighting kernel. Any beat found is recorded in a
vector as the output of the algorithm. A flag system is also
used: for each beat recorded, a flag is stored. A flag of 0
indicates that the beat found was not rejected by the
discrimination process of the specific knowledge level. If the
flag has a nonzero value, the pulse was rejected, and the value
of the flag indicates the criterion of rejection.
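The peak acceptance test can be sketched as a z-score check against the local population. The text does not give the significance level or the exact statistic, so the critical value `z_crit` and the use of the population standard deviation are assumptions; the function name is hypothetical.

```python
import statistics

def is_outstanding_peak(signal, k, half_width=30, z_crit=3.0):
    """Accept the candidate at index k if it is unlikely to belong to the
    Gaussian-modeled population formed by the samples on each side of it
    and the candidate itself."""
    lo = max(0, k - half_width)
    hi = min(len(signal), k + half_width + 1)
    population = signal[lo:hi]            # 30 samples per side plus the pulse

    mu = statistics.fmean(population)
    sd = statistics.pstdev(population)
    if sd == 0:
        return False                      # flat segment: nothing stands out

    z = (signal[k] - mu) / sd
    return z > z_crit                     # hypothesis rejected: accept the beat
```

Near the ends of the record the population is simply truncated in this sketch, which is one of several reasonable choices.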
Fig 3. Example of weighting effect. Top: non trended differences, Middle:
weighting kernel, bottom: weighted result
In the second weighting kernel, as can be seen in figure 2,
we give more weight to pulses that fall within the expected
values for normal heart beats. In this stage we also check the
integrity of the pulse using the expected value obtained from
our stack (explained later). To see whether the pulse fell on
expected values for normal heart beats, we perform a
hypothesis test of whether the heart beat interval found in this
window falls within a normal distribution whose mean is the
expected value obtained from the elements in the stack. In
this fashion we accept or reject beat intervals in the stack.
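A possible form of this interval test is sketched below. The text states only that the mean of the normal distribution is the expected value from the stack; taking the spread from the nonzero stack entries, the floor `min_sd` and the critical value `z_crit` are assumptions made for the example, and the function name is hypothetical.

```python
import statistics

def interval_is_plausible(interval, expected, history, z_crit=2.0, min_sd=2.0):
    """Two-sided test: is `interval` consistent with a normal distribution
    centred on `expected`? Intervals are in samples; zeros in `history`
    are invalid entries and are ignored."""
    valid = [v for v in history if v != 0]
    sd = statistics.pstdev(valid) if len(valid) >= 2 else min_sd
    sd = max(sd, min_sd)                  # avoid an overly narrow acceptance band
    return abs(interval - expected) / sd <= z_crit
```

An accepted interval would be stored in the stack as a nonzero value, and a rejected one as a zero.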
Again, if the number of nonzero elements in the stack rose to
a certain threshold, and the variance of the elements in the
stack was below a limit, we started to use the third weighting
kernel (Gaussian shaped), the mean of this kernel being the
expected value calculated from the stack. This is done because
a large number of nonzero values indicates good confidence
for the algorithm: the rejection method used at each level
ensures that we refine the search toward a more “intelligent”
extraction as the stack learns from the data being processed.
On the other hand, this specificity gives rise to more
selectivity for the values admitted into the stack. If the values
are being rejected (zeros in the stack), this can indicate that
heavy noise is invading that particular region of the signal,
which reduces the confidence level of the stack and switches
it to a less specific weighting kernel. In cases where the signal
becomes very distorted this is good, as the Gaussian rejects
many values through the weighting process; the relearning
stage ensures that we are not “finding what we are expecting
to find” by only using the Gaussian kernel.
For estimating the expected value from the memory in the
stack, we had three cases, based again on the learning level of
the stack. If the stack did not contain reliable knowledge (too
many zeros in it, or only the least recent elements being
nonzero), a 0 was returned, indicating a non reliable estimated
value. With a middle reliability level (some zeros in the stack,
and specifically some of them among the most recent values),
a weighted integration was performed over the values, similar
to an average, but with the more recent values counted more
times (zeros were not taken into account in this process).
Finally, if the stack contained highly reliable knowledge
(almost no zeros in the stack, and specifically none among the
most recent values), a prediction for the next possible value
was calculated using linear regression over the stack values,
without taking zeros into account. This last case was
especially important for the third weighting kernel, as its
mean should be as near as possible to the next value, so a
prediction was indispensable. On the other hand, with many
gaps in the linear model the prediction is not reliable, so the
weighted integration is preferred in those cases. The whole
process just described is presented as an oversimplified
scheme in figure 4.
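The three cases can be sketched as a single function. The thresholds that separate the cases are not given in the text, so the rules used here (more than half zeros, zeros among the last `recent` entries, more than one zero overall) are assumptions, as are the linearly increasing recency weights. The stack is passed as a list ordered from least to most recent.

```python
def expected_interval(stack, recent=3):
    """Estimate the next beat interval from the stack (0 = invalid entry)."""
    n = len(stack)
    pts = [(i, v) for i, v in enumerate(stack) if v != 0]
    zeros = n - len(pts)
    recent_vals = stack[-recent:]

    # Case 1: unreliable knowledge, return 0 as "no estimate".
    if zeros > n // 2 or all(v == 0 for v in recent_vals):
        return 0.0

    # Case 2: middle reliability, recency-weighted average of valid entries.
    if zeros > 1 or any(v == 0 for v in recent_vals):
        total = sum((i + 1) * v for i, v in pts)
        return total / sum(i + 1 for i, _ in pts)

    # Case 3: reliable knowledge, least-squares line extrapolated one step.
    mx = sum(i for i, _ in pts) / len(pts)
    my = sum(v for _, v in pts) / len(pts)
    if len(pts) < 2:
        return my
    sxx = sum((i - mx) ** 2 for i, _ in pts)
    slope = sum((i - mx) * (v - my) for i, v in pts) / sxx
    return my + slope * (n - mx)
```

For a steadily lengthening interval such as `[100, 101, ..., 107]`, the regression case predicts 108, while a plain average would lag behind at 103.5; this is why the prediction matters for centring the Gaussian kernel.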
Fig. 4. Adaptive scheme oversimplified block diagram
So, as we can see, the proposed method accounts for
different situations and states of the data being processed,
creating an adaptive framework in order to achieve optimized
extraction of relevant data from the signal.
In the tests where only one of the last two weighting kernels
was used (no kernel adaptation), the results were as follows.
When only the second weighting kernel was used, 378 beats
out of 435 counted (86.9%) were not flagged; as expected, the
performance was reduced by using only one of the kernels.
Surprisingly, when only the third (Gaussian) weighting kernel
was used, we obtained 351 non flagged beats out of 414
counted (84.8%), a discrepancy in the number of beats
counted. We analyzed some of the discovered beats, and some
of them indeed differed from the visually detected beats. This
is because the Gaussian shape tends to find what it is
expecting to find, following the trend of the mean in certain
cases. The adaptation of the algorithm therefore appears to
give good results.
To assess the algorithm, a test was run over the selected
data, comparing the algorithm results against manual (human)
extraction, as suggested in previous work [1]. The human
operator extracted 440 pulses from the sequence (vs. 435 for
automated extraction). We also compared the sets of positions
for manual and automated extraction: 381 of the 440 pulses
were coincident (86.6%), and the absolute difference between
the means of the two methods was only 7.76%. A final note is
that the flag error marking system could be a good indicator,
for a human operator, of points of interest (possible errors)
after automated classification.
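The comparison metrics can be reproduced with a few lines of Python. The matching tolerance `tol` is an assumption (the text does not say how close two positions must be to count as coincident), the function names are hypothetical, and the beat positions in the usage note are made up for illustration, not taken from the data set.

```python
from bisect import bisect_left

def count_coincident(manual, auto, tol=2):
    """Count manual beat positions that have an automated beat within
    +/- tol samples (simple matching, not one-to-one)."""
    auto = sorted(auto)
    hits = 0
    for m in manual:
        j = bisect_left(auto, m - tol)        # first automated beat >= m - tol
        if j < len(auto) and auto[j] <= m + tol:
            hits += 1
    return hits

def mean_interval(beats):
    """Mean beat-to-beat interval, in samples, of a list of beat positions."""
    beats = sorted(beats)
    return (beats[-1] - beats[0]) / (len(beats) - 1)

def percent_difference(value, reference):
    return 100.0 * abs(value - reference) / reference
```

As a usage example, `count_coincident([100, 228, 356, 484], [101, 230, 300, 420, 484])` returns 3. The reported coincidence figure follows from the counts in the text: 100 × 381 / 440 ≈ 86.6%.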
The results obtained show that the signal extraction
algorithm performs well (86.6% coincidence with human
extraction, and 7.76% difference between means). Although it
does not work perfectly, it could be used if the limitations of
the algorithm are taken into account. The flagging system also
proves to be a good indicator of failure for the algorithm, and
the adaptive nature of the algorithm led to improved
performance. Future work should include tests on variations
of the adjustable parameters of the algorithm and of the shapes
of the weighting kernels. Further study can also be conducted
on the adaptive algorithms proposed. Finally, work in progress
includes the combination of signals from multiple channels
obtained from the Holter device, which can be combined by
statistical means to obtain a cleaner signal to analyze.
We conducted tests over 32768 selected samples presenting
noise contamination. We ran different stages of the algorithm
to assess the performance of the methods used.
For the final adaptive method, 387 beats out of 435 detected
(89.0%) were not flagged; the non flagged status indicates that
a pulse has a good chance of having been accurately detected.
Then we conducted tests in which only one of the last two
weighting kernels was used.
