K-means Clustering for Sleep Spindles Classification

International Journal of Information Technology & Computer Science ( IJITCS ) (ISSN No : 2091-1610 )
Volume 10 : Issue No : 3 : Issue on : July / August , 2013
Joao Caldas da Costa
EIM Training and UNINOVA, University Nova of Lisbon
Gold Coast, Australia
joao.caldas.costa@gmail.com
Manuel Duarte Ortigueira
UNINOVA and Department of Electrical Engineering, University Nova of Lisbon
Lisbon, Portugal
mdo@fct.unl.pt
Arnaldo Guimarães Batista
UNINOVA and Department of Electrical Engineering, University Nova of Lisbon
Lisbon, Portugal
agb@fct.unl.pt
Abstract :
Changes in EEG sleep spindles constitute a promising indicator of sleep disorders. In this paper, sleep
spindles are extracted from real EEG data from patients suffering from brain illnesses of any kind, using a
triple (STFT, WT and WMSD) algorithm for sleep spindle detection. Its performance is studied
and quantified. After detection and isolation, an ARMA model is applied to each spindle. The mean of the
ARMA model parameters over all the spindles detected for each patient is computed and,
finally, these parameters are used in a k-means clustering classification algorithm to assign a given illness to
each patient.
Keywords - ARMA; Sleep Spindles; EEG; k-means clustering
I. INTRODUCTION
Sleep spindles (SS) are particular EEG patterns which occur during the sleep cycle with center
frequency in the band 11.5 to 15 Hz. They are used as one of the features to classify the sleep stages [1]. Sleep
spindles are promising objective indicators of sleep disorders. In order to interpret them, their structure needs to
be clarified or a suitable model needs to be found. The correct detection of human SS and their subsequent
characterization can lead to early detection of changes in the brain and prevent or, at least, mitigate the influence of
certain diseases [2].
Three methods have been used for SS detection. The Short Time Fourier Transform (STFT) method
relies on the fact that, after the transform has been applied to a signal containing a SS, a peak will occur in the SS
frequency range. The Wavelet Transform (WT) method uses the normalized wavelet power to detect sleep spindles.
Wave Morphology for Spindle Detection (WMSD) directly mimics manual visual scoring. The three methods are
combined using an AND rule.
In this work, an ARMA model for sleep spindles is used to detect meaningful differences when applied to
spindles from people with pathologies. After a SS is correctly identified and isolated, an ARMA model is
applied to it. Once the ARMA parameters are obtained, a k-means clustering classification algorithm is applied
to the data in order to classify each patient.
This paper was presented at the 2nd International Conference on Computer Science, Information System &
Communication Technologies (ICCSISCT 2013), Sydney, Australia, June 18-19, 2013.
The paper is organized as follows. Section II gives a brief description of sleep spindles and their
characteristics, the detection methodology and the statistical measures used. Section III introduces
the ARMA model and Section IV explains k-means clustering. Section V presents the experimental
results. Finally, Section VI draws some conclusions.
II. SLEEP SPINDLES: DETECTION AND METHODS
A. Sleep Spindles
Sleep spindles are commonly described in the literature as the most interesting hallmark of stage 2
sleep electroencephalograms (EEG) [1]. A sleep spindle is a burst of brain activity visible on an EEG. It
consists of 11-15 Hz waves with a duration between 0.5 s and 2 s in healthy adults and an amplitude of up to 30 μV. A
SS is shown in Fig. 2, starting at 1.45 s and lasting 0.6 s.
The spindle is characterized by progressively increasing, then gradually decreasing amplitude, which
gives the waveform its characteristic name. There is a consensus that SS originate in the thalamus and can
be recorded as potential changes at the cortical surface [3].
Sleep spindles were first described in human EEG by Loomis in 1935, but the first commonly
accepted definition of a sleep spindle was given by Rechtschaffen and Kales [4]:
“The presence of a sleep spindle should not be defined unless it is of at least 0.5sec duration, i.e., one should be
able to count 6 or 7 distinct waves within the half-second period. Because the term “sleep spindle” has been
widely used in sleep research, this term will be retained. The term should be used only to describe activity
between 12 and 14 cps”.
B. The Detection Algorithm
Recent SS detectors are based on methods that include fuzzy logic, neural networks, bandpass filtering,
fast time-frequency transforms, the Fourier transform and the wavelet transform. The majority of the proposed
algorithms are, directly or indirectly, based on amplitude-frequency analysis, thus relying on the spindle
definition and mimicking visual analysis [5].
Figure 1. The detection algorithm.
In this work, SS are detected using a combination of Wavelet Transform (WT), Short Time Fourier
Transform (STFT) and Wave Morphology for Spindle Detection (WMSD) algorithms. A vector with the same
length as the sampled signal is used to characterize it: the vector marks each point of the sampled signal
as belonging to a SS or not. The combined result is then computed, i.e., a point is considered to belong to a SS
only if it is marked as SS by the WT, STFT and WMSD algorithms. Finally, runs of consecutive points marked
as SS that last less than 0.5 seconds are considered non-spindle. The method is
summarized in Fig. 1.
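The combination rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `combine_detections` and its arguments are hypothetical, and the three input masks are assumed to be per-sample boolean vectors already produced by the WT, STFT and WMSD detectors.

```python
import numpy as np

def combine_detections(wt_mask, stft_mask, wmsd_mask, fs, min_duration=0.5):
    """AND three per-sample masks, then discard runs shorter than min_duration (s)."""
    combined = (np.asarray(wt_mask, bool)
                & np.asarray(stft_mask, bool)
                & np.asarray(wmsd_mask, bool))
    min_samples = int(round(min_duration * fs))
    # Pad with False so that every run has both a rising and a falling edge.
    padded = np.concatenate(([False], combined, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = edges[0::2], edges[1::2]      # each run is combined[start:end]
    for start, end in zip(starts, ends):
        if end - start < min_samples:
            combined[start:end] = False          # too short to be a spindle
    return combined
```

With a sampling rate of 256 Hz, as used later in the paper, the 0.5 s rule corresponds to runs of at least 128 consecutive samples.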
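The STFT is commonly used in signal processing [6]. In its standard form, the STFT of a discrete signal x(m), computed with an analysis window g(m), is defined as:
$$X(n,\omega) = \sum_{m=-\infty}^{\infty} x(m)\, g(m-n)\, e^{-j\omega m}$$
The magnitude squared of the STFT yields the spectrogram of the signal:
$$S(n,\omega) = \left| X(n,\omega) \right|^{2}$$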
SS detection is based on the spectrogram: a SS is detected when a peak is found in the 11-15 Hz
range. An example of SS detection using the STFT and the corresponding spectrogram can be seen in Fig. 2. The
spectrogram clearly shows a peak (t = 1.4-2.0 s and f ≈ 12 Hz) corresponding to a SS.
Figure 2. SS detection using STFT, WT and WMSD algorithms.
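As a rough illustration of spectrogram-based detection, the sketch below computes a magnitude-squared STFT with NumPy and flags the frames whose spectral peak falls in the 11-15 Hz band. The Hann window, the window length, the hop size and the `ratio` criterion are all assumptions made for this example; the paper does not give these settings, and its own threshold is defined differently (see Section V).

```python
import numpy as np

def spectrogram(x, fs, win_len, hop):
    """Magnitude-squared STFT with a Hann window; returns (times, freqs, power)."""
    x = np.asarray(x, float)
    window = np.hanning(win_len)
    starts = np.arange(0, len(x) - win_len + 1, hop)
    frames = np.stack([x[s:s + win_len] * window for s in starts])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    times = (starts + win_len / 2) / fs          # frame centres in seconds
    return times, freqs, power

def spindle_band_frames(x, fs, band=(11.0, 15.0), win_len=128, hop=16, ratio=0.5):
    """Flag frames whose spectral peak lies in `band` and dominates the spectrum."""
    times, freqs, power = spectrogram(x, fs, win_len, hop)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    peak_in_band = in_band[np.argmax(power, axis=1)]
    # Hypothetical criterion: the band must hold more than `ratio` of the frame power.
    band_fraction = power[:, in_band].sum(axis=1) / power.sum(axis=1)
    return times, peak_in_band & (band_fraction > ratio)
```

The function returns one flag per frame; mapping the flags back to a per-sample vector (for the AND combination) would require expanding each frame to the samples it covers.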
SS detection with the WT employs the continuous wavelet transform of the EEG signal x(t),
defined as:
$$W(a,b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} x(t)\, \psi^{*}\!\left(\frac{t-b}{a}\right) dt$$
where ψ(t) is called the ‘mother wavelet’, the asterisk denotes complex conjugation, and a and b are the scale
and translation parameters, respectively [7]. The corresponding normalized wavelet power is defined by:
$$P(a,b) = \frac{\left| W(a,b) \right|^{2}}{\sigma^{2}}$$
where σ is the standard deviation of the EEG segment used. The complex Morlet wavelet was used. SS are detected when
the normalized wavelet power is above a certain threshold. In Fig. 2 a SS is detected using the normalized wavelet
power.
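The normalized wavelet power can be approximated numerically by convolving the signal with a sampled complex Morlet wavelet. The sketch below is only an illustration under stated assumptions: the number of wavelet cycles, the set of centre frequencies, the unit-energy normalization and the threshold value are hypothetical choices, not the settings used by the authors.

```python
import numpy as np

def morlet_wavelet(freq, fs, n_cycles=7.0):
    """Unit-energy complex Morlet wavelet sampled at fs, truncated at +/- 4 sigma."""
    sigma_t = n_cycles / (2.0 * np.pi * freq)    # Gaussian width in seconds
    half = int(4 * sigma_t * fs)
    t = np.arange(-half, half + 1) / fs          # symmetric time axis around 0
    wavelet = np.exp(2j * np.pi * freq * t) * np.exp(-t ** 2 / (2 * sigma_t ** 2))
    return wavelet / np.sqrt(np.sum(np.abs(wavelet) ** 2))

def normalized_wavelet_power(x, fs, freq):
    """|W|^2 / sigma^2 at one centre frequency, one value per sample of x."""
    x = np.asarray(x, float)
    # For this symmetric wavelet, convolving with psi equals correlating with psi*.
    coeffs = np.convolve(x, morlet_wavelet(freq, fs), mode="same")
    return np.abs(coeffs) ** 2 / np.var(x)

def wavelet_spindle_mask(x, fs, freqs=(11.0, 12.0, 13.0, 14.0, 15.0), threshold=10.0):
    """Per-sample mask: True where the largest in-band power exceeds the threshold."""
    power = np.max([normalized_wavelet_power(x, fs, f) for f in freqs], axis=0)
    return power > threshold
```

Because the output has one value per sample, the resulting mask can be fed directly into a per-sample combination with the other two detectors.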
The WMSD algorithm used in this paper is based on the definition of a sleep spindle by Rechtschaffen
and Kales [4]. The whole process mimics the visual detection mechanism; the algorithm was first
published by the authors in [8]. The implemented algorithm consists of:
a) Detection of peaks in the signal (maxima and minima), based on a defined threshold, thus eliminating small
peaks (in Fig. 2 they are marked with a “•”);
b) Determination of extreme to extreme time distance and conversion to frequency:
c) Verification that the determined frequencies lie in the SS range (11-15 Hz) (peaks satisfying this condition are
marked with an “*” in Fig. 2);
d) If there are more than 12 consecutive peaks (6 maxima and 6 minima) in the SS frequency band, a spindle is
marked (peaks satisfying this condition are marked with “□” in Fig. 2).
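Steps a) to d) can be sketched as follows. Several details are assumptions made for this example and may differ from the published algorithm [8]: small peaks are rejected with an absolute amplitude threshold, the distance between neighbouring extrema is treated as half a period (so f = fs / (2 Δn) for a distance of Δn samples), and a run of at least 12 qualifying peaks is required. The function name `wmsd_mask` is hypothetical.

```python
import numpy as np

def wmsd_mask(x, fs, amp_threshold, band=(11.0, 15.0), min_peaks=12):
    """Per-sample spindle mask from wave morphology (illustrative sketch)."""
    x = np.asarray(x, float)
    mask = np.zeros(len(x), bool)
    d = np.diff(x)
    # (a) local extrema are sign changes of the first difference; drop small ones
    idx = np.flatnonzero(d[:-1] * d[1:] < 0) + 1
    idx = idx[np.abs(x[idx]) >= amp_threshold]
    if len(idx) < 2:
        return mask
    # (b) neighbouring extrema are assumed to be half a period apart
    freq = fs / (2.0 * np.diff(idx))
    # (c) keep only the intervals whose frequency lies in the spindle band
    ok = (freq >= band[0]) & (freq <= band[1])
    # (d) mark runs that contain enough consecutive in-band peaks
    run_start = None
    for i, good in enumerate(np.append(ok, False)):
        if good and run_start is None:
            run_start = i
        elif not good and run_start is not None:
            n_peaks = i - run_start + 1          # n intervals join n + 1 peaks
            if n_peaks >= min_peaks:
                mask[idx[run_start]:idx[i] + 1] = True
            run_start = None
    return mask
```

Note that with integer sample distances the frequency estimate is coarse: at 256 Hz a half-period of 9 or 10 samples maps to about 14.2 Hz or 12.8 Hz, so spindles near the band edges may need interpolation of the extrema positions.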
C. Statistical Measures and Algorithm Performance
In order to assess the validity of the results, the algorithm was applied to the data and its output was compared
with the visually scored signal. True positive (TP), false positive (FP), true negative
(TN) and false negative (FN) events were counted.
Figure 3. Statistical measures; TP: true positive, TN: true negative, FP: false positive and FN: false negative
regions. Here the discrepancy was exaggerated for demonstrative purposes.
A TP result is counted when a sample is scored as a spindle by both the automatic method and the
expert. A TN result is counted when both agree that no spindle is present. If the
automatic result indicates the presence of a spindle where there was no visual scoring of one, a FP result is
counted. Conversely, if the output indicates no spindle where the expert scored one, a FN result is
counted (Fig. 3).
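Sensitivity, specificity and accuracy are defined as:
$$\text{Sensitivity} = \frac{TP}{TP + FN}$$
$$\text{Specificity} = \frac{TN}{TN + FP}$$
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$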
In [9] a comparison of threshold choices is presented, based on an EEG signal partly scored by a human expert.
The optimal threshold value can be determined by analyzing the sensitivity versus specificity curve
(Fig. 4). As sensitivity improves towards the top and specificity improves towards the right, the optimal point on
the curve is the point nearest to the top right corner. The best result is obtained with the combination of the 3
algorithms (black line with “*” marks), with sensitivity and specificity both around 94%.
Figure 4. Sensitivity x Specificity curves for the implemented algorithms.
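The sample-wise measures and the choice of the operating point nearest to the top right corner can be sketched as below. The function names are hypothetical, and the sketch assumes that both the automatic output and the expert scoring are available as per-sample boolean vectors containing at least one spindle sample and one non-spindle sample (otherwise a division by zero occurs).

```python
import numpy as np

def confusion_counts(detected, scored):
    """Sample-wise TP, TN, FP, FN between automatic and expert scoring."""
    detected = np.asarray(detected, bool)
    scored = np.asarray(scored, bool)
    tp = int(np.sum(detected & scored))
    tn = int(np.sum(~detected & ~scored))
    fp = int(np.sum(detected & ~scored))
    fn = int(np.sum(~detected & scored))
    return tp, tn, fp, fn

def performance(detected, scored):
    """Return (sensitivity, specificity, accuracy)."""
    tp, tn, fp, fn = confusion_counts(detected, scored)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy

def best_operating_point(points):
    """points: iterable of (threshold, sensitivity, specificity).

    Returns the entry closest to the ideal corner (sensitivity = specificity = 1).
    """
    return min(points, key=lambda p: (1 - p[1]) ** 2 + (1 - p[2]) ** 2)
```

Sweeping a detector threshold, calling `performance` for each value and passing the resulting triples to `best_operating_point` reproduces the selection rule described for Fig. 4.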
III. ARMA MODEL
In signal processing, autoregressive moving average (ARMA) models are typically applied to
correlated time series data. A given time series can be considered as the output of an ARMA system driven by
white noise. The ARMA model is a tool for understanding and, whenever necessary, predicting future values of a
time series. The model consists of two parts, an autoregressive (AR) part and a moving average (MA) part. The
model is usually referred to as ARMA(p,q), where p is the order of the autoregressive part and q is the order of
the moving average part.
Compared with pure MA or AR models, ARMA models are more suitable for describing the
characteristics of a given process with a minimum number of parameters, since they use both poles and zeros rather than
just poles or zeros [10].
As mentioned above, a stationary ARMA process of order (p,q) is considered as the output of a linear
time-invariant (LTI) digital filter driven by white noise. The transfer function of the system is given by:
$$H(z) = \frac{B(z)}{A(z)} = \frac{\sum_{k=0}^{q} b_k z^{-k}}{\sum_{k=0}^{p} a_k z^{-k}}$$
with a0=1. The process corresponding to this model satisfies the difference equation:
$$\sum_{k=0}^{p} a_k\, x(n-k) = \sum_{k=0}^{q} b_k\, w(n-k)$$
where w(n) is the input sequence, a zero-mean white noise, and x(n) is the output sequence. The main task in the
modeling can be formulated as:
Given a segment of a time series, x(n), n=0,1,2,…,L-1, estimate the p+q+1 ARMA parameters.
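The “armax.m” command from the “System Identification Toolbox” in Matlab was used to perform
the ARMA modelling, thus obtaining the A and C parameters of the equation:
$$A(z^{-1})\, x(n) = C(z^{-1})\, w(n)$$
where A and C are polynomials in the delay operator z^{-1}, corresponding to the denominator and numerator of H(z).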
Figure 5. Pole-Zero Map of a SS and corresponding ARMA model.
In Fig. 5 the pole-zero map of an identified SS is presented together with its ARMA(5,1) model. An
ARMA(5,1) model was chosen because it gave the best results in [11].
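To make the model concrete, the sketch below simulates an ARMA process from its difference equation, estimates the coefficients, and computes the poles and zeros used in a pole-zero map. It does not reproduce Matlab's armax: as a stand-in it uses a simple two-stage least-squares scheme (usually attributed to Hannan and Rissanen), with b0 fixed to 1 and p, q ≥ 1. All function names and the long-AR order are assumptions made for this illustration.

```python
import numpy as np

def simulate_arma(a, b, n, rng):
    """Generate x(n) from sum_k a[k] x(n-k) = sum_k b[k] w(n-k), with a[0] = 1."""
    p, q = len(a) - 1, len(b) - 1
    w = rng.standard_normal(n)                   # zero-mean white noise input
    x = np.zeros(n)
    for i in range(n):
        ma = sum(b[k] * w[i - k] for k in range(q + 1) if i - k >= 0)
        ar = sum(a[k] * x[i - k] for k in range(1, p + 1) if i - k >= 0)
        x[i] = ma - ar
    return x

def lagged(v, lags, start):
    """Columns v(n-1), ..., v(n-lags) for n = start, ..., len(v) - 1."""
    return np.column_stack([v[start - k:len(v) - k] for k in range(1, lags + 1)])

def fit_arma(x, p, q, long_ar=20):
    """Two-stage least-squares ARMA(p,q) estimate; returns monic a and b."""
    x = np.asarray(x, float) - np.mean(x)
    # Stage 1: a long AR fit provides an estimate of the unobserved white noise.
    past = lagged(x, long_ar, long_ar)
    phi, *_ = np.linalg.lstsq(past, x[long_ar:], rcond=None)
    e = np.zeros(len(x))
    e[long_ar:] = x[long_ar:] - past @ phi
    # Stage 2: regress x(n) on its own past and on the past noise estimates.
    start = long_ar + q
    design = np.hstack([-lagged(x, p, start), lagged(e, q, start)])
    theta, *_ = np.linalg.lstsq(design, x[start:], rcond=None)
    a = np.concatenate(([1.0], theta[:p]))
    b = np.concatenate(([1.0], theta[p:]))
    return a, b

def poles_and_zeros(a, b):
    """Non-trivial poles (roots of A) and zeros (roots of B) of H(z)."""
    return np.roots(a), np.roots(b)
```

For an isolated spindle segment `x`, a call such as `fit_arma(x, p=5, q=1)` would give coefficient vectors of the same sizes as the ARMA(5,1) model used in the paper, although the numerical values need not match those produced by armax.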
IV. K-MEANS CLUSTERING
In data mining, k-means clustering is a method of cluster analysis which aims to partition n
observations into k clusters in which each observation belongs to the cluster with the nearest mean [12].
Given a set of observations (x1, x2, …, xn), where each observation is a d-dimensional real vector, k-means
clustering aims to partition the n observations into k sets (k ≤ n) S = {S1, S2, …, Sk} so as to minimize the
within-cluster sum of squares:
$$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x_j \in S_i} \left\| x_j - \mu_i \right\|^{2}$$
where μi is the mean of the points in Si.
The function “kMeansCluster.m” by Kardi Teknomo was used. For a complete description of the algorithm please refer
to [13].
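For illustration, the standard iterative procedure for this minimization (Lloyd's algorithm) can be written compactly with NumPy. This is a generic sketch, not a port of kMeansCluster.m; the random initialization from the data points, the seed and the iteration limit are assumptions.

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm; returns (labels, centroids, within-cluster sum of squares)."""
    points = np.asarray(points, float)
    rng = np.random.default_rng(seed)
    # Initialise the centroids with k distinct observations.
    centroids = points[rng.choice(len(points), size=k, replace=False)]

    def assign(c):
        dists = np.linalg.norm(points[:, None, :] - c[None, :, :], axis=2)
        return np.argmin(dists, axis=1)          # index of the nearest centroid

    for _ in range(n_iter):
        labels = assign(centroids)
        # Recompute each mean; an empty cluster keeps its previous centroid.
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    labels = assign(centroids)
    wcss = float(np.sum((points - centroids[labels]) ** 2))
    return labels, centroids, wcss
```

Since the result can depend on the initial centroids, running the procedure with several seeds and keeping the solution with the smallest within-cluster sum of squares is a common precaution.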
V. EXPERIMENTAL RESULTS
This study makes use of a representative sample of human sleep recordings, obtained from 19 volunteers, males
and females aged between 35 and 87 years. All polysomnograms were obtained with a Nicolet
EEG1A97 18-channel polygraph at a sampling rate of 256 Hz. Of the group, 8 subjects were completely
healthy and the remaining had one or more disorders, namely: REM sleep behavior disorder (RBD), periodic
limb movement disorder (PLMD), insomnia or epilepsy.
The patients’ disorders are:
• Rapid eye movement sleep behavior disorder (RBD): a sleep disorder that involves abnormal behavior
during the sleep phase with rapid eye movement (REM sleep). It is characterized by the dreamer acting out
dreams. These dreams often involve kicking, screaming, punching, grabbing and even jumping out of
bed;
• Periodic limb movement disorder (PLMD): a sleep disorder where the patient moves limbs involuntarily
during sleep, and has symptoms or problems related to the movement;
• Insomnia (or sleeplessness): a well-known sleep disorder in which there is an inability to fall asleep or to
stay asleep as long as desired;
• Epilepsy: a common and diverse set of chronic neurological disorders characterized by seizures.
The signals were unclassified and the whole-night signal of the C3-A2 channel was used. A total of 14130 SS
were detected by the algorithm.
The detection methods were applied with a combination of threshold parameters for the STFT, WMSD
and WT algorithms. In the STFT case, the threshold value corresponds to the cumulative value of the peaks in
the spectrogram. In the WMSD algorithm, a point is considered a maximum peak if it has the maximal value
and is preceded (to the left) by a value lower than the defined threshold. In the WT case, the normalized wavelet
power amplitude is used as the threshold.
K-means clustering was applied to the arithmetic means of the coefficients of the ARMA
transfer functions. The number of clusters was set to N=2 (k=2 in the notation of Section IV), in order to
determine whether or not a patient has any pathology. It was therefore expected that patients with any kind of
pathology would all lie in one group, whereas patients with no disorders would lie in the other.
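The per-patient feature used for clustering can be assembled as in the sketch below, which averages the ARMA coefficients over all the spindles of each patient and stacks the results into one row per patient. The input layout (a dictionary from patient identifier to an array with one row of coefficients per spindle) and the function name are hypothetical; for an ARMA(5,1) model with monic polynomials each row would hold six coefficients.

```python
import numpy as np

def feature_matrix(spindle_coeffs):
    """Average the ARMA coefficients of each patient's spindles.

    spindle_coeffs maps a patient id to an array of shape
    (number of spindles, number of coefficients).
    Returns the sorted patient ids and a matrix with one mean vector per patient.
    """
    ids = sorted(spindle_coeffs)
    rows = [np.mean(np.asarray(spindle_coeffs[pid], float), axis=0) for pid in ids]
    return ids, np.vstack(rows)
```

The rows of the returned matrix are the observations that a two-cluster k-means run would then separate into the two groups.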
The majority of the results were as expected (Table I). Discrepancies occurred for two patients (marked in
red):
• Pat11, suffering from RBD and epilepsy, was classified in group 1 (healthy patients);
• Pat23, suffering from PLMD and insomnia, was classified in group 1 (healthy patients).
TABLE I. PATIENTS' DISORDERS AND RESULTS FROM K-MEANS CLUSTERING
VI. CONCLUSION
In this work, sleep spindles that were automatically scored using a triple detection algorithm
were modelled using an ARMA(5,1) model. After the model was applied to the signals, k-means clustering was
used to distinguish patients with any sleep-related disorder from healthy subjects. The ARMA model proved to
be useful for distinguishing sleep spindles of patients suffering from sleep-related disorders. Further modelling
needs to be carried out in order to correctly distinguish between different pathologies. K-means clustering provided a
powerful tool for patient differentiation.
ACKNOWLEDGMENT
The authors would like to acknowledge the sleep laboratory CENC (Centro de Electroencefalografia
e Neurofisiologia Clínica) for providing the data used for this work. This work was funded by EIM Training
(Queensland, Australia) and by Portuguese National Funds through the FCT (Foundation for Science and
Technology) under the project PEst-OE/EEI/UI0066/2011.
REFERENCES
[1] L. De Gennaro and M. Ferrara, “Sleep spindles: an overview”, Sleep Med Rev 7:423–40, 2003.
[2] J.C. Costa, M.D. Ortigueira and A. Batista, “ARMA Modelling of Sleep Spindles”, Proceedings of the
Doctoral Conference on Computing, Electrical and Industrial Systems, DoCEIS'11 - IFIP AICT 349, pp
341-348, 2011.
[3] M. Steriade, E.G. Jones and R.R. Llinás, “Thalamic Oscillations and Signaling”, Neuroscience Institute
Publications. New York: John Wiley & Sons, 1990.
[4] A. Rechtschaffen and A. Kales, “A manual of standardised terminology, techniques and scoring system for
sleep stages of human subjects”, Washington, DC: Public Health Service, U.S. Government Printing Office;
1968.
[5] A. Nonclercq, C. Urbain, D. Verheulpen, C. Decaestecker, P. Van Bogaert and P. Peigneux, “Sleep spindle
detection through amplitude-frequency normal modelling”, Journal of Neuroscience Methods, 2010.
[6] J. Proakis and D. Manolakis, “Digital Signal Processing”, 4th Ed., Prentice-Hall, 2006.
[7] I. Omerhodzic, S. Avdakovic, A. Nuhanovic, K. Dizdarevic and K. Rotim, “Energy Distribution of EEG
Signal Components by Wavelet Transform”, pp 45-60, InTech publishing, 2012.
[8] J.C. Costa, M.D. Ortigueira, A. Batista and T. Paiva, “An Automatic Sleep Spindle detector based on WT,
STFT and WMSD”, International Journal of the World Academy of Science, Engineering and Technology,
issue 68, pp 1298-1301, 2012.
[9] J.C. Costa, M.D. Ortigueira, A. Batista and T. Paiva, “Threshold choice for automatic spindle detection”.
Proc. IWSSIP2012; 2012
[10] A. Kizilkaya and A. H. Kayran, “ARMA model parameter estimation based on the equivalent MA
approach”. Digital Signal Processing, Vol 16, Issue 6, 2006.
[11] J.C. Costa, M.D. Ortigueira, A. Batista and T. Paiva. “ARMA Modelling of Sleep Spindles”, Proceedings
of the Doctoral Conference on Computing, Electrical and Industrial Systems, DoCEIS'11 - IFIP AICT 349,
pp 341-348, 2011.
[12] K-means clustering. (2012, August 2). In Wikipedia, The Free Encyclopedia. Retrieved 16:50, December
15, 2013, from http://en.wikipedia.org/w/index.php?title=K-means_clustering&oldid=505438129
[13] K. Teknomo. (2013, March 12), in http://people.revoledu.c