Anson Bastos, et al International Journal of Computer and Electronics Research [Volume 5, Issue 2, April 2016]
A REVIEW OF SPEECH ASSISTIVE
TECHNOLOGIES USING EOG
Anson Bastos
B.Tech, Electrical Department,
VJTI, Matunga
Siddharth Alhat
B.Tech, Electrical Department,
VJTI, Matunga
Pravin Dhurde
B.Tech, Electrical Department,
VJTI, Matunga
ansonbastos@gmail.com
sidharthalhat@gmail.com
Pravindhurde123@gmail.com
Abhishek Suryawanshi
B.Tech, Electrical Department,
VJTI, Matunga
Shubham Kathalkar
B.Tech, Electrical Department,
VJTI, Matunga
Dr.Prof.M.S. Panse
Dean Faculty,VJTI,Matunga
suryaabhi18@gmail.com
Kathalkarshubham@gmail.com
Abstract−Assistive technology is defined as any device
or item that can be used to increase, maintain or
improve the capabilities of individuals with disabilities
(IDEA, 1990). In this paper the aim is to review speech
assistive technologies devised for the people suffering
from motor neuron disorders using the principle of
electrooculography (EOG). This study would help people
working in this area of research to gain a better
understanding of the work that has been done and
promote further advances in the field.
mspanse@vjti.org
2. ELECTROOCULOGRAPHY
Electrooculography (EOG) tracks eye movement by recording
the potential between the cornea and the retina, as
illustrated by the dipole model in Figure 1. The EOG
amplitude varies from 0.05 to 3.5 mV in humans.
Keywords: Electrooculography, Electrooculogram
(EOG), Signal Processing, User Interface
1. INTRODUCTION
A motor neuron disease (MND) is any of the five
neurological disorders that selectively affect motor neurons,
the cells that control voluntary muscles of the body. These
five conditions are amyotrophic lateral sclerosis, primary
lateral sclerosis, progressive muscular atrophy, progressive
bulbar palsy and pseudobulbar palsy. They are
neurodegenerative in nature and cause increasing disability
and, eventually, death [1]. MND can affect an adult at any age,
but most people diagnosed with the disease are over the age
of 40, with the highest incidence occurring between the
ages of 50 and 70. The incidence or number of people who
will develop MND each year is about two people in every
100,000. The prevalence or number of people living with
MND at any one time is approximately seven in every
100,000 [2].
People with this disease lose control over speech and
other voluntary muscle movements. However, their eyes
remain unaffected for a very long period of time. This
makes eye-controlled speech assistive technologies a viable
solution.
The paper first describes briefly the origin of EOG
signals from the eye and then compares various EOG based
systems on four different parameters. These parameters are
the placement of EOG electrodes, the conditioning of the
EOG signals, the processing of these signals and finally the
interface for typing.
©http://ijcer.org
e- ISSN: 2278-5795
Figure 1: Dipole Model of the eye
The origin of EOG signals is believed to lie in the fact that
the photoreceptor cells are more negatively charged than
the pigment epithelium (which is at the same potential as
the cornea) in which they are embedded. This gives rise to
a standing (resting) potential. When the eyes are moved, the
depolarization (membrane potential becomes less negative)
and hyperpolarization (membrane potential becomes more
negative) of these photoreceptors cause the potential to
vary, and we get the electrooculogram signals. The amplitude
is larger under lit conditions than in the dark.
3. PLACEMENT OF ELECTRODES
All of the reviewed papers use Ag/AgCl electrodes. Because
AgCl is only a slightly soluble salt, it quickly saturates
and comes to equilibrium. Therefore, Ag is a good metal for
metallic skin-surface electrodes [3].
Chaudhuri et al. [11] (2012) give the placement of
electrodes as follows: the bipolar EOG electrodes were
placed on the distal ends of the forehead, beside the corner
of the eye, and the ground and reference electrodes were
placed on the middle of the forehead of the subjects. They
thus use a two-channel EOG configuration.
p- ISSN: 2320-9348
Similarly, Aungsakun et al. [12] (2012) use vertical-channel
electrodes placed above and below the right eye and
horizontal-channel electrodes placed to the right and left
of the outer canthi. Additionally, a reference electrode (G)
was placed on the forehead. They have made use of a single
electrode for the vertical channel.
The same arrangement is followed by Desai [10] (2013),
Nathan [9] (2012), Soltani [8] (2013) and Swami [4] (2014),
except that Nathan [9] places the reference electrode on the
mastoid. Usakli et al. [7] (2010) state that, in general, EOG
signal acquisition systems place a reference/ground
electrode on the forehead, electrodes on the right and left
temples for horizontal (lateral) eye movement detection, and
electrodes above and below an eye for vertical eye movement
detection. Zhang [6] (2015) uses single-channel electrodes
by means of the NeuroSky MindWave headset.
4. SIGNAL CONDITIONING
Chaudhuri [11] has made use of a 0.4-30 Hz band pass
filter and a sampling rate of 256 Hz. Desai [10] uses an
instrumentation amplifier at the initial stage and then
makes use of a 4th order high pass filter (0.1 Hz) and a 4th
order low pass filter (40 Hz).
Swami [4] makes use of a MATLAB-based character recognition
engine. It compares the EOG signal to a template and
determines the letter traced by the eye.
Figure 3: Plots of the EOG signal and the MATLAB-generated
and estimated character when the subject is writing the
English letter ‘B’ by rotating the eyes (Swami [4]).
Usakli [7] uses the k-nearest neighbours (kNN) algorithm
with the Euclidean distance as the metric.
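A minimal sketch of kNN classification with a Euclidean metric is given here for illustration. It is not the implementation from [7]: the feature vectors (horizontal and vertical peak amplitudes in microvolts), the training samples, the labels and the choice k = 3 are all hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(sample, training, k=3):
    """Label `sample` by majority vote among its k nearest training points.

    `training` is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(training, key=lambda item: euclidean(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training set: (horizontal, vertical) peak amplitudes in uV.
TRAINING = [
    ((200.0, 5.0), "right"), ((180.0, -10.0), "right"),
    ((-190.0, 0.0), "left"), ((-210.0, 15.0), "left"),
    ((5.0, 220.0), "up"), ((-10.0, 190.0), "up"),
    ((0.0, -200.0), "down"), ((12.0, -185.0), "down"),
]

if __name__ == "__main__":
    print(knn_classify((170.0, 20.0), TRAINING))
```

With so few training points per class, k should stay small; a larger k would let distant classes outvote the true neighbours.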
Wu [5] processes the signal in three stages: a 5-point
moving average filter to remove power line noise, feature
extraction, and a classifier. Feature extraction uses two
thresholds, Th1 and Th2. When the signal level is above Th2
the value 2 is assigned; when it is between Th1 and Th2 the
value is 1; when it is between -Th1 and Th1 the value is 0;
and so on for the negative levels. There are thus five
values: 2, 1, 0, -1 and -2. Each movement (feature) has a
set of values, and the horizontal and vertical sets are
passed to the classifier to detect the eye movement.
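The first two stages described for Wu [5] can be sketched as follows. This is an illustrative reading of the description above, not the code of [5]: the function names, the causal form of the moving average, the handling of samples that fall exactly on a threshold, and the threshold values in the example are assumptions.

```python
def moving_average(signal, window=5):
    """Causal moving average; the first window-1 outputs average fewer samples."""
    out = []
    running = 0.0
    for i, x in enumerate(signal):
        running += x
        if i >= window:
            running -= signal[i - window]  # drop the sample leaving the window
        out.append(running / min(i + 1, window))
    return out


def quantize(level, th1, th2):
    """Map a signal level to one of the five values 2, 1, 0, -1, -2.

    Assumes 0 < th1 < th2. Boundary samples are assigned to the
    lower-magnitude value (an assumption, not stated in the review).
    """
    if level > th2:
        return 2
    if level > th1:
        return 1
    if level >= -th1:
        return 0
    if level >= -th2:
        return -1
    return -2


def encode(signal, th1, th2):
    """Smooth one EOG channel, then quantize every sample."""
    return [quantize(x, th1, th2) for x in moving_average(signal)]


if __name__ == "__main__":
    # Hypothetical thresholds in microvolts.
    print(encode([0, 0, 150, 150, 150, 150, 150, 0, 0], th1=50, th2=100))
```

The horizontal and vertical channels would each be encoded this way, and the two resulting value sequences handed to the classifier.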
Figure 2: Acquisition block diagram and electrode position
by Desai et al.[10]
Aungsakun [12] makes use of an amplifier with a gain of
19.5 and a band pass filter (1 to 500 Hz) with a sampling
rate of 128 Hz.
Nathan [9] employs a 2-30 Hz BPF and a sampling rate
of 128 Hz. Soltani [8] samples the data at a rate of 240 Hz.
Usakli [7] employs a gain of 7000 in three stages, a
5th-order 30 Hz LPF and a 50 Hz notch filter; to remove DC
drift, a summing amplifier is used instead of a high pass
filter, since an HPF attenuates the signal as well.
Wu [5] (2013) uses a gain of 5000 in two stages of 5 and
1000, a 0.1 Hz HPF, a 62.5 Hz LPF and a sampling rate of
256 Hz.
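As a quick consistency check on the figures quoted in this section, the upper cutoff of each filter can be compared with the Nyquist frequency (half the sampling rate). The sketch below covers only the systems for which this review quotes both numbers; the helper names are hypothetical, and the sampling rate given for Aungsakun [12] is assumed to be in Hz.

```python
# (system, upper cutoff in Hz, sampling rate in Hz), as quoted in this review.
SYSTEMS = [
    ("Chaudhuri [11]", 30.0, 256.0),
    ("Aungsakun [12]", 500.0, 128.0),  # sampling-rate unit assumed to be Hz
    ("Nathan [9]", 30.0, 128.0),
    ("Wu [5]", 62.5, 256.0),
]


def nyquist(fs):
    """Highest frequency representable at sampling rate fs."""
    return fs / 2.0


def passband_fits(cutoff, fs):
    """True when the filter's upper cutoff lies at or below Nyquist."""
    return cutoff <= nyquist(fs)


def check_all(systems):
    """Return {system name: passband_fits result}."""
    return {name: passband_fits(fc, fs) for name, fc, fs in systems}


if __name__ == "__main__":
    for name, ok in check_all(SYSTEMS).items():
        print(f"{name}: {'ok' if ok else 'cutoff above Nyquist'}")
```

For the Aungsakun [12] entry, the quoted 500 Hz upper cutoff lies above the 64 Hz Nyquist frequency of a 128 Hz sampling rate. The two figures may refer to different stages of that system, which this review does not detail.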
5. SIGNAL PROCESSING
Aungsakun [12] applies separate thresholds to the upper and
lower peaks and checks how long the signal stays beyond the
threshold, so that short spikes due to noise are rejected.
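Threshold detection with a minimum-duration check can be sketched as below. This is a generic illustration rather than the method of any one paper: the symmetric threshold, the function name, and the numeric values in the example (threshold, sampling rate, minimum duration) are assumptions.

```python
def detect_events(signal, fs, threshold, min_duration):
    """Find runs where the signal stays beyond +/-threshold long enough.

    Returns a list of (label, start_index, end_index) tuples, where label
    is "upper" or "lower". Runs shorter than min_duration seconds are
    treated as noise spikes and dropped. Assumes threshold > 0.
    """
    min_samples = max(1, round(min_duration * fs))
    events = []
    run_label, run_start = None, 0
    # A trailing 0.0 sentinel closes any run that reaches the last sample.
    for i, x in enumerate(list(signal) + [0.0]):
        if x > threshold:
            label = "upper"
        elif x < -threshold:
            label = "lower"
        else:
            label = None
        if label != run_label:
            if run_label is not None and i - run_start >= min_samples:
                events.append((run_label, run_start, i - 1))
            run_label, run_start = label, i
    return events


if __name__ == "__main__":
    # One-sample spike, then a sustained upper pulse and a lower pulse.
    sig = [0] * 5 + [80] + [0] * 5 + [90] * 10 + [0] * 5 + [-70] * 8 + [0] * 3
    print(detect_events(sig, fs=128, threshold=50, min_duration=0.05))
```

In the example, the isolated sample at index 5 crosses the threshold but is discarded because it lasts less than the minimum duration.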
Chaudhuri [11] normalises the data by subtracting the
mean over N samples and applies a threshold. Here too, the
duration of the pulses is checked to reject noise. Nathan
[9] also makes use of thresholds (50 µV and -50 µV) and
checks for the peak to reverse (i.e. crest to trough or vice
versa) after an interval of 0.078 s. Soltani [8] takes the
derivative of the signal and uses an adaptive threshold to
detect horizontal and/or vertical movements. Swami [4] uses
template-based character recognition (see Figure 3).
Figure 4: A flow chart of the proposed EOG signal
classification method by Wu [5].
Zhang [6] makes use of wavelet filtering followed by
feature extraction and classification of the signals.
6. USER INTERFACE
Desai [10] describes a quick typing interface based on
multi-directional eye movements: left, right, up, down,
up left, up right, down left and down right. The screen is
numbered for the user’s reference. In the centre there is a
move box which tells the user when to move their eyes.
The characters are arranged in a 6x8 matrix. By selecting
the row and column number in two moves, the user tells
the system which character to print. This is a very fast
system, but its disadvantage is the added complexity of a
multi-channel EOG system.
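The two-move selection can be illustrated with a short sketch. The numbering of the eight directions, the character layout, and the assumption that the first move picks the row and the second the column are all hypothetical; the review does not give the actual assignment used in [10].

```python
# Hypothetical numbering of the eight eye-movement directions (0..7).
DIRECTIONS = [
    "up", "up right", "right", "down right",
    "down", "down left", "left", "up left",
]

ROWS, COLS = 6, 8
# Hypothetical 48-character layout, filled row by row.
CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 .,?!'-:;()@"
assert len(CHARS) == ROWS * COLS


def select_character(row_move, col_move):
    """Return the character addressed by two successive eye movements."""
    row = DIRECTIONS.index(row_move)
    col = DIRECTIONS.index(col_move)
    if row >= ROWS:
        # Only six of the eight directions address a row in a 6x8 matrix.
        raise ValueError(f"{row_move!r} does not address a row")
    return CHARS[row * COLS + col]


if __name__ == "__main__":
    print(select_character("up right", "right"))
```

Because every character costs exactly two movements, typing time per character is constant, which is what makes this layout fast compared with cursor-based keyboards.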
Figure 5: GUI of different activities used by Desai [10]
Soltani [8] groups the characters into nine groups of 4
characters each, arranged as a 3x3 matrix with the cursor
initially at the centre. To move the cursor to a desired
position, the user moves the eye in the desired direction
within a stipulated time. To select the block, the user
double blinks. A group in turn consists of nine
characters/commands: the 4 characters and the commands
back, delete, space, clear and dot. An entry is selected in
the same way as before. Here again, two iterations are
required to type a character.
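The two-iteration selection on a 3x3 grid can be sketched as follows. The event names, the restriction to four movement directions, the clamping at the grid edges, the alphabet, and the order of entries inside a group are assumptions made for illustration; they are not taken from [8].

```python
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
COMMANDS = ["back", "delete", "space", "clear", "dot"]
# Hypothetical 36-symbol alphabet: nine groups of 4 characters.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"


def pick_cell(events):
    """Move a cursor on a 3x3 grid from the centre until a double blink.

    Returns (cell index 0..8 in row-major order, remaining events).
    """
    row, col = 1, 1
    events = list(events)
    for i, event in enumerate(events):
        if event == "double_blink":
            return row * 3 + col, events[i + 1:]
        d_row, d_col = MOVES[event]
        row = min(2, max(0, row + d_row))  # clamp at the grid edge
        col = min(2, max(0, col + d_col))
    raise ValueError("event stream ended without a double blink")


def type_entry(events):
    """First selection picks a group, second picks an entry within it."""
    group, rest = pick_cell(events)
    entry, _ = pick_cell(rest)
    items = list(ALPHABET[group * 4:group * 4 + 4]) + COMMANDS
    return items[entry]


if __name__ == "__main__":
    print(type_entry(["up", "left", "double_blink",
                      "up", "left", "double_blink"]))
```

Each typed character therefore costs two cursor excursions and two double blinks, regardless of which of the 36 characters is chosen.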
Usakli [7] makes use of a 4x12 virtual keyboard without
grouping, in which left, right, up and down eye movements
control the cursor. A double blink is used to select a
particular character. The speed achieved with this system
is 12 characters per minute.
Figure 7: Virtual keyboards used by Usakli [7]: (a) with
special characters; the message shown at the bottom line was
written in 148 s. (b) P300 speller virtual keyboard; the
last row is added to increase efficiency.
Zhang [6] uses a single control, the double blink. The
virtual keyboard is not grouped, and the cursor moves
through the characters periodically. It starts from the top
left and moves horizontally until a double blink is
detected, after which it stops and moves vertically. On
detection of another double blink, the character under the
cursor is selected and the cursor returns to the top left
of the screen.
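This row-column scanning scheme can be simulated with a small state machine. The sketch below is illustrative only: the keyboard layout is hypothetical (the interface in [6] is based on the Windows 7 on-screen keyboard), and wrapping the cursor around at the end of a row or column is an assumption.

```python
# Hypothetical keyboard layout; every row has the same length.
KEYBOARD = ["ABCDEFGHIJ", "KLMNOPQRST", "UVWXYZ .,?"]


def scan_select(keyboard, ticks):
    """Simulate scanning with a single double-blink control.

    `ticks` holds one boolean per scan step: True when a double blink is
    detected during that step. Returns the string of selected characters.
    """
    rows, cols = len(keyboard), len(keyboard[0])
    row = col = 0
    horizontal = True  # scanning phase: horizontal first, then vertical
    typed = []
    for blink in ticks:
        if blink and horizontal:
            horizontal = False            # freeze the column, scan downwards
        elif blink:
            typed.append(keyboard[row][col])
            row = col = 0                 # cursor returns to the top left
            horizontal = True
        elif horizontal:
            col = (col + 1) % cols        # wrap-around is an assumption
        else:
            row = (row + 1) % rows
    return "".join(typed)


if __name__ == "__main__":
    # Two steps right, blink, one step down, blink.
    print(scan_select(KEYBOARD, [False, False, True, False, True]))
```

Under these assumptions, the number of scan steps for a key at row r and column c is r + c + 2, so characters far from the top left are the slowest to type.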
Figure 8: The user interface based on the on-screen keyboard
in Windows 7 used by Zhang [6]
Figure 6: Grouped keyboard used by Soltani [8]
7. CONCLUSION
The results of this study show that multichannel
electrodes increase the number of signals but also the
complexity. It is ergonomic to use a single-channel
electrode, but at the cost of typing speed. The signal
conditioning hardware mainly makes use of 4th-order
filters, and the system gain is 5000-10000. A simple
algorithm such as thresholding is not robust in the
presence of noise, and further processing such as a
wavelet transform is required. Feature extraction and
classification can also be used to differentiate between
the signals. A multi-channel system increases the
theoretical typing speed, whereas a single-channel one
takes more time but is reliable, cost-effective and
user-friendly.
REFERENCES
[1] en.wikipedia.org/wiki/Motor_neuron_disease
[2] mndassociation.org/what-is-mnd/brief-guide-to-mnd
[3] D. Prutchi and M. Norris, Design and Development of Medical
Electronic Instrumentation. Hoboken, NJ : Wiley, 2005, pp.
5–14.
[4] Swami, P.; Gandhi, T.K., "Assistive communication system for
speech disabled patients based on electro-oculogram character
recognition," in Computing for Sustainable Global
Development (INDIACom), 2014 International Conference on
, vol., no., pp.373-376, 5-7 March 2014
[5] Shang-Lin Wu; Lun-De Liao; Shao-Wei Lu; Wei-Ling Jiang;
Shi-An Chen; Chin-Teng Lin, "Controlling a Human–
Computer Interface System With a Novel Classification
Method that Uses Electrooculography Signals," in Biomedical
Engineering, IEEE Transactions on , vol.60, no.8, pp.2133-2141, Aug. 2013
[6] Ang, A.M.S.; Zhang, Z.G.; Hung, Y.S.; Mak, J.N.F., "A user-friendly wearable single-channel EOG-based human-computer
interface for cursor control," in Neural Engineering (NER),
2015 7th International IEEE/EMBS Conference on , vol., no.,
pp.565-568, 22-24 April 2015
[7] Usakli, A.B.; Gurkan, S., "Design of a Novel Efficient
Human–Computer Interface: An Electrooculagram Based
Virtual Keyboard," in Instrumentation and Measurement, IEEE
Transactions on , vol.59, no.8, pp.2099-2108, Aug. 2010
[8] Soltani, S.; Mahnam, A., "Design of a novel wearable human
computer interface based on electrooculograghy," in Electrical
Engineering (ICEE), 2013 21st Iranian Conference on , vol.,
no., pp.1-5, 14-16 May 2013
[9] Nathan, D.S.; Vinod, A.P.; Thomas, K.P., "An
electrooculogram based assistive communication system with
improved speed and accuracy using multi-directional eye
movements," in Telecommunications and Signal Processing
(TSP), 2012 35th International Conference on , vol., no.,
pp.554-558, 3-4 July 2012
[10] Yash, S.D.,”Natural Eye Movement & its application for
paralyzed patients,” in International Journal of Engineering
Trends and Technology (IJETT) on, vol. no 4 Issue 4- April
2013
[11] Chaudhuri, A.; Dasgupta, A.; Routray, A., "Video & EOG
based investigation of pure saccades in human subjects," in
Intelligent Human Computer Interaction (IHCI), 2012 4th
International Conference on , vol., no., pp.1-6, 27-29 Dec.
2012
[12] Aungsakun; Phinyomark; Phukpattaranont; Limsakul,
“Development of robust electrooculography (EOG)-based
human-computer interface controlled by eight-directional eye
movements,” in International Journal of Physical Sciences on,
Vol. 7(14), pp. 2196 - 2208, 30 March, 2012
[13] A. Hussain, B. Bais, S. A. Samad and S. Farshad Hendi, 2008.
“Novel Data Fusion Approach for Drowsiness Detection.”
Information Technology Journal, 7: 48-55.