EEGStore: Building a System for Large Scale Analysis of EEGs

CS294 Final Project
16 December 2013
Orianna DeMasi
Department of Computer Science
UC Berkeley
odemasi@eecs.berkeley.edu
Jordan Kellerstrass
Department of Computer Science
UC Berkeley
kellerstrass@berkeley.edu
Abstract—Some neuropsychiatric disorders are developed in
childhood and can affect individuals for their entire lives. Emerging research shows that EEG signals can be used for early
diagnosis, which would enable early intervention and possibly
alleviate the severity of afflictions. However, the computing
backend to collect, process, and apply machine learning to EEG
data for a large scale study does not currently exist. Without
such a system, medical research in this area cannot progress.
In this paper we present EEGStore, our prototype of the data
collection end of this system, and OpenEmo, a novel device for
data collection, and we discuss the challenges in implementing
the entire system in a scalable way.
I. INTRODUCTION
Neuropsychiatric disorders developed in childhood can continue through an individual's life, severely affecting that individual's happiness, productivity, and ability to support themselves. Some of these conditions can be diagnosed early, and research is showing the potential for early interventions to radically affect the prognosis of those individuals. For example, in sub-Saharan Africa there is a high correlation between contracting cerebral malaria in childhood and a later diagnosis of autism [1]. If the development of autism could be better predicted, additional treatment of the malaria and early interventions could be taken to combat the severity of the autism.
Additional research shows that the diagnosis and severity
of autism may be predicted by certain EEG signal patterns
[2]. This research is promising, but currently only shown on
a small research population. It is not yet possible to put EEG
monitoring into practice because of the lack of a mobile device
to monitor patients and the compute infrastructure to analyze
the data. Further studies using EEG data within populations
or for other disorders are not yet possible due to the lack of
computing infrastructure to handle incoming data on a large
number of patients.
Until an appropriate mobile device has been developed
for collecting EEG data and sufficient infrastructure has been
developed for handling and processing the data, the use of
EEG for early diagnosis in community health clinics will not
be possible. Further, additional research to study the extent to
which EEG data can be used for diagnosis and treatment of
a myriad of disorders will be stunted and confined to small
scale case studies.
We would like to develop a low power, low cost, mobile
device that can be used in small clinics by community health
workers to routinely collect EEG data on pediatric patients
in Kenya. To support this hardware, we need a data system
pipeline for collecting and storing data on individuals, cleaning
the data, gathering data from various clinics into a population,
and finally applying published techniques for mining the EEG
signals and extracting diagnostic information. Eventually, a
general system will need to be developed that will allow for
the development of novel data mining techniques targeting a
wider spectrum of conditions.
This paper introduces a novel hardware device OpenEmo
and a prototype of the general data system that OpenEmo will
be part of. OpenEmo can collect EEG from a commodity EEG
headset and transmit the data to a mobile phone or tablet. The
prototype of the data system surrounding OpenEmo is called
EEGStore and is a proof of concept for the scalability, cost,
and general viability of a full system. This paper makes a case
for the full system and shows that such a system is not only
possible but able to satisfy the needs of the problem setting.
The rest of this paper proceeds as follows. Section II
describes the environment that EEGStore is being designed
for and the resulting needs that must be addressed. Section III
describes the target system that EEGStore will be modeled
after. Section IV describes the prototype EEGStore system
that we implemented as well as the OpenEmo device for
data collection. Section V evaluates our prototype system and
discusses its viability as a larger system in the environment
for which it is intended. Section VI reviews previous research
that this project is based on. Sections VII and VIII discuss the
lessons that we learned during this project and the future work
that we see as the most pressing. Finally we conclude the paper
in Section IX.
II. PROBLEM SETTING
This project addresses improving EEG data collection and processing for community health clinics in emerging regions. In many emerging regions, there are not enough doctors to meet the basic medical needs of the population, let alone specialists for more complex mental health issues. In Ghana, for example, there are fewer than 10 psychiatrists and 2 neurologists for a population of 24 million people [3]. Most of these few specialists are located in the cities and must prioritize emergencies. Meanwhile, patients with neuropsychological disorders of varying severity will likely go unseen. As such, care and monitoring are left to individuals and their families [3]. Early detection and intervention could save people from a lifetime of disability.
Fig. 1. Diagram of data flow in ideal EEGStore system. Data is collected on mobile phones in a field environment and shared with local clinics and remote research institutions via a web service. Various levels of encryption protect what data gets transferred.
To address the scarcity of medical professionals, many
countries have adopted a highly effective model of community
health care. Trusted community members are trained to be
community health workers (CHWs). They visit people in their
homes to provide basic care and connect them with a health
clinic or hospital when necessary.
Our goal is to build a mobile system that enables CHWs in
resource-poor areas to monitor and diagnose neuropsychological disorders - such as autism and epilepsy - which may be
identifiable via electroencephalography (EEG) recording even
before behavioral signs are evident.
More specifically, we will target deploying a study of our system in Kenya, where this community healthcare model is in practice. Figure 2 gives an overview of the size of the CHW population in Kenya and the potential size of the population that our system must serve.
Fig. 2. This table summarizes the size of the population in Kenya that we will initially target [4]. It gives an idea of the scale of data that will be collected. Eventually EEGStore will target an international population.
In Kenya, CHWs make home visits to patients, traveling to different homes to administer care or common checkups. Often these visits include small villages and other locations that are not particularly near to local clinics. These villages could require a long drive for access over rough roads. Within villages, visits may occur outside of any permanent structure and will be attended by an entire family. Often medical equipment is too delicate to take into these outdoor and potentially sandy settings. In addition to having no permanent enclosure, there is frequently no electricity, and thus these visits must be fully mobile in that the equipment provides its own power.
The typical work day varies dramatically for healthcare providers around the world, but the assumptions we made for the use case of EEGStore are inspired by the setting that we expect in Kenya. These assumptions are as follows:
• The EEG device should be robust, which we define as not easily broken during field work or travel.
• Our Android app should collect and process raw EEG data from any EEG device with a Bluetooth dongle, because these devices may be donated or built from open source hardware such as OpenEEG.
• Energy use should be minimal, as there is no guarantee of electricity between patient home visits.
• Whether being used in a clinic or on a home visit, the added time needed to collect an EEG recording should be a maximum of a few minutes.
We have tried to make these assumptions as general as possible
so that EEGStore will be applicable to many regions outside
of Kenya and able to scale to a larger population.
In the next section we consider the necessary aspects that a system must have to be amenable to the conditions described above. We set forth a set of goals that an ideal system must meet to suit this problem setting.
III. SYSTEM DESIGN
The ideal system to address some of the needs described
above would make EEG technology accessible to resource
poor areas. To do this, the system would have to be low cost,
energy efficient, mobile, and easily accessible for development
and integration into other systems. To address these needs, we
envision an end to end data collection and processing system.
Figure 1 shows an overview of the data flow through the desired system. The system begins with any commodity wireless EEG headset. Data is collected from the headset via Bluetooth onto a mobile phone or tablet. Data is sent from the mobile device over the internet to local health clinic servers and to remote research centers via a scalable web service. Sending data to multiple locations allows for different levels of processing for different purposes, such as treatment versus research. It also enables data sharing for specialized cohort studies and accommodates the different levels of data security that will be needed to share with different institutions.
In addition to the system meeting the above needs, we
would like it to be very modular. We believe that modularity
will add to the robustness of the system by allowing people to
utilize whatever resources they have access to. For example,
we want our system to be independent of a given headset or
phone. Wireless EEG headset devices are continually improving and becoming more easily accessible, even used to control
some off-the-shelf video games [5]. With modularity properly
implemented, our system can be made compatible with any
headset and thus can evolve with the rapidly developing EEG
technology that may become available in the near future.
To summarize, our system goals are the following:
• A scalable pipeline for collecting and processing EEG data for entire patient populations.
• A low cost implementation of the system end to end so that it is widely accessible.
• An easy interface so that the system is easy to use.
• The equipment associated with EEGStore must be robust to difficult conditions.
• All equipment must be self powered and have enough battery life to last an entire day of home visits.
• EEGStore must be compatible with an Android phone, as these are more prevalent than laptops.
IV. IMPLEMENTATION
As a prototype for a larger system, we developed EEGStore. EEGStore is a system that would make EEG data more
available for medical treatment and research and opens the
possibility for a myriad of new research opportunities. A
diagram of the implemented system is shown in Figure 3.
A. General framework
One component of EEGStore is the device that collects
the EEG signal from patients. This device is mobile, wireless,
and can be taken into remote field locations for use on rural
or hard to access patients. The next stage of EEGStore is to
pass the data to the mobile phone or tablet. For this stage
we developed a novel device OpenEmo, which is a universal
EEG adapter and will be described in more depth below. From
OpenEmo, the data is passed through a mobile phone (where
some processing and encryption occurs), into the cloud.
B. OpenEmo
As a component of EEGStore, we developed the novel OpenEmo adapter. OpenEmo can be seen in Figure 4. As shown in Figure 3, OpenEmo is a critical part of the current EEGStore system. OpenEmo works as a translator between the EEG headset and the phone. OpenEmo is a physical device that reads data from Bluetooth dongle supported EEG devices, decodes it from the headset's proprietary encryption, and sends it via Bluetooth 3.0+HS to a mobile phone or tablet. The importance of OpenEmo is that many wireless headsets are forcibly paired with companion dongles, which need to be plugged into a computer. Laptops and other computers are not common in resource poor areas, so it would be difficult to use any of these dongle supported devices in field studies or community health clinics.
Fig. 3. Data flow diagram for prototype of EEGStore. Data is collected via OpenEmo onto a mobile phone. It is then transferred, via Dropbox, to servers.
Fig. 4. OpenEmo is a universal adapter for connecting dongle supported EEG headsets with mobile phones and tablets.
Fig. 5. The OpenEmo Android app is easy to use. With minimal training, EEG scans can be collected.
We envision OpenEmo to be a universal adapter that
connects any dongle supported headset to a phone or tablet.
However, the current prototype of OpenEmo is compatible
with the popular Emotiv EPOC headset [6] and Android
mobile devices. We chose the EPOC as the first headset that
OpenEmo is compatible with, as Emotiv is in contact with
autism researchers and it seems there is the potential that they
might donate headsets to community clinics or a research lab
[7]. The headsets communicate over Bluetooth to a dongle,
which cannot be plugged into the phone due to hardware
constraints. Our app is currently written for Android, as this
platform is more prevalent in remote and developing regions,
which our system targets.
The OpenEmo adapter is prototyped with a RaspberryPi
[8] and Bluetooth 3.0+HS. It has three LEDs to indicate its
status. The RaspberryPi runs a Linux based system and the
code base to decrypt the incoming EPOC signal, which is
written in Python, uses previous work that looked at how to
extract data from the Emotiv EPOC [9], [10].
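The adapter's role of decoding headset packets and writing a fixed-length recording can be sketched as follows. This is a minimal illustration, not the OpenEmo code base: `record_session` and `placeholder_decode` are hypothetical names, the real decryption of the EPOC stream is replaced by a stand-in, and the one-row-per-sample text layout is an assumption. Only the channel count, sampling rate, and 2 minute duration come from this paper.

```python
import io
from typing import Callable, Iterable, List, TextIO

CHANNELS = 14          # EPOC channel count reported in this paper
SAMPLE_RATE_HZ = 128   # EPOC sampling rate reported in this paper


def record_session(packets: Iterable[bytes],
                   decode: Callable[[bytes], List[float]],
                   out: TextIO,
                   seconds: int = 120) -> int:
    """Decode packets and write one text row per sample; return rows written."""
    wanted = seconds * SAMPLE_RATE_HZ
    written = 0
    for packet in packets:
        if written >= wanted:
            break
        values = decode(packet)
        if len(values) != CHANNELS:
            continue  # drop malformed packets instead of corrupting the file
        out.write(" ".join(f"{v:.2f}" for v in values) + "\n")
        written += 1
    return written


def placeholder_decode(packet: bytes) -> List[float]:
    """Hypothetical stand-in for the real decoder: one byte per channel."""
    return [float(b) for b in packet]


# One second of synthetic packets, written to an in-memory "file".
fake_packets = [bytes(range(CHANNELS))] * 200
buffer = io.StringIO()
rows = record_session(fake_packets, placeholder_decode, buffer, seconds=1)
```

Passing the decoder in as a function keeps the recording loop independent of any one headset, which matches the modularity goal stated in Section III.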
C. Network prototype
While collected data is transmitted over Bluetooth, the data processing and sharing is done over HTTP. Our prototype of this section of the system is done using Dropbox [11]. A very preliminary implementation was done with direct HTTP calls to an Apache server running PHP on a laptop. While this worked, the connection was brittle and unable to recover from connections lost midway through transmission. After these preliminary tests, we chose to use Dropbox as the HTTP section prototype due to its reputation for fault tolerance and error checking. Dropbox has extremely low latency and is robust to multiple users accessing the servers at a time. It is also backed up on multiple remote servers, which decreases the chance for lost data from individual machine failure. Dropbox also does not limit the number of users who share a single folder and is low cost. It has also proven to be scalable and has many protocols to check if data has been fully and successfully transferred. Finally, Dropbox was an extremely easy and quick way to prototype a reliable form of data transmission with HTTP.
The way we utilized Dropbox was to associate an account with each phone or tablet. This would correspond to each CHW having an individual account. Each research institution and clinic would also have a Dropbox account. At the end of a day of home visits or in-clinic checkups, we envision that each CHW will upload their collected data to their Dropbox account. As their folder will be shared only with the appropriate institutions, we can control the data flow. This allows local clinics to receive data on their patients and research institutions to receive data on all the patients across the clinics. Further, this only requires paid Dropbox accounts for the research institutions, as we project that free accounts will be large enough for each CHW (see Section V for more details).
D. Server prototype
The server side of EEGStore is prototyped with a MacBook laptop running Python. Preliminary functions are written to move data from Dropbox into a file system and do minor processing of the data, such as visualizations.
V. EVALUATION
To our knowledge, there is no other system comparable to EEGStore; none has the end to end capability that EEGStore has. As a result, it is not clear how to evaluate EEGStore relative to another system. We cannot say that EEGStore performs better than another system, as we know of none that targets the same situation that we target.
To show the benefit of EEGStore, we instead consider how well it meets the goals of the system that were set forth in Section III. The elements of the system that we evaluate are how well EEGStore does on
• energy consumption
• scalability
• time to collect samples
• system cost.
A. Energy Considerations
In order for EEGStore to be acceptable for deployment in resource poor areas, the entire data collection portion of the system must be fully mobile and thus have a self contained energy source. Further, there is no guarantee of electricity for recharging between patient home visits or even on a daily basis. To last for prolonged periods away from energy sources, the mobile phone, OpenEmo adapter, and headset must be energy efficient. In this section, we discuss the energy consumption during data collection of the Emotiv EPOC (the headset EEGStore is prototyped with), the mobile phone, and OpenEmo. We demonstrate that all three components have reasonable charging cycles and are suitable for our use case.
1) Energy Breakdown of OpenEmo Adapter: The OpenEmo adapter is prototyped with a RaspberryPi single-board computer, which is powered via a micro USB port. To investigate OpenEmo's power consumption on a deeper level, we used the Smartronix USB power meter [12] to evaluate its consumption during each step of data collection and a timer to record the duration of each phase. We measure the consumption during each phase of sampling, as it was not possible for us to measure the energy drain of simultaneous activities separately. It is likely that the Bluetooth connection from the OpenEmo adapter to the Android app is using some energy in the background to simply check whether or not it needs to be sending or receiving something [13], but this is averaged into the measured approximate draw during each phase. The results of this analysis are summarized in Figure 6.
Fig. 6. Energy consumption of the various stages of OpenEmo sampling. Unsurprisingly, the most energy is consumed during the longest phase, data collection. In general the energy consumption is low, which indicates that OpenEmo could successfully be used in the field for prolonged periods of time.
Sampling data from the EEG device was expected and shown to be the largest energy cost. This is because it takes the most time and requires the most communication. The sampling time was set to be 2 minutes, as this is the duration of a sample in an EEG autism study [2]. The startup time for the OpenEmo was 59 seconds. It is possible that this time would be reduced in future generations of OpenEmo and that the sampling time would be increased for different studies. The decryption from the headset dongle occurs during the sampling phase, so that computational cost is included in that measurement. “Sending bluetooth” is the time that it takes to send a 2 minute data file over Bluetooth to the phone. This phase is less than a second because the file size is on the order of 800KB for the 2 minutes of data collected with the Emotiv EPOC, see Figure 2. Even with a more sophisticated headset, one with more channels and a higher sampling rate, the data file should remain under 8MB and still transmit in under a second.
A takeaway from Figure 6 is that there is not a strong need to optimize energy use by replacing the Bluetooth 3.0+HS with Bluetooth Low Energy. Because the data files are relatively small, transmitting a data file from the OpenEmo adapter to the mobile phone is so brief that it is not energy intensive.
Our conclusion is that, overall, OpenEmo's energy consumption is low. The consumption is low enough that OpenEmo is not the component of the end to end EEGStore system limiting energy robustness. This is explained further in the next section.
2) EEG Data Samples per Component per Charge: To calculate how many days our system will be able to run in the field on battery power, we assume that a community health worker may visit up to 15 patients per day and thus take 15 samples in 1 visit-day. This number corresponds to 3 families of 5 people each and is potentially an overestimate. We assume that the amount of time needed to collect a patient's EEG data is 3 minutes, including a 2 minute recording (based on the time needed for a prior study [2]) and about a minute for startup and shutdown. The summary of calculations based on these assumptions is shown in Figure 7. The purpose of these calculations is simply to compare the number of samples each component will be able to collect before needing to be recharged. We analyze the battery life assuming zero idle time, so the results have high error margins. Our measurements and basic calculations give us enough information to have confidence that the OpenEmo adapter and mobile EEG device are not limiting factors for implementing the system in an energy constrained environment.
The OpenEmo adapter comes with a 4,400 mAh (5V at 1A) battery. We calculated that the process of starting up the RaspberryPi, collecting EEG data from the headset, sending it in a file over Bluetooth to the Android phone, and shutting down the RaspberryPi consumes about 50 mAh (1A is drawn from the battery for about 3 minutes). Therefore, under perfectly efficient conditions, 88 samples, or 5.87 visit-days of 15 patients per day, could be taken before the OpenEmo needs to be recharged.
The Emotiv EPOC has a built-in Lithium Ion battery that
Emotiv claims will run for 12 hours of continuous use [6].
Twelve hours divided by 3 minutes is 240 samples before needing to be recharged, again assuming near perfect efficiency,
which means the device is not left on idle for long periods of
time.
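The per-charge estimates for the OpenEmo adapter and the EPOC headset follow from simple division, sketched below. The function and variable names are illustrative only; the inputs (4,400 mAh battery, 1 A draw, 3 minutes per sample, 12 hour rated runtime, 15 samples per visit-day) are the figures stated in this section, and idle drain is ignored just as in the text.

```python
SAMPLE_MINUTES = 3          # 2 minute recording plus about 1 minute of overhead
SAMPLES_PER_VISIT_DAY = 15  # assumed patients seen per visit-day


def samples_from_capacity(battery_mah: float, mah_per_sample: float) -> int:
    """Whole samples available from a battery, ignoring idle drain."""
    return int(battery_mah // mah_per_sample)


def samples_from_runtime(runtime_hours: float,
                         minutes_per_sample: float = SAMPLE_MINUTES) -> int:
    """Whole samples available from a rated continuous runtime."""
    return int(runtime_hours * 60 // minutes_per_sample)


# OpenEmo: 1 A (1000 mA) drawn for 3 minutes is 1000 * 3 / 60 = 50 mAh per sample.
mah_per_sample = 1000 * SAMPLE_MINUTES / 60
openemo_samples = samples_from_capacity(4400, mah_per_sample)
epoc_samples = samples_from_runtime(12)

openemo_visit_days = openemo_samples / SAMPLES_PER_VISIT_DAY
epoc_visit_days = epoc_samples / SAMPLES_PER_VISIT_DAY
```

Under these assumptions `openemo_samples` is 88 and `epoc_samples` is 240, so the headset outlasts the adapter by a wide margin on a single charge.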
The Android devices we analyzed were the Nexus 4 and Nexus 7. We chose these devices as they run Android, which is more prevalent in developing regions, and Google has shown interest in donating phones and tablets. As a result, these two devices are representative of devices that would be used by health workers for more than patient data collection. Community health workers enjoy having phones as a means of accessing educational material and keeping in touch with other health workers and clinics in their area. At the same time, the phone runs multiple processes simultaneously. Even if it were possible to calculate the exact energy cost of OpenEmo application processes, it is not realistic to assume the application runs as an isolated process. For these reasons, the error bar on this analysis is high, and in practice the phone would need significantly more frequent charging than the other components, whose figures were calculated with ideal efficiency. Another major determinant is screen size and brightness.
That said, the following is how we roughly determined the
perfect case charging frequency for the Nexus 4 and Nexus
7. We used an app called Battery Doctor (Battery Saver)1 to
confirm that a full Nexus 7 battery could power 1,220 minutes
in idle mode, which is equal to about 600 minutes of Bluetooth
communication or reading.
For one EEG sample collected with OpenEmo, the app spends about 65% of its time using Bluetooth and the other 45% of its time open for other tasks. The latter is closest in energy drain to Battery Doctor's interpretation of reading a backlit screen with some processing of input from the touch screen. Reading drains the battery about twice as fast as the idle state, and running Bluetooth while reading drains it twice as fast as reading alone, or about 4 times as fast as the idle state.
1,220 minutes of idle divided into 65% and 45% portions is equal to 793 minutes and 549 minutes respectively. Because the battery drains at 4 times the idle rate while the app is communicating over Bluetooth and open for the user to input information, 793 minutes divided by 4 gives 198 minutes. Because the battery drains twice as fast as the idle state for the remaining 45% of the time, 549 minutes divided by 2 gives 274 minutes. For our 3 minute case scenario, the OpenEmo app could run on repeat for 157 samples. The same calculations repeated for the Nexus 4 phone result in 76 samples. Figure 7 assumes 75% would be maximum efficiency, although this may even be too high.
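The tablet estimate can be reproduced step by step as below. This sketch follows the arithmetic exactly as reported, including the stated 65% and 45% time shares, which together exceed 100%; it makes no attempt to correct them. The constant and function names are illustrative, and the drain factors (4 times and 2 times idle) are the ones given in the text.

```python
IDLE_MINUTES = 1220      # Nexus 7 idle runtime reported by Battery Doctor
BLUETOOTH_SHARE = 0.65   # share of app time using Bluetooth, as stated
OTHER_SHARE = 0.45       # share of app time on other tasks, as stated
BLUETOOTH_DRAIN = 4      # drain relative to idle while using Bluetooth
OTHER_DRAIN = 2          # drain relative to idle while reading
SAMPLE_MINUTES = 3       # minutes per EEG sample, including overhead


def active_minutes(idle_minutes: float, share: float, drain_factor: float) -> float:
    """Convert a share of idle runtime into minutes of use at a higher drain."""
    return idle_minutes * share / drain_factor


bluetooth_minutes = active_minutes(IDLE_MINUTES, BLUETOOTH_SHARE, BLUETOOTH_DRAIN)
other_minutes = active_minutes(IDLE_MINUTES, OTHER_SHARE, OTHER_DRAIN)
nexus7_samples = int((bluetooth_minutes + other_minutes) // SAMPLE_MINUTES)
```

With the reported inputs this gives about 198 and 274 minutes for the two portions and 157 samples in total, matching the figures above.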
Based on these calculations, we conclude that the phone or tablet, not the OpenEmo adapter, would be the limiting factor for battery life in the field. The OpenEmo adapter and Android device are powered via USB, and would therefore be easy to charge with backup external battery packs, which are also increasingly inexpensive.
B. Data Scale
The initial deployment of EEGStore targets Kenya. The
authors of a previous study [2] have expressed interest in expanding their study to this region, but they have also expressed
interest in eventually scaling up to international proportions.
As such, we first consider the size of data that we could expect
to see for our initial Kenya deployment and whether EEGStore
will support this scale.
Fig. 7. The number of EEG samples that each of the mobile devices in the system can take before needing to be recharged. A visit-day is assumed to be 15 visits per day and is representative of how many visits would be made in a typical setting. Note that OpenEmo is not the limiting factor when a mobile phone is used.
Figure 2 gives an indication of the size of the population that EEGStore needs to accommodate. It is estimated that there are 15,000 CHWs in Kenya and that each visits a population of 50-500 patients [4]. At this scale, we would like to accommodate the data for, at most, 7,500,000 patients. The EEG files we collect are relatively small. They are simple time series recorded in basic text files. The size of these files for the 14 channel EPOC sampling at 128Hz for 2 minutes is about 860KB. We estimate that if a more sophisticated 64 channel device that samples at up to 250Hz were used with OpenEmo instead, then the EEG file size could be as much as 7.7MB. Thus, each CHW will collect at most 3.85GB of data from their patient population. As long as data is transferred to local and remote servers in a timely fashion, this size of data could fit into a free 2GB Dropbox account. It does imply that a local clinic collecting the more fine-grained studies would have to invest in a 50GB Dropbox account for $99 per year. However, as a CHW will visit at most 15 patients a day, they will collect at most 115.5MB of data per day. As long as data is harvested daily from the clinic's Dropbox folder and stored in more permanent storage, then free Dropbox accounts are feasible both for the local servers as well as the phones.
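The storage figures above can be checked with a short calculation. The value of 4 bytes per recorded value is an assumption, chosen only because it reproduces the reported file sizes; the paper does not state the encoding of its text files. All names are illustrative.

```python
BYTES_PER_VALUE = 4  # assumption: reproduces the ~860KB and ~7.7MB figures


def recording_bytes(channels: int, rate_hz: int, seconds: int = 120) -> int:
    """Size of one recording: channels x samples per second x seconds x bytes."""
    return channels * rate_hz * seconds * BYTES_PER_VALUE


epoc_bytes = recording_bytes(14, 128)   # 14 channel EPOC at 128Hz
dense_bytes = recording_bytes(64, 250)  # hypothetical 64 channel, 250Hz headset

DENSE_MB = 7.7                 # the rounded per-file figure used in the text
PATIENTS_PER_CHW = 500         # upper end of the 50-500 range
CHW_COUNT = 15_000             # estimated CHWs in Kenya
VISITS_PER_DAY = 15            # assumed maximum visits per day

max_patients = CHW_COUNT * PATIENTS_PER_CHW
per_chw_gb = PATIENTS_PER_CHW * DENSE_MB / 1000
per_day_mb = VISITS_PER_DAY * DENSE_MB
```

The per-day volume, not the per-population total, is what has to fit in a free account, which is why the daily harvest matters.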
C. Time
Due to its mobility and the small size of the data collected, EEGStore is extremely time efficient. The transfer of data between OpenEmo and the phone takes under a second. The time to upload a single file to Dropbox is also under a second on our high speed internet connections. The time to upload files to Dropbox, and thus transfer them from the phones to remote research servers, will be longer on the slower internet connections that we expect to encounter in clinics. However, due to the small size of the files, this should still not be a limiting cost. It is also less critical, as uploading can be done when the phone is plugged into a more stable power source. The longest component of the system will most likely be the data sampling phase, which is entirely study dependent.
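To give a rough sense of how upload time grows on slower links, the sketch below divides file size by link rate. The link rates are hypothetical round numbers chosen for illustration, not measurements from this project, and the estimate ignores protocol overhead, latency, and retries.

```python
def transfer_seconds(size_bytes: float, link_kbit_per_s: float) -> float:
    """Idealized transfer time: payload bits divided by link rate, no overhead."""
    return size_bytes * 8 / (link_kbit_per_s * 1000)


EPOC_FILE_BYTES = 860_000     # about 860KB, as reported for the EPOC
DENSE_FILE_BYTES = 7_700_000  # about 7.7MB, as estimated for a denser headset

# Hypothetical link rates in kbit/s; these are assumptions, not measurements.
LINKS_KBIT = {"slow": 100, "moderate": 1000, "fast": 10000}

estimates = {
    name: (transfer_seconds(EPOC_FILE_BYTES, rate),
           transfer_seconds(DENSE_FILE_BYTES, rate))
    for name, rate in LINKS_KBIT.items()
}
```

Even on the slowest assumed link, a single EPOC file takes on the order of a minute, which is comparable to the sampling time itself and can be deferred until the phone is charging.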
D. Cost
The EEGStore system is one of many puzzle pieces that could help save people from a lifetime of disability by enabling early intervention. Neuropsychological disorders are a painful human experience as well as expensive for healthcare systems. The regions of the world that stand to gain the most from the ability to monitor and diagnose mental health are resource poor regions lacking an adequate number of specialists. For these reasons, we suggest that a few hundred dollars is a reasonable budget for all necessary technological components. See Figure 8 for more details. The OpenEmo adapter was prototyped for less than $75, and could be produced even more cheaply in bulk. Smartphone and EEG device companies would likely be willing to donate some number of devices because of the potential scale if initial deployments are successful.
Fig. 8. Cost of each component of the mobile section of EEGStore indicates that OpenEmo is an affordable component. Note that both headsets and phones will potentially be donated by Emotiv and Google, which is why we chose these specific devices.
1 The Battery Doctor Android app had an average rating of 5 out of 5 stars from 183,584 users at the time of our download. An explanation of how remaining battery life for each task is calculated is not published.
VI. RELATED WORK
While there is no system quite like EEGStore or device like OpenEmo, there is a large body of closely related previous work. In this section we highlight some of the most pertinent projects related to data systems for information and communication technologies for development (ICTD), mobile medical devices, and open source EEG.
A. ICTD Data Systems
Open Data Kit (ODK) is an open source data collection system that enables users to quickly generate surveys and begin collecting data [14]. It consists of a web application for designing forms, an accompanying Android app called ODK Collect to use the forms, and a cloud storage component. EEGStore also has an Android application and cloud component. Similarly to EEGStore, ODK is optimized for use in challenging environments, including intermittent internet connectivity, which is accomplished by allowing delayed data synchronization until connected or until the user chooses. ODK was designed with the need for data collection in developing regions in mind.
ODK is designed to be generalized whereas EEGStore is highly specialized for collecting EEG data. ODK is in the process of implementing support for sensor data collection, but this feature has not yet been released for production. A beta version of ODK Sensor was released on 23 November 2013. Another difference is that a user of ODK has the option to implement Aggregate, the server side component, anywhere they like or not at all. To best serve the field of medical research, it is preferable that if a user is willing to share their de-identified EEG data for research purposes, it should be sent to one common international database, according to William Bosl [7]. ODK may have been an alternative tool for prototyping EEGStore, but it alone would not be an alternative to the system itself.
TierStore is a filesystem that implements a variety of features for the sake of effectiveness in developing regions with challenging network environments [15]. Some of the functionality is similar to Dropbox. EEGStore is prototyped with Dropbox because it provides some of these same helpful features, especially delay and fault tolerance.
OpenMRS is the world’s leading open source medical
record system platform [16]. Ideally, it would be on the receiving end of secure patient data collected through the EEGStore
system. The mission of OpenMRS is to improve health care
delivery in resource-constrained environments by coordinating
a global community that creates a robust, scalable, user-driven,
open source medical record system platform. EEGStore also
aims to be a robust, scalable, open source system for the
betterment of global health. Ideally, the EEGStore system will
contribute data to an OpenMRS table structure so as to include
EEG data in OpenMRS medical records.
Another notable project in the data area of ICTD is Sensing
Atmosphere, where researchers collected geographically distributed air quality data by equipping citizens’ basic phones
with inexpensive air quality sensors [17]. A common way
to get ICTD wrong is by excluding local knowledge. Like
Sensing Atmosphere, which is a manifestation of “participatory
urbanism”, EEGStore focuses on enabling people living in
developing regions to make a great contribution to larger global
issues. By contributing data, participants of EEGStore will
contribute to the study of mental health. We have been told
that by providing this contribution to a larger goal, people will
be even more excited about participating in EEGStore [7].
B. Mobile Medical Devices
Low cost mobile medical devices designed for use in
emerging regions are a growing area of research. Two of many
examples we found to be related to EEGStore and OpenEmo
are the Berkeley Tricorder [18] and PartoPen [19].
The Berkeley Tricorder is a wireless health monitoring device that stands in contrast to typically large and expensive
technology for measuring vital signs [18]. This device may
be worn on a part of the body and sends sensor data over
Bluetooth to an iOS application. The Berkeley Tricorder incorporates multiple sensing options, although EEG is not one
of them. This device is meant more for personal self-monitoring
than for use by a health worker.
The PartoPen is a technology that makes simple partograph
data easier for minimally trained birth attendants to use, helping
them monitor the birth process and avoid obstructed labor, which is a major preventable cause of death in developing countries [19].
EEGStore also aims to make data available for health worker
decision support. The PartoPen reminds health workers when
to revisit patients, which may be a beneficial feature to add to
the OpenEmo Android app.
C. Open source EEG
There is growing interest in the power of EEG signals and
the information that can be extracted. The open source project
OpenEEG is a consolidation of some of the efforts that are
going into making EEG signals accessible to more than just
elite medical experts [20]. The OpenEEG project focuses on
hardware and instructions for how individuals can make their
own EEG. They also include links to other software projects
for mining EEGs.
Many open source projects are also arising for mining
and analyzing EEG signals. These projects include EEGLAB
[21] and FieldTrip [22], which are open source EEG MATLAB
toolboxes. PyEEG is a similar toolbox for Python [23].
VII. LESSONS LEARNED
An international database for collecting EEG samples from
a variety of devices has a significant amount of value for
medical researchers. If there were a simple approach to implementing this end to end system, it would have been done.
We find it useful to document our major hurdles here. Although
we were initially deterred from building EEGStore on ODK because
its support for sensor data was still under development, in hindsight
we should probably have used more ready-made open source
components than we did.
One question that is important to consider is how to verify
that the data coming from the relatively inexpensive EEG
device is valid and not noise. Complicated solutions to this
issue are expanded upon in Section VIII. For the OpenEmo
and EEGStore prototypes we are able to compare plots of
data collected when the Emotiv EPOC is not on a person’s
head (noise) to plots of data collected when the device is
situated properly on a person’s head. The difference is clear,
but we do not claim that this is good enough for use in the
medical field. Most commodity EEG devices generate a time series of the sensor data along with a signal strength value.
One solution is to only accept data into a recording if the
signal strength is higher than a certain threshold at which the
EEG device manufacturers promise that the data is of a certain
quality. Deeper investigation of the methods used to verify
signal strength would be necessary to determine if this simple
solution is good enough. Further testing could reveal how often
the signal strength is acceptable. We chose to focus more on
the flow of data than on the quality of the data, but both
are vital components.
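The threshold approach described above can be sketched as follows. This is a minimal illustration and not the OpenEmo implementation: the Sample layout, the 0 to 1 quality scale, and the function names are assumptions, and the appropriate threshold would have to come from the device manufacturer.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Sample:
    """One reading from a headset (hypothetical layout)."""
    timestamp: float       # seconds since the start of the recording
    values: List[float]    # one value per channel
    quality: float         # device-reported signal strength, assumed scaled to 0..1


def filter_by_quality(samples: Iterable[Sample], min_quality: float) -> List[Sample]:
    """Keep only samples whose reported signal strength meets the threshold."""
    return [s for s in samples if s.quality >= min_quality]


def acceptance_rate(samples: List[Sample], min_quality: float) -> float:
    """Fraction of samples that would be accepted; 0.0 for an empty recording."""
    if not samples:
        return 0.0
    return len(filter_by_quality(samples, min_quality)) / len(samples)
```

Computing the acceptance rate over test recordings is one way to measure how often the signal strength is acceptable before committing to a particular threshold.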
Finally, knowing that energy was one of our constraining
factors, we attempted to build a simple Android application
using Bluetooth Low Energy (BLE) instead of Bluetooth 3.0.
This was challenging for us as beginning Android developers
because BLE on Android is recent and is not yet
widely supported or documented. Adequate documentation
was lacking for the necessary low-level programming in a variety
of dependencies. This is not ideal for any open source project.
After analyzing our energy calculations, we realized that this
was a premature optimization. BLE uses significantly less
energy, but only for an insignificant amount of time, and it
significantly increased the difficulty of development.
VIII. FUTURE WORK
A. Server development
We would like to further develop our server setting.
While we currently have data processing and a full data
flow established, we need to establish a database system and
integrate incoming data into it. We need more efficient storage
mechanisms and to better separate survey data from raw EEG
data, while not losing connections between data components.
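One possible way to separate survey data from raw EEG data without losing the connection between them is to link both to a shared visit record. The sketch below uses SQLite from the Python standard library purely for illustration; the table names, column names, and helper functions are hypothetical and do not describe the current EEGStore server.

```python
import sqlite3

# Hypothetical schema: survey answers and EEG recordings live in separate
# tables, and both point back to the visit during which they were collected.
SCHEMA = """
CREATE TABLE visit (
    visit_id   INTEGER PRIMARY KEY,
    subject_id TEXT NOT NULL,          -- de-identified subject code
    visited_at TEXT NOT NULL
);
CREATE TABLE survey_answer (
    visit_id INTEGER NOT NULL REFERENCES visit(visit_id),
    question TEXT NOT NULL,
    answer   TEXT
);
CREATE TABLE eeg_recording (
    recording_id INTEGER PRIMARY KEY,
    visit_id     INTEGER NOT NULL REFERENCES visit(visit_id),
    headset      TEXT NOT NULL,
    raw_path     TEXT NOT NULL         -- location of the raw EEG file
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the tables and turn on foreign key enforcement."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def recordings_for_subject(conn: sqlite3.Connection, subject_id: str) -> list:
    """Return the raw file paths of all recordings for one subject."""
    rows = conn.execute(
        "SELECT r.raw_path FROM eeg_recording r "
        "JOIN visit v ON v.visit_id = r.visit_id "
        "WHERE v.subject_id = ? ORDER BY r.recording_id",
        (subject_id,),
    )
    return [row[0] for row in rows]
```

Storing only a path to the raw EEG file keeps the large time series out of the relational tables, while the foreign keys preserve the link between a recording and the survey answers from the same visit.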
B. Data processing and analytics
We would like to implement further data processing on
incoming data. This is important not only for future medically
diagnostic machine learning algorithms, but also for assessing
data quality in a timely manner. If data quality is established
quickly enough, then poorly collected samples can quickly
be retaken. We believe that more data processing must occur
both on OpenEmo as well as on the server. Further processing on OpenEmo is of particular importance so that onsite
assessments of data quality can be made. Such processing
can incorporate discrete Fourier transforms to ensure
that characteristic frequencies occur at the appropriate sensors.
Signal strength can also be assessed from the sensor voltage
readings.
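As a rough illustration of such a check, the sketch below computes a naive discrete Fourier transform of one channel and reports what fraction of the signal power falls inside a frequency band. It is an assumption-laden example rather than our implementation: the function names are hypothetical, the O(n^2) transform is only suitable for short windows, and which band to expect at which sensor is left to the caller.

```python
import cmath
import math
from typing import List


def dft_power(signal: List[float]) -> List[float]:
    """Power at each non-negative frequency bin of a naive O(n^2) DFT."""
    n = len(signal)
    powers = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))
        powers.append(abs(coeff) ** 2)
    return powers


def band_fraction(signal: List[float], sample_rate: float,
                  low_hz: float, high_hz: float) -> float:
    """Fraction of non-DC power that falls in the band [low_hz, high_hz]."""
    n = len(signal)
    powers = dft_power(signal)
    total = sum(powers[1:])            # skip bin 0, the DC offset
    if total == 0:
        return 0.0
    in_band = sum(p for k, p in enumerate(powers)
                  if k > 0 and low_hz <= k * sample_rate / n <= high_hz)
    return in_band / total


# Usage: a synthetic 10 Hz sine sampled at an assumed 128 Hz for one second.
fs = 128.0
sine = [math.sin(2 * math.pi * 10 * t / fs) for t in range(128)]
alpha_share = band_fraction(sine, fs, 8.0, 12.0)   # share of power in 8 to 12 Hz
```

A window whose power is spread almost evenly across all frequencies, or concentrated in an unexpected band, could then be flagged for re-recording while the patient is still present.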
C. OpenEmo compatibility with more headsets
While OpenEmo was prototyped with the Emotiv EPOC
headset, we would like to make it compatible with more
headsets. This extension will also inform how we formulate the
server side support as different headsets have different numbers
of channels and thus collect different data. How to adapt the
Android application to address which sensors are being used
is not trivial. Adjusting the system and database storage to
account for a variety of systems could also be an interesting
problem.
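One way to approach the varying channel counts is to describe each headset with a small profile that both the application and the server can consult. The sketch below is only an assumption about how this might look; the HeadsetProfile type, the label_sample helper, and the example channel labels are hypothetical and do not correspond to any particular device.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class HeadsetProfile:
    """Static description of one headset model (hypothetical)."""
    name: str
    channels: Tuple[str, ...]   # ordered channel labels as delivered by the device
    sample_rate_hz: float


def label_sample(profile: HeadsetProfile, values: List[float]) -> Dict[str, float]:
    """Pair raw values with channel labels, rejecting mismatched lengths."""
    if len(values) != len(profile.channels):
        raise ValueError(
            f"{profile.name}: expected {len(profile.channels)} values, "
            f"got {len(values)}"
        )
    return dict(zip(profile.channels, values))


# Usage with a made-up two-channel device.
demo = HeadsetProfile("demo-2ch", ("left", "right"), 128.0)
labeled = label_sample(demo, [1.5, -0.5])
```

Keying each stored sample by channel label rather than by position would let recordings from headsets with different numbers of channels share one storage format.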
D. OpenEmo refinement
We would like to improve the functionality and appearance
of OpenEmo. This includes making a smaller enclosure, a more
stable Bluetooth connection, and greater app functionality. One
of the most immediate actions would be to put in a Dropbox
Drop-in for sending the data directly from the OpenEmo app
to the remote Dropbox servers. Currently this functionality is
under development at Dropbox for Android. We will incorporate it as soon as it is available.
We would also like to revisit the hardware decisions for
OpenEmo. The current ARM processor is rather powerful and
it may be possible to change to a different ARM processor
architecture to decrease power consumption. One trade off
that we will have to consider is the ease of development.
The Raspberry Pi is well established and continuing to use it would
enable external developers to contribute modules or drivers
to adapt new headsets to our system. Basing OpenEmo on a
different architecture could hinder its accessibility to outside
contributors.
E. Data encryption and security
A very interesting problem that we did not have time to
consider was that of data security. We would like to study
the cost and need for various levels of data security between
devices and levels in the system. This concept will probably
be of critical importance when we consider moving the data
between the phone and remote servers. How to move the data
in a secure, HIPAA compliant, and energy efficient way is
a very interesting issue that we would like to address. Most
likely encryption will be needed at different levels, but how to
manage the keys and where the encryption should occur, e.g.
on the OpenEmo or on the phone, is an issue that we would
like to delve into.
F. Replacement of cloud storage
While our initial prototype uses Dropbox to transfer data
between the mobile phone and servers, we would like to
replace this with a more secure method for cloud storage and
data transfer. We need a HIPAA compliant alternative that still
offers all the benefits of Dropbox, including fault tolerance, backup,
and versioning. Alternative cloud storage systems exist, such
as Google Drive and Box [24], but it is not clear that they
will be sufficient for our needs or if we need to build our own
transfer system.
G. Data quality
We would like to incorporate more use of analytics for
data quality during the visit. Metrics such as signal strength
are often available with EEG headsets, including the Emotiv
EPOC that we prototyped for. This data has not been included
in our app and has not been used to censor data that is not
of appropriate quality. We would like to look further into
developing quality and efficient metrics for assessing data
signal quality and then how to include this information on
the server for data processing.
H. Integrate with other systems
Having a data collection system is only as good as the
functionality that it can connect to. We would like to connect
EEGStore with other extant systems to provide further data
analysis and integration into medical records. A few such
systems are EEGLAB [21], FieldTrip [22], PyEEG [23], and
OpenMRS [16]. Both EEGLAB and FieldTrip are well established, but they expect data from headsets that have more channels than
many of the commodity headsets that we are looking to utilize. As a
result, we would like to find a way to utilize the functionality
of these two toolboxes from the variety of headsets that our
system will be compatible with. Similar challenges will arise
with the PyEEG toolbox. We would also like to integrate our
EEG readings into OpenMRS, which is quickly being adopted
as the only option for electronic medical health records in
developing regions. Ideally, EEGStore would be compatible
with OpenMRS so that EEG could become a more regular
diagnostic tool.
IX. CONCLUSIONS
We have created a prototype of the EEGStore system. This
system collects, stores, and processes raw EEG data from mobile headsets. We have also developed the OpenEmo adapter
to collect data from dongle-supported EEG headsets, decrypt it,
and send the raw data to an Android phone or tablet. The hope
is that the end to end system will help make EEG technology
and data available for improved healthcare. While there is still
much development needed in our system, we have presented
some analysis that shows our prototype could be viable for the
situation it targets and that further work is very promising.
REFERENCES
[1] R. Idro, A. Kakooza-Mwesige, S. Balyejjussa, G. Mirembe, C. Mugasha, J. Tugumisirize, and J. Byarugaba, “Severe neurological sequelae and behaviour problems after cerebral malaria in Ugandan children,” BMC research notes, vol. 3, no. 1, p. 104, 2010.
[2] W. Bosl, A. Tierney, H. Tager-Flusberg, and C. Nelson, “EEG complexity as a biomarker for autism spectrum disorder risk,” BMC medicine, vol. 9, no. 1, p. 18, 2011.
[3] “Personal correspondence with Dr. Julius Awakame, founder & president. the west african health informatics fellowship program, staff grade psychiatrist, cygnet hospital wyke, uk. julius[at]wahifp.org,” http://www.wahifp.org, November 2013.
[4] “The global fund to fight aids tuberculosis and malaria kenya malaria proposal round 10,” (accessed July 17, 2011). [Online]. Available: http://www.theglobalfund.org/grantdocuments/KENR10-ML PROPOSAL 0 en
[5] “Neurosky: Brainwave sensors for everyone,” http://neurosky.com/, accessed: 15 Dec. 2013.
[6] “Emotiv epoc,” http://www.emotiv.com/epoc/, accessed: 15 Dec. 2013.
[7] “Personal correspondence with William Bosl PhD. director and assistant professor of health informatics, university of san francisco. wjbosl[at]usfca.edu,” November 2013.
[8] “Raspberrypi,” http://www.raspberrypi.org, accessed: 15 Dec. 2013.
[9] “python-emotiv github repository,” https://github.com/ozancaglayan/pythonemotiv, accessed: 15 Dec. 2013.
[10] “Emokit github repository,” https://github.com/openyou/emokit, accessed: 15 Dec. 2013.
[11] “Dropbox.com,” http://www.dropbox.com, accessed: 15 Dec. 2013.
[12] “Smartronix usb power meter,” http://www.cyberguys.com/productdetails/?productid=76178, accessed: 15 Dec. 2013.
[13] R. Heydon, Bluetooth Low Energy: The Developer’s Handbook, 2012.
[14] “Open data kit,” https://opendatakit.org, accessed: 15 Dec. 2013.
[15] M. J. Demmer, B. Du, and E. A. Brewer, “Tierstore: A distributed filesystem for challenged networks in developing regions.” in FAST, vol. 8, 2008, pp. 1–14.
[16] “Openmrs,” http://openmrs.org/, accessed: 15 Dec. 2013.
[17] E. Paulos, R. Honicky, and E. Goodman, “Sensing atmosphere,” Human-Computer Interaction Institute, p. 203, 2007.
[18] R. Naima and J. F. Canny, “The berkeley tricorder: wireless health monitoring,” in Wireless Health 2010. ACM, 2010, pp. 212–213.
[19] H. Underwood, “Partopen: Enhancing the partograph with digital pen technology,” in CHI’12 Extended Abstracts on Human Factors in Computing Systems. ACM, 2012, pp. 1393–1398.
[20] “Openeeg - eeg for the rest of us,” http://openeeg.sourceforge.net/doc/, accessed: 15 Dec. 2013.
[21] A. Delorme and S. Makeig, “EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis,” Journal of neuroscience methods, vol. 134, no. 1, pp. 9–21, 2004.
[22] R. Oostenveld, P. Fries, E. Maris, and J.-M. Schoffelen, “FieldTrip: open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data,” Computational intelligence and neuroscience, vol. 2011, p. 1, 2011.
[23] F. S. Bao, X. Liu, and C. Zhang, “PyEEG: an open source python module for EEG/MEG feature extraction,” Computational intelligence and neuroscience, vol. 2011, 2011.
[24] “Box.com,” https://support.box.com/hc/en-us, accessed: 15 Dec. 2013.