A Hierarchical Architecture for Adaptive Brain-Computer Interfacing

Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence
Mike Chung¹, Willy Cheung¹, Reinhold Scherer² and Rajesh P. N. Rao¹
¹Computer Science & Engineering, University of Washington, Seattle, USA
²Institute for Knowledge Discovery, Graz University of Technology, Graz, Austria
¹{mjyc, wllychng, rao}@cs.washington.edu, ²Reinhold.Scherer@tugraz.at
Abstract

Brain-computer interfaces (BCIs) allow a user to directly control devices such as cursors and robots using brain signals. Non-invasive BCIs, e.g., those based on electroencephalographic (EEG) signals recorded from the scalp, suffer from a low signal-to-noise ratio which limits the bandwidth of control. Invasive BCIs allow fine-grained control but can leave users exhausted since control is typically exerted on a moment-by-moment basis. In this paper, we address these problems by proposing a new adaptive hierarchical architecture for brain-computer interfacing. The approach allows a user to teach the BCI new skills on the fly; these learned skills are later invoked directly as high-level commands, relieving the user of tedious low-level control. We report results from four subjects who used a hierarchical EEG-based BCI to successfully train and control a humanoid robot in a virtual home environment. Gaussian processes were used for learning high-level commands, allowing the BCI to switch between autonomous and user-guided modes based on the current estimate of uncertainty. We also report the first instance of multi-tasking in a BCI, involving simultaneous control of two different devices by a single user. Our results suggest that hierarchical BCIs can provide a flexible and robust way of controlling complex robotic devices in real-world environments.

1 Introduction

Brain-computer interfaces (BCIs) have received considerable attention in recent years due to their novel hands-free mode of interaction with the environment [Rao and Scherer, 2010; Scherer et al., 2008; Faller et al., 2010]. In particular, the field has seen rapid growth due to its potential for offering a new means of control for devices tailored to severely disabled and paralyzed people: examples include directing the motion of a motorized wheelchair, controlling a semi-autonomous assistive robot, and using a neuroprosthesis [Galán et al., 2008; Bell et al., 2008; Müller-Putz et al., 2005].

The most commonly used brain signal source for non-invasive BCIs in humans is the electroencephalogram (EEG). However, due to its non-stationarity, inherent variability, and low signal-to-noise ratio, a reliable translation of EEG into appropriate control messages for devices can be difficult and slow. Therefore, EEG signals have often been used to select a task that can be semi-autonomously performed by an application (e.g., control of a humanoid robot in [Bell et al., 2008]). Invasive BCIs, on the other hand, offer higher bandwidth and allow fine-grained control of robotic devices (e.g., [Velliste et al., 2008]), but moment-by-moment control over long periods of time can place a high cognitive load on the user.

To overcome these problems, we propose an adaptive hierarchical architecture for brain-computer interfacing which allows the user to teach the system new and useful tasks on an ongoing basis: low-level actions are first learned and later semi-autonomously executed using a higher-level command (e.g., the command "Go to kitchen" for a semi-autonomous mobile robot). Such higher-level control frees the user from having to engage in tedious moment-by-moment control once a command has been learned.

In addition, we introduce, to our knowledge, the first use of uncertainty for guiding a BCI's behavior during hierarchical control. We use Gaussian processes (GPs) for learning high-level commands and exploit the fact that they provide a measure of uncertainty in their output [Rasmussen, 2004]. When the uncertainty in a given region of task space is too high (e.g., due to lack of training in that part of the space), the BCI switches to user control for further guidance rather than continuing to execute the unreliable and potentially dangerous high-level command. Such uncertainty-guided decision making is critical for real-world BCI applications, such as BCI control of a robotic wheelchair or assistive robot, where user safety and the safety of those around the robot are of paramount importance.

We present results from user studies involving four human subjects who successfully taught and controlled a humanoid assistive robot in a simulated home environment. We also report the first example of multi-tasking in a BCI, where a user was able to simultaneously control two different devices. Our results provide a proof-of-concept demonstration that hierarchical BCIs may offer a scalable and robust approach to controlling complex robotic devices in real-world environments.
Figure 1: A Hierarchical BCI System. A. Experimental setup: the user selects from a menu shown on a monitor while a view of the robot in its environment is shown in a larger immersive setting above. B. Simulated robot in its environment: the robot is a Fujitsu HOAP-2 humanoid simulated using the Webots software. C. A screenshot of the menu and SSVEP stimulation. D. Frequency-domain representation of a subject's EEG signal illustrating a high SSVEP response to 15 Hz stimulation.
Figure 2: Overview of control flow in the hierarchical menu system.
2 Methods

2.1 A Hierarchical Architecture for BCI
The hierarchical BCI proposed in this paper is composed of
three main components: (A) an EEG-based BCI; we used
a steady-state visual evoked potential (SSVEP)-based BCI [Müller-Putz and Pfurtscheller, 2007], but other commonly used EEG responses such as P300 or mental imagery could also be used; we chose SSVEPs because they offer relatively high information transfer rates (ITRs) with minimal user training; (B) a hierarchical menu and learning system that allows
the user to teach the BCI new skills, and (C) the application,
which, in the present case, is a simulation of a humanoid robot
in a home environment that mimics the physics of the real
world (Figure 1.A and 1.B). The three components interact
closely to make the system work. In particular, the hierarchical adaptive menu system displays available commands as
flashing stimuli for the user to choose using the SSVEP-based
BCI. The user makes the desired selection by focusing on the
desired command in the menu (Figure 1.C). The BCI detects
the option the user is focusing on and sends its classification
output to the hierarchical menu system, which in turn sends
a command to the robot and switches to the next appropriate
menu. The robot executes the command it receives, which
can be either a lower-level command such as turn left/right
or a higher-level learned command. Finally, the user closes
the control loop by observing the simulated robot’s action and
making the next desired selection based on the updated menu.
We describe each of the components of the hierarchical
BCI system in more detail below.
SSVEP-based BCI

Flickering stimuli used to elicit SSVEPs were presented on a TFT computer screen with a refresh rate of 60 Hz. Up to three different options (12 Hz, 15 Hz, and 20 Hz) could be presented to the user in any given menu.

Continuous EEG was recorded bipolarly from gold electrodes placed at electrode positions Oz and Cz (ground was linked to Cz), notch filtered at 60 Hz, and digitized at 256 Hz (gUSBamp, Guger Technologies, Graz, Austria).

To detect the flashing stimulus the user was focusing on, the power spectrum was estimated using the Fast Fourier Transform (FFT). The FFT was applied to 1 s segments of EEG data (Hamming window) every 0.5 s, and the power at each frequency was then calculated from the squared magnitudes. The data used for final classification was a 4-second average of these power values (calculated from 8 FFT estimates). The frequency with the highest power among the three target frequencies of 12, 15, and 20 Hz was classified as the user's choice for that decision period (see Figure 1.D for an example; a code sketch is given below).

The BCI menu on the computer monitor and a video projection of the robot simulator were placed one above the other (Figure 1.A). When users desired BCI control, they focused on the monitor, while at other times, they watched the robot move in its environment. When users were not focusing on the BCI menu, the power in the recorded EEG channel was markedly different, allowing a simple threshold-based detector to self-initiate the SSVEP-BCI whenever the user required control.
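The decision rule just described is simple enough to sketch directly. The following Python/NumPy code is a hedged re-implementation, not the authors' released code (none was published): it applies a Hamming-windowed FFT to 1 s segments every 0.5 s, squares the FFT magnitudes to obtain power, averages the last 8 estimates (the 4-second average), and picks the strongest target frequency. The `bci_engaged` helper for the threshold-based self-initiation detector is likewise an assumption; the paper does not specify the exact statistic used.

```python
import numpy as np

FS = 256                 # EEG sampling rate in Hz, as reported in the paper
TARGETS = (12, 15, 20)   # SSVEP flicker frequencies in Hz

def classify_ssvep(eeg, fs=FS, seg_len=1.0, step=0.5, n_avg=8):
    """Return the target frequency with the highest 4 s average power.

    Mirrors the paper's rule: Hamming-windowed FFT on 1 s segments
    every 0.5 s, squared magnitudes as power, average of the last
    8 estimates, then a max over the three target-frequency bins.
    """
    seg, hop = int(seg_len * fs), int(step * fs)
    assert len(eeg) >= seg, "need at least one full 1 s segment"
    freqs = np.fft.rfftfreq(seg, d=1.0 / fs)  # 1 Hz resolution for 1 s segments
    window = np.hamming(seg)
    powers = [np.abs(np.fft.rfft(eeg[s:s + seg] * window)) ** 2
              for s in range(0, len(eeg) - seg + 1, hop)]
    avg_power = np.mean(powers[-n_avg:], axis=0)   # the 4-second average
    scores = {f: avg_power[np.argmin(np.abs(freqs - f))] for f in TARGETS}
    return max(scores, key=scores.get)

def bci_engaged(eeg, threshold):
    """Hypothetical self-initiation check: channel power changes markedly
    when the user fixates the flashing menu, so a per-user calibrated
    threshold on mean squared amplitude can gate BCI control."""
    return np.mean(np.square(np.asarray(eeg, dtype=float))) > threshold
```

For example, `classify_ssvep(buffer)` applied to the most recent few seconds of the Oz-Cz channel would return 12, 15, or 20, which the menu system then maps to the currently displayed options.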
Hierarchical Adaptive Menu

The hierarchical menu (Figure 2) is the interface the subject uses to interact with the hierarchical learning system. It displays the available commands for the hierarchical learning system, which are selected using SSVEP. The top-level menu presents two options: 'Train' and 'Test'.

Selecting 'Train' allows the user to either teach the system a new task ('new' option) or update an existing one ('existing' option). If 'new' is selected, the next menu presented is the robot navigation menu. If 'existing' is selected, the user must choose a task to update before the navigation menu is displayed (see Figure 2). In navigating the robot, the user
has three choices: left, right, and a stop option indicating the
user is done with the task. To continue moving forward in
the current direction, the user need not make a choice. When
‘stop’ is selected, a menu offers the user the option of saving the task for inclusion in the training dataset for learning
the corresponding high-level command (see below). In order
to mitigate the effects of erroneous classifications, the system
includes various confirmation menus, giving users the ability
to verify or correct their last choice.
Selecting ‘Test’ allows the user to select a task that was
previously learned by the system. After the user has demonstrated and saved examples of a task, the BCI learns the task
(using a learning algorithm such as a Gaussian process (GP);
see next section) and the system incorporates this task into
the hierarchical menu as a new option in the ‘Test’ menu. The
user can now simply select the task as a high-level command,
and be at ease (or perform another BCI task) while the robot
autonomously performs the task.
To allow the BCI to make decisions based on the current uncertainty in the learned model, we included one additional menu. While the robot is performing a selected high-level command, if it enters a region where the output of the learned model has high uncertainty (e.g., high variance in the case of a GP), the BCI interrupts and displays a menu with two options: 'guide' and 'exit'. The user can select the 'guide' command to guide the robot to the desired destination (thereby generating more training data), or return to the top-level menu by selecting the 'exit' option.
We also conducted a preliminary study with two new multi-tasking options added to the 'Auto-Navigation' menu (see Figure 2). While the robot is executing a high-level command, the user can choose one of these two multi-tasking options, currently mapped to turning an overhead light on/off on the right and left sides, respectively, of the simulated home environment. Note that these multi-tasking options could be mapped to any of a set of actions that could help the user achieve a desired goal faster. Such multi-tasking effectively expands the bandwidth of control through the use of a hierarchy.
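Since the menu logic above is effectively a finite-state machine driven by SSVEP selections, it can be summarized compactly in code. The sketch below loosely follows Figure 2; the state names, and the modeling of robot-initiated events (task success/failure, low-confidence interruptions) as injected selections, are our own assumptions rather than the authors' implementation.

```python
# Hypothetical encoding of the Figure 2 menu flow as a transition table:
# state -> {selection -> next state}. '<skill>' stands for any learned
# skill entry shown in the menu; 'low_confidence' is a robot-initiated
# event rather than an SSVEP choice.
MENU = {
    "top":             {"train": "train_menu", "test": "test_menu"},
    "train_menu":      {"new": "navigation", "existing": "skill_menu"},
    "skill_menu":      {"<skill>": "navigation", "cancel": "top"},
    "navigation":      {"left": "navigation", "right": "navigation",
                        "stop": "save_menu"},
    "save_menu":       {"save": "top", "do_not_save": "top"},
    "test_menu":       {"<skill>": "auto_nav", "exit": "top"},
    "auto_nav":        {"light_left": "auto_nav", "light_right": "auto_nav",
                        "force_exit": "top",
                        "low_confidence": "suggestion_menu"},
    "suggestion_menu": {"guide": "navigation", "exit": "top"},
}

def next_state(state, selection):
    """Advance the menu given an SSVEP classification (or robot event)."""
    return MENU[state].get(selection, state)  # ignore invalid selections
```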
The hierarchical BCI differs from traditional BCI systems
in its ability to learn new behaviors from user demonstrations.
Learning occurs in the robotic component of the BCI system, and is then abstracted into the hierarchical menu system.
In the current implementation, we utilized a simple position-based approach to learning to navigate in the home environment; other robotic devices such as prosthetics will require
more sophisticated methods for learning new skills.
In our experiments, we tested two learning algorithms: radial basis function (RBF) networks [The MathWorks, Inc., 2010] and Gaussian processes (GPs) [Rasmussen, 2004]. Using a simulated on-board GPS sensor, the robot's position was logged at a sampling rate of 0.5 Hz as the user guided the robot to a desired location. When the user subsequently commanded the robot to learn the demonstrated navigation skill, this training data was used to learn a mapping from position data to global navigation direction. The logged data and learned model are stored locally, and the user can update a selected skill with more demonstrations as needed, improving performance over time. This arrangement also allows training over multiple days. We used off-the-shelf software packages for learning the RBF and GP models. RBF models were learned using the "newgrnn" function in Matlab's Neural Networks Toolbox. For GPs, the GPML Matlab code package [The Gaussian Processes Web Site, 2011] was used with an isotropic squared exponential covariance function and a Gaussian likelihood function, with the standard deviation of the noise hyperparameter set to 0.1.
During execution of a high-level command, the robot queries the BCI for a navigation direction based on its current position. If the GP model is used, one obtains both a predicted mean value and the variance of the prediction. This variance can be related to the "confidence" of the BCI in the learned model: high variance implies low confidence in the predicted navigational command, and vice versa. In the current implementation, we used a simple threshold on this confidence metric to decide when the robot should ask the user for guidance; a sketch follows.
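To make the GP-based confidence mechanism concrete, here is a minimal, self-contained Python/NumPy sketch under stated assumptions. The paper used the GPML Matlab package, so the class and function names, the heading representation, the length scale (2.0), and the variance threshold (0.5) are our own illustrative choices; only the squared-exponential covariance and the 0.1 noise standard deviation come from the paper.

```python
import numpy as np

def sq_exp_kernel(A, B, length_scale=2.0, signal_var=1.0):
    """Isotropic squared-exponential covariance between point sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return signal_var * np.exp(-0.5 * d2 / length_scale ** 2)

class GPNavigator:
    """GP from robot position (x, y) to desired global heading (radians)."""

    def __init__(self, noise_std=0.1, var_threshold=0.5):
        self.noise_std = noise_std          # matches the reported GPML setting
        self.var_threshold = var_threshold  # hypothetical confidence cutoff
        self.X = self.y = self.L = self.alpha = None

    def fit(self, positions, headings):
        """Standard GP regression fit via Cholesky factorization."""
        self.X = np.asarray(positions, float)
        self.y = np.asarray(headings, float)
        K = sq_exp_kernel(self.X, self.X)
        K[np.diag_indices_from(K)] += self.noise_std ** 2
        self.L = np.linalg.cholesky(K)
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, self.y))

    def predict(self, position):
        """Return (predicted heading, predictive variance) at one position."""
        p = np.asarray([position], float)
        k_star = sq_exp_kernel(self.X, p)[:, 0]
        mean = k_star @ self.alpha
        v = np.linalg.solve(self.L, k_star)
        var = sq_exp_kernel(p, p)[0, 0] - v @ v
        return mean, var

    def navigation_command(self, position):
        """High variance = low confidence: interrupt and query the user."""
        mean, var = self.predict(position)
        if var > self.var_threshold:
            return "query_user"   # switch to low-level user control
        return mean               # autonomous heading command
```

A real system might regress the sine and cosine of the heading rather than the raw angle to avoid wrap-around at ±π, and would calibrate the variance threshold from held-out demonstrations.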
Robot Application

Bell et al. demonstrated a BCI for high-level control of a Fujitsu HOAP-2 humanoid robot [Bell et al., 2008], but their work involved a fixed set of high-level commands. We used the Webots simulator [Cyberbotics Ltd., 2010] to simulate the HOAP-2 robot in a simple home environment. Note that rather than merely animating the robot, the Webots software simulates the physical dynamics of the Fujitsu HOAP-2 robot as well as its environment; this facilitates the transition of the results to the real world.
The simulated robot was pre-programmed with a basic set
of routines to walk forward, turn right, turn left, and make
smooth transitions from one motion to another motion. The
robot was also programmed with a simple collision avoidance
behavior to keep the robot from walking into a wall or other
obstacles during navigation. Given these basic navigational
routines, we developed a controller for robot navigation, with the user having a bird's-eye view of the robot. The robot was always in motion unless stopped by the user or the collision avoidance behavior.
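The controller's priority scheme (collision avoidance over user commands over the default forward walk) can be stated in a few lines. This is a schematic sketch with hypothetical method names; the actual Webots controller was not released.

```python
def control_step(robot, user_command, obstacle_ahead):
    """One control tick: collision avoidance overrides everything,
    then user commands, otherwise keep walking forward."""
    if obstacle_ahead:                # simple collision avoidance behavior
        robot.stop()
    elif user_command == "stop":
        robot.stop()
    elif user_command in ("left", "right"):
        robot.turn(user_command)      # smooth transition between motions
    else:
        robot.walk_forward()          # the robot is always in motion by default
```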
3 Experiments and Results

3.1 Study I: Testing the Hierarchical Architecture
Four healthy, able-bodied individuals (all male, ages ranging from 20 to 30) participated in the first set of experiments, which was designed to test the hierarchical BCI architecture. None of the subjects had any prior experience with the hierarchical BCI system; one subject had participated in SSVEP-based BCI experiments in the past. All subjects read and signed a consent form approved by the UW Human Subjects Division.
The first set of experiments used RBF networks for learning. First, preliminary trials consisting of only the SSVEP portion were run, lasting about 10 minutes; this was done to familiarize subjects with the flashing stimuli and to allow us to perform an initial analysis characterizing each subject's SSVEP
response. Subjects then used the entire system to navigate
the robot from its initial position (lower-left corner) to an
assigned goal position (lower-right corner) using low-level
commands (left/right/stop). In the test phase, they were asked
to reproduce the same task but using the high-level command
learned by the hierarchical learning system. This took about
20-30 minutes on average for the subjects. We additionally
conducted a more extensive experimental session with the
best subject from our first set of experiments, where he was
asked to perform the navigation task three times using low-level control and three times using the high-level command.
To compare the performance of the hierarchical BCI to the low-level-only BCI, we employed three metrics (Table 1): cognitive load, measured by the number of commands the user had to issue to achieve a given task ('Num Selections'); the time taken to complete the task ('Task Time'); and the time spent only on controlling the robot ('Nav Time').

Table 1: Performance Comparison

                    Low-level BCI    Hierarchical BCI
Mean among four subjects (std)
  Num Selections    20 (7)           5 (2)
  Task Time (s)     220 (67)         112 (25)
  Nav Time (s)      124 (37)         73 (19)
Mean of three trials from best subject (std)
  Num Selections    15 (5)           4 (1)
  Task Time (s)     141 (42)         85 (4)
  Nav Time (s)      99 (30)          74 (9)
Minimum (assuming 100% SSVEP accuracy)
  Num Selections    8                4
  Task Time (s)     91               75
  Nav Time (s)      59               59
Figure 3: Example Robot Trajectories from User-Demonstrated Low-Level Control and Hierarchical Control. The dashed trajectories represent low-level navigational control by the user; these trajectories were used to train an RBF neural network. The solid trajectories represent autonomous navigation by the robot using the learned RBF network after selection of the corresponding high-level command by the user. The small arrows indicate the vector field learned by the RBF network ('Learned Policy') based on the user's demonstrated trajectories.
3.2 Study I: Results
All four subjects were able to use the hierarchical BCI to complete the assigned tasks. The average SSVEP-based 3-class accuracy for the four subjects from the preliminary set of trials was 77.5% (standard deviation 13.8). Although somewhat lower than other SSVEP accuracies reported in the literature, we found that subjects exhibited higher SSVEP accuracy when using the entire system with closed-loop feedback. Results obtained for the three different performance metrics are shown in Table 1. In the table, we also include for comparison purposes the minimum values for these metrics, assuming a user with 100% SSVEP accuracy.
The results indicate that for all three metrics, subjects demonstrated improved performance using the hierarchical BCI: both the mean and the variance of all three performance metrics are lower when using the hierarchical BCI compared to the low-level BCI. Results from the best-performing subject provide interesting insights regarding the use of high-level commands in a hierarchical BCI. Due to this subject's high SSVEP accuracy, the difference in mean values between the low-level and hierarchical modes of control was smaller, but the variance for low-level control was significantly greater than for higher-level control (Table 1). This is corroborated by the navigational traces in Figure 3, where we see that trajectories from the hierarchical BCI tend to follow the minimal path to the goal location based on the learned representation in the neural network. This result confirms the expectation that the network learns an interpolated trajectory that minimizes the variance inherent in the training trajectories, with more training data leading to better performance.
3.3 Study II: Uncertainty-Guided Actions and Multi-Tasking
An important observation from Study I was that the learned high-level commands were not reliable in parts of the task space where there was insufficient training data. Ideally, we would like the system to identify whether it can safely execute the desired high-level command, preventing potentially catastrophic accidents. We investigated such an approach in Study II by utilizing Gaussian processes (GPs) for learning instead of RBF networks.

The experiments were conducted with the subject who performed best in Study I. The navigation task was similar but used a room that was 2.25 times larger and contained various obstacles. The enlarged size and the presence of non-wall-shaped obstacles increased the difficulty of robot navigation by requiring longer and more focused control. The environment had two overhead lights, on the right and left sides of the room, that could be controlled in the multi-tasking task. Additionally, Study II also varied the starting position of the robot, making
the learning problem more challenging.

Figure 4: Navigation traces comparing RBF and GP models for learning. The white region in the GP plot represents the high-confidence region where autonomous navigation is allowed; user input is solicited whenever the robot enters a high-uncertainty (dark) region where there was insufficient training data.
There were four days of experiments: two days of RBF runs in the new environment, and two days of GP runs in the new environment. On the first day for each type, the user was instructed to alternate runs of training and testing. In Figure 4, starting points S2, S4, and S6 represent test starting locations, and S1, S3, and S5 represent starting points of the robot in training mode. The second day only involved test trials from each of the six starting locations based on the first day's learned model. Additionally, for the GP runs, to test the ability to multi-task, the user was instructed to turn on the lights on the side of the environment where the goal of the high-level command was located once the robot started autonomously navigating.
We measured two performance metrics (Figure 5): the time spent controlling the robot using low-level control versus high-level commands ('Navigation time'), and the number of selections the user had to make to achieve a given task ('Number of selections'). To compare the performance of GP to RBF learning, we measured the success rate of the high-level commands, defined as the number of times a high-level command was successfully executed (i.e., the robot reached the destination) divided by the number of times a high-level command was selected. Note that lack of success implies that the robot experienced a fall or another mode of failure.
Figure 5: Performance of the GP-based Hierarchical BCI over 2 Days. Note the larger amount of time spent in the hierarchical mode and the greater number of high-level commands issued on Day 2, indicating increased autonomy. This increased autonomy allows a greater degree of multi-tasking, as seen in the plot.
3.4 Study II: Results
The user successfully performed the entire experiment as instructed, completing a total of 24 runs over four days. As shown in Figure 5, the GP-based hierarchical BCI resorted to frequent user guidance on Day 1 (a large amount of time and a large number of selections in low-level mode). On Day 2, however, the user was able to invoke a learned high-level command, resulting in a larger number of selections and a larger amount of time for
high-level commands. This allowed the user to multi-task, selecting the appropriate light to turn on while the robot was autonomously navigating ('Multitasking' in Figure 5). Figure 6 compares the success rate of high-level commands for the GP-based versus the RBF-based hierarchical BCI. As expected, the GP-based BCI exhibits a higher success rate for performing high-level commands due to its ability to switch to user control in low-confidence areas.

Figure 6: Success Rate of High-Level Commands. As expected, the GP-based hierarchical BCI has a much higher success rate due to its ability to recognize highly uncertain regions and obtain training data for these regions on an as-needed basis.
4 Summary and Conclusion
BCIs for robotic control have in the past faced a trade-off between cognitive load and flexibility. More robotic autonomy [Bell et al., 2008] implied coarse-grained control and less flexibility, while fine-grained control provided greater flexibility but imposed a higher cognitive load. This paper proposes a new hierarchical architecture for BCIs that overcomes this trade-off by combining the advantages of the two approaches.
Our results from two studies using EEG-based hierarchical
BCIs demonstrate that (1) users can use the hierarchical BCI
to train a robot in a simulated environment, allowing learned
skills to be translated to high-level commands, (2) the problem of day-to-day variability in BCI performance can be alleviated by storing user-taught skills in a learned model for
long-term use, allowing the learned skill to be selected as a
high-level command and executed consistently from day to
day, (3) a probabilistic model for learning (e.g., GPs) can be
used to mediate the switch between high-level autonomous
control and low-level user control, safeguarding against potentially catastrophic accidents, and (4) the hierarchical architecture allows the user to simultaneously control multiple
devices, opening the door to multi-tasking BCIs. Our ongoing efforts are focused on testing the approach with a larger
number of subjects and investigating its applicability to other
challenging problems such as controlling a robotic arm with
grasping capabilities.
Acknowledgments

This work was supported by the National Science Foundation (0622252 & 0930908), the Office of Naval Research (ONR), and the ICT Collaborative Project BrainAble (247447). We thank Rawichote Chalodhorn for helping with the HOAP-2 robot and Webots programming aspects of the project, and Josef Faller for helping with the implementation of the SSVEP stimuli.
References

[Bell et al., 2008] C. J. Bell, P. Shenoy, R. Chalodhorn, and R. P. N. Rao. Control of a humanoid robot by a noninvasive brain-computer interface in humans. Journal of Neural Engineering, 5:214, 2008.

[Cyberbotics Ltd., 2010] Webots. http://www.cyberbotics.com/, 2010. [Online; accessed 12-13-2010].

[Faller et al., 2010] J. Faller, G. Müller-Putz, D. Schmalstieg, and G. Pfurtscheller. An application framework for controlling an avatar in a desktop-based virtual environment via a software SSVEP brain-computer interface. Presence: Teleoperators and Virtual Environments, 19(1):25-34, 2010.

[Galán et al., 2008] F. Galán, M. Nuttin, E. Lew, P. Ferrez, G. Vanacker, J. Philips, and J. del R. Millán. A brain-actuated wheelchair: Asynchronous and non-invasive brain-computer interfaces for continuous control of robots. Clinical Neurophysiology, 119(9):2159-2169, 2008.

[Müller-Putz and Pfurtscheller, 2007] G. R. Müller-Putz and G. Pfurtscheller. Control of an electrical prosthesis with an SSVEP-based BCI. IEEE Transactions on Biomedical Engineering, 55(1):361-364, 2007.

[Müller-Putz et al., 2005] G. R. Müller-Putz, R. Scherer, G. Pfurtscheller, and R. Rupp. EEG-based neuroprosthesis control: a step towards clinical practice. Neuroscience Letters, 382:169-174, 2005.

[Rao and Scherer, 2010] R. P. N. Rao and R. Scherer. Brain-computer interfacing. IEEE Signal Processing Magazine, 27(4):150-152, July 2010.

[Rasmussen, 2004] C. E. Rasmussen. Gaussian processes in machine learning. Advanced Lectures on Machine Learning, pages 63-71, 2004.

[Scherer et al., 2008] R. Scherer, F. Lee, A. Schlögl, R. Leeb, H. Bischof, and G. Pfurtscheller. Toward self-paced brain-computer communication: Navigation through virtual worlds. IEEE Transactions on Biomedical Engineering, 55(2):675-682, 2008.

[The Gaussian Processes Web Site, 2011] GPML Matlab code version 3.1. http://www.gaussianprocess.org/gpml/code/matlab/doc/, 2011. [Online; accessed 01-24-2011].

[The MathWorks, Inc., 2010] Matlab newgrnn. http://www.mathworks.com/help/toolbox/nnet/newgrnn.html, 2010. [Online; accessed 12-13-2010].

[Velliste et al., 2008] M. Velliste, S. Perel, M. C. Spalding, A. S. Whitford, and A. B. Schwartz. Cortical control of a prosthetic arm for self-feeding. Nature, 453(7198):1098-1101, 2008.