Interaction Without Gesture or Speech – A Gaze Controlled AR System

Susanna Nilsson
Linköping University, SE-581 83 Linköping, Sweden
Even though gaze control is relatively well known as a
human-computer interaction method, it is not
a widely used technique, partly due to several usability
issues. However, there are applications that can be
improved with the use of eye gaze control. Combining
concepts from the Augmented Reality (AR) research
domain with concepts from gaze recognition and input
makes it possible to create quick and easy interaction. In
this poster, one such application is described: an AR
system for instructional use, with a gaze attentive user
interface. Gaze tracking in this system is implemented by placing a
near-infrared (NIR) illumination source next to the gaze
recognition camera's optical axis. The camera detects the
pupil and its reflections by filtering and thresholding the
image information. The position of the pupil and the
positions of the reflections on the cornea caused by the
NIR illumination are then calculated.
Key words: Augmented Reality, Mixed Reality, Gaze
Control, Gaze Interaction, Gaze Awareness
1. Introduction
In Augmented Reality (AR) systems, real and virtual
objects are merged and aligned relative to a real
environment, and presented in the field of view of a user.
The user can either be a passive recipient of information
or an active part of an interaction process. AR
applications that give hierarchical instructions to users
often require some feedback from the user in order to
move to the next step in the instructions. This feedback
should be possible to give quickly and without
interruption from the ongoing task. AR systems that use
gestures and speech for interaction have been developed
[1, 2, 3]. However, there are examples of situations where
speech and gesture may not be appropriate. This poster
presents an AR system with an integrated gaze tracker,
allowing quick feedback from the user to the system, as
well as analysis of the user's gaze behaviour.
2. Description of the system
The AR system described here uses a hybrid marker
tracking technology, based on ARToolKit, ARToolKit
Plus and ARTag, but with the addition of a 3DOF
inertial tracker (InterSense and Xsens) [4]. Software has
been developed with the aim to permit an application
developer to define applications and scenario files in
XML syntax. To allow gaze recognition, two black-and-white
cameras have been integrated into the helmet-mounted
display (see figure 1). The displays have a resolution of
800 x 600 pixels and a field of view of 37 x 28 degrees.
The gaze tracking is based on the dark pupil principle.
Figure 1. To the left, a head-mounted gaze controlled
AR system; top right, a gaze pattern of a user working
with the gaze interaction dialog seen in figure 5. The
bottom right image shows the gaze tracker's view of
the user’s eye and the NIR reflection in the pupil [6].
Four NIR light sources are used as can be seen in the
bottom right image in figure 6. However, only two
reflections are needed for the calculation of the gaze.
The system can choose between these four different
reflections, which increases the robustness of the system.
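The detection steps described above (thresholding for the dark pupil, locating the bright NIR reflections, and using only two of the four reflections) can be illustrated with a minimal sketch on a toy grayscale image. The threshold values, the function names and the rule for choosing two reflections (the two closest to the pupil centre) are assumptions made for illustration; the poster does not state how the system selects among the reflections, and the mapping from the resulting vector to a point on the display would require a calibration step that is not shown.

```python
import math


def centroid(points):
    """Mean (x, y) position of a list of pixel coordinates."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def detect_pupil(image, dark_threshold):
    """Dark pupil principle: the pupil is the region darker than the threshold."""
    dark = [(x, y) for y, row in enumerate(image)
            for x, value in enumerate(row) if value < dark_threshold]
    return centroid(dark) if dark else None


def detect_glints(image, bright_threshold):
    """Corneal reflections show up as very bright pixels.

    Simplification: each bright pixel is treated as one reflection.
    """
    return [(x, y) for y, row in enumerate(image)
            for x, value in enumerate(row) if value > bright_threshold]


def pick_two_glints(pupil, glints):
    """Assumed selection rule: keep the two reflections nearest the pupil."""
    if len(glints) < 2:
        return None
    return sorted(glints, key=lambda g: math.dist(g, pupil))[:2]


def pupil_glint_vector(pupil, pair):
    """Offset of the pupil centre from the midpoint of the two reflections."""
    mid = centroid(pair)
    return (pupil[0] - mid[0], pupil[1] - mid[1])


def make_toy_eye_image():
    """7 x 7 toy image: grey background, dark 3 x 3 pupil, four bright glints."""
    image = [[120] * 7 for _ in range(7)]
    for y in range(2, 5):
        for x in range(2, 5):
            image[y][x] = 10
    for x, y in [(2, 1), (4, 1), (0, 6), (6, 6)]:
        image[y][x] = 255
    return image


image = make_toy_eye_image()
pupil = detect_pupil(image, dark_threshold=50)
pair = pick_two_glints(pupil, detect_glints(image, bright_threshold=200))
vector = pupil_glint_vector(pupil, pair)
```

Because any two of the detected reflections are sufficient in this sketch, the vector can still be computed when one or two reflections are hidden, for example by the eyelid, which is the kind of robustness the redundancy is meant to give.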
Interaction dialog
In the developed system, eye gaze interaction can be
restricted both temporally and spatially: only certain parts of
the display respond to gaze, and only when there is
a need for gaze interaction. The application designer can
define the layout of the gaze control dialog areas, as well
as gaze action specifications and dwell times, directly in
the XML file without any changes made to the general
AR system.
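As an illustration of what such a definition might look like, the sketch below parses a small XML fragment describing two gaze dialog areas. The element and attribute names (gazeDialog, area, dwellMs, action and so on) are hypothetical; the actual schema of the scenario files is not given in this poster.

```python
import xml.etree.ElementTree as ET

# Hypothetical scenario fragment; coordinates assume the 800 x 600 display.
SCENARIO_XML = """
<scenario>
  <gazeDialog placement="fixed" visibility="transparent">
    <area id="no" x="250" y="500" width="120" height="80"
          dwellMs="800" action="reject"/>
    <area id="yes" x="430" y="500" width="120" height="80"
          dwellMs="800" action="confirm"/>
  </gazeDialog>
</scenario>
"""


def load_gaze_areas(xml_text):
    """Return one dictionary per gaze area defined in the scenario text."""
    root = ET.fromstring(xml_text)
    areas = []
    for dialog in root.iter("gazeDialog"):
        for area in dialog.iter("area"):
            areas.append({
                "id": area.get("id"),
                "placement": dialog.get("placement"),
                "visibility": dialog.get("visibility"),
                "rect": tuple(int(area.get(k))
                              for k in ("x", "y", "width", "height")),
                "dwell_ms": int(area.get("dwellMs")),
                "action": area.get("action"),
            })
    return areas


areas = load_gaze_areas(SCENARIO_XML)
```

Keeping layout, actions and dwell times in the scenario file means that a new dialog design only requires editing the XML, not the AR system itself.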
The gaze dialog area positions can either be fixed in
certain regions of the display, or dynamic relative to the
detected marker position which allows flexible design of
the application. The interaction area can be
invisible, transparent or opaque. Placing the interaction
dialogue in the lower part of the display is more gaze
friendly than using the upper part of the display, but the
problem of accidentally activating the gaze interaction
dialogue (the Midas touch problem) is more prominent
than with interaction areas in the top of the display (see
figure 2). When the user looks at one of the areas, this is
indicated by a change in appearance (color and image),
so that the user receives feedback, acknowledging that
the system ‘knows’ that s/he looks at the area.
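The difference between fixed and dynamic placement, and the test for whether the gaze point lies inside an area, can be sketched as follows. The Rect type, the place_dialog function, the offset parameter and the clamping to the display edges are assumptions made for illustration and are not taken from the system described here.

```python
from dataclasses import dataclass

DISPLAY_SIZE = (800, 600)  # display resolution in pixels


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float

    def contains(self, point):
        """True if the (x, y) gaze point lies inside this rectangle."""
        px, py = point
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def place_dialog(placement, width, height,
                 fixed_pos=(0, 0), marker_pos=None, offset=(0, 0)):
    """Return the dialog rectangle in display coordinates, or None.

    "fixed": the area stays at fixed_pos.
    "dynamic": the area follows the detected marker position plus an
    offset; if no marker is detected, no area is shown.
    """
    if placement == "fixed":
        x, y = fixed_pos
    elif placement == "dynamic":
        if marker_pos is None:
            return None
        x = marker_pos[0] + offset[0]
        y = marker_pos[1] + offset[1]
    else:
        raise ValueError(f"unknown placement: {placement!r}")
    # Assumed behaviour: keep the whole area inside the display.
    x = min(max(x, 0), DISPLAY_SIZE[0] - width)
    y = min(max(y, 0), DISPLAY_SIZE[1] - height)
    return Rect(x, y, width, height)


yes_area = place_dialog("dynamic", 120, 80,
                        marker_pos=(400, 300), offset=(40, -40))
gaze_is_on_yes = yes_area.contains((450, 270))
```

When gaze_is_on_yes becomes true, the application would switch the area to its highlighted appearance so that the user receives the feedback described above.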
Figure 2a. To the left, a static gaze interaction dialogue in the
lower area of the user's field of view. The dialogue is only visible
when the user looks in the nearby region of the no/yes boxes.
2b. To the right, the “yes” button changes appearance to inform
the user that it has been activated. If the user does not move the
gaze from the area within a set amount of time, the AR system
will interpret the response as a “yes” response.
In the application described, the point of gaze fixation is
not visible when it is positioned outside of the active
interaction areas. It is possible to show the gaze fixation
point at all times if desirable. Interaction feedback is
given to the user in terms of changed image and color of
the “button” (see figure 2b).
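The dwell-based activation described in this section can be sketched as a small state machine: an area is highlighted as soon as the gaze enters it, and it is activated once the gaze has stayed there for the configured dwell time. The class name, its interface and the 0.8 second dwell time are assumptions made for illustration.

```python
class DwellSelector:
    """Tracks which area is looked at and when a dwell completes."""

    def __init__(self, dwell_s):
        self.dwell_s = dwell_s      # required dwell time in seconds
        self.current = None         # id of the area under the gaze, or None
        self.entered_at = None      # time at which the gaze entered it
        self.fired = False          # True once the current dwell has activated

    def update(self, area_id, t):
        """Feed one gaze sample.

        area_id: id of the area under the gaze point, or None.
        t: timestamp of the sample in seconds.
        Returns (highlighted, activated): the area to highlight as feedback
        and the area whose action should be triggered now (or None).
        """
        if area_id != self.current:
            # Gaze moved to another area (or away): restart the timer.
            self.current = area_id
            self.entered_at = t
            self.fired = False
        activated = None
        if (self.current is not None and not self.fired
                and t - self.entered_at >= self.dwell_s):
            self.fired = True       # activate only once per visit
            activated = self.current
        return self.current, activated


selector = DwellSelector(dwell_s=0.8)
samples = [("yes", 0.0), ("yes", 0.5), (None, 0.6), ("yes", 0.7), ("yes", 1.6)]
results = [selector.update(area, t) for area, t in samples]
```

In this run the first visit to the "yes" area is interrupted before the dwell time has passed, so only the second, uninterrupted visit produces an activation. Restarting the timer whenever the gaze leaves the area is one simple way to reduce accidental activation.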
3. Preliminary pilot user study of the system
Three different gaze dialogues have been informally
tested in a laboratory setting. Only three participants
took part and the main goal of the pilot study was to test
the system robustness and functionality. Two AR
applications previously used in other user studies were
implemented and adapted to gaze control. First the users
tested the static, upper interaction dialogue design. A
few days later the users tested the dynamic design
alternative, and on the third occasion they tested the
design with static dialogue in the lower region of the
display (figure 2).
Results and discussion of the preliminary tests
During the first trial it was found that the users tended to
turn and tilt their heads so that the focus of attention
was always in the centre of the field of view. Looking at
things in the upper section of the display was therefore a
conscious effort and not part of casual gazing. It was
reported to be strenuous to interact with this gaze angle,
and the users preferred the lower static interaction
dialogue. There were however some problematic issues
with the lower interaction dialogue as well – when the
users tried to activate the dialogue they tended to tilt
their heads, thus often losing camera contact with the
marker, which in turn led to the loss of the visual
instructions. The static dialogues are hence not ideal
since they are not adapted to normal human behavior –
when humans want to focus their gaze on an object,
virtual or real, they tend to place it in the central field of
view.
The dynamic dialogue does not have the problem of the
users tilting and moving their heads. However, one
important and expected problem was found. Although
the interaction dialogue was only visible when the
system required an input from the user, it still sometimes
covered too much of the user's field of view. This caused
the participants to experience it as being cluttered. This
could be addressed with a redesign of the interaction
dialogue in future development of the system.
In general, the participants were positive towards the concept
of gaze control in these types of applications, but the
system was experienced as clumsy and not entirely
stable since they sometimes lost the virtual information
when the marker was not detected by the camera. These
are problems that can be addressed by further refining
the AR system. The clumsiness of the system is harder
to address in the technical solution presented here.
Video see-through AR and gaze control require
cameras (two at a minimum) and these are too heavy to
be comfortably placed on an HMD. For gaze controlled
video see-through AR a helmet-mounted solution is
currently the best option.
4. Conclusions
In this poster we have described an AR system with a
gaze attentive user interface. The preliminary pilot study
indicates that although there are limitations to the
proposed AR system, it is functional and can be used for
applications such as the ones described in this paper.
There is however much need for further improvements
and user studies.
The gaze-controlled AR system was built in close
cooperation with the Swedish Defence Research
Agency, and the project was funded by the Swedish
Defence Materiel Administration.
