
LOOKING AT YOU: COMPUTER GENERATED GRAPHICAL–HUMAN INTERACTION
Giselle Isner, isner@cis.fordham.edu
Computer and Information Science Department
Fordham University
NY, NY 10023
Advisor: Dr. Damian Lyons
INTRODUCTION
Currently computers do not interact with their users in the same manner people
interact with each other (speaking, gesturing, body language, etc.), but it has been
argued that this would facilitate better human-computer interaction [3]. We have
already heard of robots that behave like living dogs (Sony's AIBO) or that can help
clean a home (iRobot's Roomba). We already interact with our computers by
clicking on a mouse, or entering commands from the keyboard, but how much easier it
would be if we could simply dictate things to our computer, or even point to the
screen, and get a response as quickly as if we were speaking to a human. There is
already a limited technological basis for this kind of interface: Vivid’s Mandala
system, which can be found at museums and amusement parks, allows users and
computer generated graphics to interact with one another using a video camera for
input and green-screen technology.
Our objective was to construct a pair of graphical eyes and have them interact
with people by following them, with the tracking information generated by a
computer vision system [2]. Investigating the feasibility of this construction is a first
step in determining whether human-computer interaction is facilitated by this kind of
interface.
APPROACH
Using Inventor [1], an object-oriented 3D graphical toolkit, and C++, we were able
to create a pair of 3D graphical eyes displayed on a computer screen.
Initially, the eyes were given their own behaviors: autonomous vertical and
horizontal movements that created the illusion that the eyes were alive. This was
accomplished through a timer routine called every 1/10th of a second. The routine
caused the eyes to rotate to a certain value about either the x or y axis, then rotate
to the corresponding negative value.
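As a rough illustration of this timer-driven behavior, the sketch below is ours rather than the project's code: it assumes Open Inventor's SoTimerSensor driving an SoRotationXYZ node, and the EyeState structure, its field names, and the oscillation limit are invented for the example.

#include <Inventor/nodes/SoRotationXYZ.h>
#include <Inventor/sensors/SoTimerSensor.h>

// Hypothetical state for the autonomous "alive" behavior described above.
struct EyeState {
    SoRotationXYZ *rot;   // rotation node applied to the graphical eyes
    float limit;          // maximum rotation (radians) about the chosen axis
    float step;           // change applied on each tick
};

// Timer callback: sweep the angle back and forth, reversing direction so the
// eyes rotate to a value and then to the corresponding negative value.
static void oscillate(void *data, SoSensor *)
{
    EyeState *s = static_cast<EyeState *>(data);
    float a = s->rot->angle.getValue() + s->step;
    if (a > s->limit || a < -s->limit)
        s->step = -s->step;
    s->rot->angle = a;
}

// Fire the callback every 1/10th of a second, as described in the text.
static void startAutonomousMotion(EyeState *s)
{
    SoTimerSensor *timer = new SoTimerSensor(oscillate, s);
    timer->setInterval(SbTime(0.1));
    timer->schedule();
}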
Our next step was to make the pair of eyes converge and diverge by an angle
appropriate to the target's distance, in order to have the eyes "focus" on a moving
object at any distance and at any location within their field of view. This would give
a user the illusion
that the eyes were looking at him.
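The required vergence follows from simple geometry: for an eye separation b and a target at horizontal offset x and depth z from the screen, each eye's yaw is the arctangent of its own eye-to-target horizontal offset over z. A minimal sketch of that calculation (our own; the function and symbol names are not from the paper):

#include <cmath>
#include <utility>

// Given the target's horizontal offset x and distance z from the screen, and
// the separation b between the two graphical eyes, return the yaw angle (in
// radians) each eye needs so that both appear to "focus" on the target.
std::pair<double, double> vergenceAngles(double x, double z, double b)
{
    double left  = std::atan2(x + b / 2.0, z);   // left eye (at -b/2) turns toward target
    double right = std::atan2(x - b / 2.0, z);   // right eye (at +b/2) turns toward target
    return std::make_pair(left, right);
}

For a nearby target (small z) the two angles differ strongly and the eyes converge; as z grows the angles approach one another and the eyes relax toward parallel.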
To complete the illusion, the graphical eyes needed to be provided, in real time,
with information about the location of the person closest to the computer display [2].
A computer vision system connected to an overhead camera extracted the largest
region of motion in the camera view and sent the image coordinates (u, v) of that
region's centroid to the graphical eyes module. In this camera configuration the
distance of the user from the display is proportional to the vertical (v) image
coordinate. This relies on the assumption that the floor is a plane, the user remains on
the floor, and the y axis of the image is parallel to the floor.
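A minimal sketch of such a vision front end, under the assumption of simple frame differencing between two greyscale frames; unlike the system in [2], which segments the largest motion region, this simplified version pools all moving pixels when computing the centroid:

#include <cstdint>
#include <cstdlib>

// Difference two greyscale frames, threshold, and return the centroid (u, v)
// of the moving pixels.  Returns false when no motion is detected.
bool motionCentroid(const uint8_t *prev, const uint8_t *curr,
                    int width, int height, int threshold,
                    double &u, double &v)
{
    long count = 0, sumU = 0, sumV = 0;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int i = y * width + x;
            if (std::abs(curr[i] - prev[i]) > threshold) {
                sumU += x;
                sumV += y;
                ++count;
            }
        }
    }
    if (count == 0)
        return false;
    u = double(sumU) / count;   // horizontal image coordinate
    v = double(sumV) / count;   // vertical image coordinate; proportional to the
                                // user's distance under the planar-floor assumption
    return true;
}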
To provide a common frame of reference for the graphical and vision systems, we
created a C++ class, Linearmap, to linearly relate the camera's field of view to that
of the eyes. Linearmap mapped (u, v) image coordinates to (x, y) Inventor
coordinates. The values of the slope and offset parameters in the linear
mapping routines were chosen empirically to calibrate the motion of the eyes with
actual target movements.
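A hedged sketch of what Linearmap's interface might have looked like; the paper names the class and its slope and offset parameters, but the member and method names below are our own:

// Linearly relates camera image coordinates to Inventor scene coordinates.
class Linearmap {
public:
    Linearmap(float slopeX, float offsetX, float slopeY, float offsetY)
        : sx(slopeX), ox(offsetX), sy(slopeY), oy(offsetY) {}

    // Map an image coordinate (u, v) to an (x, y) coordinate in the Inventor
    // scene containing the eyes.
    void map(float u, float v, float &x, float &y) const
    {
        x = sx * u + ox;   // slope and offset calibrated empirically
        y = sy * v + oy;
    }

private:
    float sx, ox;   // horizontal slope and offset
    float sy, oy;   // vertical slope and offset
};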
As a final touch, we wanted the eyes to return to their independent behaviors.
When the same (x,y) values were generated by the program ten times in a row,
signaling a halt in the motion of the object, the eyes switched to their own independent
movements.
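A small sketch of this idle check (our own naming, following the ten-identical-readings rule described above):

// Counts consecutive identical (x, y) readings; after ten in a row the eyes
// switch back to their autonomous movements.
struct IdleDetector {
    float lastX, lastY;
    int   repeats;

    IdleDetector() : lastX(0.0f), lastY(0.0f), repeats(0) {}

    // Returns true when the target has been stationary for ten readings.
    bool update(float x, float y)
    {
        if (x == lastX && y == lastY) {
            ++repeats;
        } else {
            lastX = x;
            lastY = y;
            repeats = 1;
        }
        return repeats >= 10;
    }
};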
CONCLUSION
Inventor simplified the creation of the 3D graphical eyes. A single eye was created
in an Inventor text file and the file was read in twice (one for the left eye and one for
the right eye) into the main program, written in C++. However, acquiring sufficient
information about Inventor routines and variables was time-consuming. Furthermore,
certain routines in Inventor (timer routines) were not very compatible with aspects of
object-oriented design. These discrepancies were fixed fairly easily, if not elegantly,
with the addition of a few extra functions.
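As an illustration of the read-the-file-twice pattern, assuming standard Open Inventor file reading; the file name "eye.iv" is a placeholder, and SoDB::init() (or a toolkit init call) is assumed to have been made already:

#include <Inventor/SoDB.h>
#include <Inventor/SoInput.h>
#include <Inventor/nodes/SoSeparator.h>

// Read the scene graph for one eye from an Inventor text file.
SoSeparator *loadEye(const char *filename)
{
    SoInput in;
    if (!in.openFile(filename))
        return NULL;                        // file missing or unreadable
    SoSeparator *eye = SoDB::readAll(&in);  // entire scene graph in the file
    in.closeFile();
    return eye;
}

// Reading the same file twice gives each eye its own independent scene graph:
//   SoSeparator *leftEye  = loadEye("eye.iv");
//   SoSeparator *rightEye = loadEye("eye.iv");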
The eyes worked very well in their ability to follow motion. They immediately
followed whatever motion they sensed and remained stationary when the motion
stopped. However, there were some problems. When the user walked up to the screen
where the eyes were displayed and stood above it, the user appeared at the top of the
camera’s field of view, and therefore the eyes looked up at the user. However, the user
could bend down right in front of the screen, yet because his position in the camera’s
field of view remained the same, the eyes continued looking up. In addition, the
object’s motion was tracked through the center of the object; thus the eyes looked
primarily at the center of the user instead of at his face. To remedy this, we had to
choose between having the eyes accurately follow the user when he was up close to
the screen and having them follow him when he was farther from the screen. We ultimately
decided that it was best for the eyes to follow the user with more precision when the
user was closer to the screen.
One interesting observation was that although the eyes were able to follow motion
quite accurately, they did not appear to the user to be looking directly at him at all
times. From the perspective of onlookers watching the interaction, however, the eyes
did appear to be looking directly at the user. Perhaps this is why
the magic-mirror effect in augmented reality (the user stands in front of a green screen
and appears on a computer screen with graphical entities appearing around him) is
more appealing to users (e.g., Mandala or MIT's ALIVE). In this instance, the user, in
effect, becomes the onlooker, watching the graphics interact with his own image on the screen.
There were some features we developed which we chose not to employ. We gave
the eyes the ability to blink while retaining horizontal and vertical movement. We felt,
however, that the blinking would distract the audience from the eyes’ other behaviors.
FUTURE
Determining when someone is looking at us appears to be a basic and precise
human skill, and only when we make direct eye contact with another person do we feel a
connection. Thus, in order for a user to feel the same type of interaction with these
graphical eyes, it is important for the graphical eyes to look directly at the individual's eyes,
even if the user is at a distance from the screen. Our future work will include
experimenting with the eyes converging and diverging at varying distances, and
determining the correct angle needed for them to appear to look directly at the user. This
could be done using multiple cameras, so that the input for the eyes would not depend
on just one field of view but could switch to another when a camera detects
specific types of motion or changes in image coordinates, such as someone bending
down in front of the eyes.
REFERENCES:
[1] Wernecke, Josie. The Inventor Mentor. Open Inventor Architecture Group. Addison-Wesley, 1994.
[2] Lyons, D., Pelletier, D., and Knapp, D. Multi-modal Interactive Advertising. 1998 Perceptual User Interface Workshop, San Francisco, California, Nov. 1998.
[3] Rosson, M., and Carroll, J. Emerging Paradigms for User Interaction, in: Usability Engineering. Morgan Kaufmann, 2002.