The GripSee Project: A Gesture-controlled Robot for Learning
and Using Object Representations
Mark Becker, Efthimia Kefalea, Eric Mael, Christoph von der Malsburg, Mike Pagel, Jochen
Triesch, Jan C. Vorbrüggen, Rolf P. Würtz, and Stefan Zadel
Institut für Neuroinformatik, Ruhr-Universität Bochum, D-44780 Bochum, Germany
http://www.neuroinformatik.ruhr-uni-bochum.de/ini/
This talk introduces GripSee, our robot platform for gesture and object recognition, object
manipulation, and learning. Since modeling visual perception invariably means modeling human
visual perception, we have designed the robot's components, specifically the arrangement and
kinematics of the camera head and robot manipulator, to closely resemble the human eye, head,
and arm. This includes kinematic redundancy, which enables obstacle avoidance and leaves some
choice in grasping and manipulation, with the additional advantage that the arm can avoid
occluding its own workspace. The camera system imitates human eyes by supplying a stereo
system with a high-resolution, color-sensitive fovea camera and a wide-angle, low-resolution,
grey-value periphery camera for motion detection and saccade planning.
Our approach to robot behavior might be called semi-autonomy: the robot has at its disposal a
repertoire of skills that are carried out autonomously, but the actual control of overall behavior
is left to a human operator. Although classical techniques are readily available for this
control, we have chosen to implement a gesture interface suited for use by technically
untrained people, a decision dictated by the goal of providing a prototype for a service robot.
All individual robot components are controlled by autonomous agents that communicate
with each other and with further agents that implement specific skills, basic operations that can
be composed in various ways to accomplish specific tasks. Central control is maintained by a
separate agent that defines task timelines and contains a user interface. On the implementation
level, each agent is essentially one executable under Unix or QNX, respectively (see figure 1 for
an overview of all agents). A central skill is fixation, i.e., directing the axes of both eyes towards
a point of interest. It is calibrated by an autonomous procedure [3] and serves as the principal
transformation for the extraction of 3D information, because the three-dimensional position of
the fixated region can be estimated by simple triangulation from the pan, tilt, and vergence
angles of the camera head. For the sake of brevity, the skills are only listed as part of a standard
example behavior, but they can be recombined in a variety of ways. That behavior is as follows.
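The triangulation step can be sketched under an idealized symmetric-vergence model. The function name, the angle conventions, and the 30 cm baseline are assumptions for illustration; the real camera head is calibrated by an autonomous procedure [3] and uses neural networks to cope with its nonlinearities.

```python
import math

def fixation_point(pan, tilt, vergence, baseline=0.3):
    """Estimate the 3D position of the fixated point (head frame, metres)
    from the head's pan and tilt angles and the symmetric vergence angle
    (all in radians). baseline is the distance between the two cameras."""
    # Both cameras verge inward by vergence/2, so their optical axes
    # intersect at distance d along the cyclopean gaze direction.
    d = (baseline / 2.0) / math.tan(vergence / 2.0)
    # Rotate the gaze vector by tilt (about x) and pan (about y).
    x = d * math.cos(tilt) * math.sin(pan)
    y = d * math.sin(tilt)
    z = d * math.cos(tilt) * math.cos(pan)
    return (x, y, z)
```

With a 10-degree vergence and zero pan/tilt, this places the fixated point roughly 1.7 m straight ahead; larger vergence angles correspond to nearer objects.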
Initial situation: GripSee stands before a table with various known objects and waits for a
human hand to transmit instructions.
Hand tracking and fixation: The hand is tracked to follow the pointing movement. After
the hand movement stops, the hand is fixated.
Hand posture analysis is executed on the distortion-reduced small central region of interest,
and the type of posture is determined.
Object fixation: It is expected that the hand points at an object, thus the fixation point is
lowered by about 10 cm. Now a simple localization algorithm detects the object, which is then
fixated. After three or four iterations, the object is fixated well, and a new region of interest
is defined.
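The localize-and-fixate iteration described above can be outlined as a simple loop. The callbacks and the pixel tolerance are hypothetical; the actual localization algorithm and gaze-control interface are not specified in the text.

```python
def fixate_object(locate, move_gaze, max_iters=4, tol_px=2):
    """Iteratively centre an object in the image: 'locate' returns the
    object's pixel offset from the image centre, 'move_gaze' converts
    that offset into a corrective pan/tilt command (both hypothetical)."""
    for _ in range(max_iters):
        dx, dy = locate()
        if abs(dx) <= tol_px and abs(dy) <= tol_px:
            break          # object well fixated
        move_gaze(dx, dy)  # saccade toward the object
```

Each saccade reduces the residual offset, so three or four iterations usually suffice, matching the behavior described above.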
Object recognition determines the object's type, its exact positions and sizes in both images,
and its orientation.
Fixation: The recognition module yields a more precise object position than the preceding
fixation. That precision is crucial for reliable grasping, thus the object is fixated once more.
Neural networks are used for fixation and triangulation in order to cope with the nonlinearities
of the camera head.
Grip planning: A grip suitable for the object and the hand posture is selected from a database
and transformed into the object's position.
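The grip-planning step, mapping a stored grip into the recognized object's pose, can be sketched with homogeneous transforms. The 4x4 matrix representation and all names here are assumptions; the actual grip-database format is not described in the text.

```python
import numpy as np

def transform_grip(grip_in_object, object_pose):
    """Express a database grip, stored relative to the object's canonical
    frame, in the robot's world frame. Both arguments are 4x4 homogeneous
    transforms (hypothetical representation)."""
    return object_pose @ grip_in_object

# Example: object recognised 0.5 m in front of the robot, grip stored
# 10 cm above the object centre.
grip = np.eye(4)
grip[2, 3] = 0.10
obj = np.eye(4)
obj[0, 3] = 0.5
world_grip = transform_grip(grip, obj)
```

Because the grip is stored relative to the object, the same database entry serves any object position and orientation the recognition module reports.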
[Figure 1: diagram of the agent architecture. An acquisition server is connected to the
cameras/framegrabber and serves a hand tracking client, a gesture recognition client, and two
object recognition clients (left and right image). The NEUROS control server coordinates a head
server driving the camera head and a robot control server with a grip finder client driving the
robot arm.]
Fig. 1. Current structure of server and client processes, each of which is an implementation of an autonomous
agent. Another server for tactile information is to be included soon.
Trajectory planning and grip execution: A trajectory is planned, arm and gripper are
moved to the object, and finally the object is grasped. Another trajectory transfers it to
a default position conveniently located in front of the cameras for possible further inspection.
Hand tracking, hand fixation and position analysis: GripSee again watches the field of
view for the operator's hand to reappear, this time pointing with a single defined gesture to
the three-dimensional position of the place to put the object down.
Trajectory planning, execution, and release: A trajectory is planned, the arm moves the
object to the desired location, and the gripper releases it. In the absence of further cues, the
3D precision of that release is an immediate measure of the quality of hand localization.
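The central control agent's composition of skills into the example behavior above can be sketched as a simple sequence. The skill names are hypothetical placeholders; the text only states that skills are basic operations composed into tasks by a separate control agent.

```python
def run_pick_and_place(skills):
    """Execute the standard demo by chaining autonomously executed
    skills; 'skills' maps a skill name to a callable."""
    sequence = [
        "track_hand", "fixate_hand", "analyse_posture",
        "fixate_object", "recognise_object", "refixate",
        "plan_grip", "execute_grasp",
        "track_hand", "analyse_posture",   # operator indicates target
        "plan_trajectory", "release",
    ]
    for name in sequence:
        skills[name]()
```

Because the control agent only names skills, the same repertoire can be recombined into other task timelines without changing the skill agents themselves.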
Additional functionality, which is currently being worked on, includes the construction and
incorporation of tactile sensors and the semi-autonomous learning of new objects. More detailed
information can be found on our website and in the articles [1, 2].
This work is funded by grants from the DFG (Graduiertenkolleg KOGNET) and the German
Federal Minister for Education and Research (01 IN 504 E9). We thank Michael Pötzsch, Michael
Rinne, Bernd Fritzke, Herbert Janßen, Percy Dahm, Rainer Menzner, Thomas Bergener, and
Carsten Bruckhoff for invaluable help in getting various aspects to the point of actual functioning,
and Carsten Bräuer for filming the video.
References
1. M. Becker, E. Kefalea, E. Mael, C. v. d. Malsburg, M. Pagel, J. Triesch, J. Vorbrüggen, and S. Zadel. GripSee:
a robot for visually-guided grasping. In Proceedings of ICANN, Skövde, Sweden, 2-4 September 1998. In
press.
2. M. Becker, E. Kefalea, E. Mael, C. von der Malsburg, M. Pagel, J. Triesch, J. C. Vorbrüggen, R. P. Würtz, and
S. Zadel. GripSee: A gesture-controlled robot for object perception and manipulation. Autonomous Robots,
1998. Submitted.
3. M. Pagel, E. Mael, and C. von der Malsburg. Self-calibration of the fixation movement of a stereo camera
head. Autonomous Robots: Special Issue on Learning in Autonomous Robots, 1998. In press.