2013 10th IEEE International Conference on Control and Automation (ICCA)
Hangzhou, China, June 12-14, 2013
A Fast and Robust Fingertips Tracking Algorithm for Vision-Based
Multi-touch Interaction
Qunqun Xie¹, Guoyuan Liang¹, Cheng Tang¹, and Xinyu Wu¹,²
Abstract— Finger touch is the most natural way for humans to interact with the external world. In the past five years, the great success of the multi-touch trackpad on portable devices has shown the large potential of multi-touch technology for vision-based human-computer interaction (HCI) systems. The implementation of multi-touch technology depends heavily on accurate and fast fingertip tracking. In this paper we present a fast and robust algorithm for tracking fingertip positions in a stereovision-based 3D multi-touch interaction system. Our method first detects the hand region by a two-step strategy based on a skin color filter as well as depth images. Then a geometry model is built to locate the fingertips. The accuracy and effectiveness of the fingertip tracking algorithm are examined on several video sequences with complicated backgrounds. Experimental results verify that our algorithm can reliably and accurately track the movements of fingertips in real time. The effectiveness of the fingertip tracking algorithm also demonstrates the feasibility of letting users interact with computers through their finger movements in 3D space, over a virtual thin-film-like touch surface set up at a certain distance away from the screen.

*The work described in this paper is partly supported by the Shenzhen Nanshan District Technical R&D and Innovative Design Fund (KC2012JSYB0050A), the Shenzhen Internet Industrial Developing Special Fund (JC201005270368A), and the Guangdong Innovative Research Team Program (201001D0104648280).
¹Q. Xie, G. Liang, C. Tang, and X. Wu are with the Guangdong Provincial Key Laboratory of Robotics and Intelligent System, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences. Email: {qq.xie,gy.liang,cheng.tang,xinyu.wu}@siat.ac.cn
²X. Wu is also with the Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China.

I. INTRODUCTION

Multi-touch technology has seen rapid growth in recent years and is currently one of the research hotspots in Human Computer Interaction (HCI) [1]. The first multi-touch screen, based on pressure sensing, was designed by Nimish Mehta in 1982 [2]. Since then, various multi-touch devices and technologies have been launched by researchers all over the world [3-7]. Early implementations of multi-touch technology were both complex and expensive until 2005, when Jefferson Han et al. presented an FTIR-based solution which greatly reduced the cost of multi-touch technology [8]. Currently, multi-touch technologies can be classified into two categories: sensor-based and computer-vision-based. Sensor-based technologies integrate different types of sensors into a touchpad and directly receive finger touches as input. Although this works well with various portable devices, the relatively high cost limits its applications to some extent. Recently, the development of computer vision has made it possible to seek inexpensive vision-based solutions which may offer good scalability as well as good performance.

It is well known that people usually touch things of interest with their fingers. Hence, in a multi-touch system, the most challenging task is to identify the locations of the fingertips. In recent years fingertip tracking has received great attention from researchers all over the world. Fingertip tracking, in general, is still a tough problem because of the hand's flexible shape and high degree of freedom (DOF). Therefore, people tend to employ extra sensors, typically mechanical or optical, to capture fingertip positions directly [9-11]. With the development of computer vision, however, various vision-based fingertip tracking algorithms have been reported. Some of these approaches utilize the geometric properties of the hand, e.g. curvature, edges or shape, and build a model to locate the fingertips. Dominguez et al. presented a curvature-analysis-based method to track the fingertips with a head-mounted camera system [12]. Hongwei Ying et al. proposed a fingertip detection algorithm that analyzes finger edges after segmentation from depth images captured by a trinocular vision system [13]. Kim, S., Park, Y., et al. described a method based on Active Shape Models (ASM) and an ellipse equation to detect and track fingertips without using skin color [14]. Tony Heap et al. first constructed a 3D deformable Point Distribution Model to track the hand with a single video camera [15]. Other vision-based tracking algorithms make use of image analysis techniques, such as template matching or color segmentation, to track fingertip movements. The methods described in [16-17] employed circular and elliptical templates for fingertip tracking. Some researchers believe that geometric structure carries the most important information about fingers, and presented geometry-structure-based fingertip tracking algorithms [18-19]. Most of these approaches, however, are computationally expensive and can only track 2D trajectories of fingertips. More recently, Daniel R. Schlegel et al. built a new vision-based interaction system named AirTouch [20], which is able to track multiple fingertips in 3D space, but the user needs to wear a glove with markers.

In this paper, we propose a fast and robust fingertip tracking algorithm based on a geometric structure model of the hand. Compared with existing methods, our algorithm attempts to track the movements of multiple fingertips not only in 2D but also in 3D space, without using any markers. A stereovision system is set up to retrieve depth information of the scene. The algorithm detects the hand region using a skin color filter as well as depth images reconstructed from the stereovision system, and then calculates the position of the palm center. Finally, based on the observation that the geometric structure of hands is almost identical across people, it is possible to utilize the geometric relation between the palm center and the hand contour to determine the fingertip locations on the contour. The accuracy and effectiveness of this algorithm are examined on several video sequences with complicated backgrounds. Experimental results verify that our algorithm achieves stable and efficient performance during the tracking process.

We are now building a 3D virtual multi-touch interaction system in our lab. A virtual thin-film-like touch surface is set up at a certain distance in front of the screen. With the stereovision-based fingertip tracking system, the user can pull, push, spin or twist the virtual surface in 3D space, which can trigger different inputs to the computer. The system is expected to greatly improve the richness of the user experience.

The rest of the paper is organized as follows. In Section II, we present the efficient hand region localization algorithm based on skin color and depth information. Section III describes the robust and fast approach for fingertip tracking. Section IV gives a brief introduction to the 3D virtual multi-touch interaction system now under development in our lab. Experimental results and discussion are presented in Section V. Finally, conclusions are drawn in Section VI.
II. EFFECTIVE HAND LOCALIZATION

In order to track the movements of fingertips, the hand region should be segmented from the background first. Hand localization, however, is still a challenging problem because of the hand's high degree of freedom (DOF) and flexible shape. To improve system efficiency, our approach handles the problem in a simpler way: the hand is considered to be the closest skin-colored object in front of the camera. This assumption is acceptable in most human-computer interaction tasks. The hand region is detected by a two-step method. First, a skin-color filter is applied to locate candidate hand regions. Then the hand is segmented from the depth image using depth clipping and a region growing algorithm.

A. Skin Color Filter

Skin color has been proven to be an effective cue for extracting hand and face regions from the background. Basically, skin color detection amounts to defining decision rules and building a skin color classifier. The main difficulty is to find both an appropriate color space and adequate decision rules.

In our algorithm, we choose the YCbCr color space for skin-color segmentation [21]. YCbCr separates the color information into a luminance channel (Y) and two chrominance channels (Cb and Cr), and is appropriate for skin-color segmentation. In addition, a parametric model, the Gaussian Mixture Model, is employed to describe the skin-color distribution. For a single Gaussian model, the skin-color probability distribution p(c|skin) is defined as follows [22]:

p(c|skin) = \frac{1}{2\pi |\Sigma_s|^{1/2}} \exp\left(-\frac{1}{2}(c-\mu_s)^T \Sigma_s^{-1} (c-\mu_s)\right)   (1)

where c is a color vector, and \mu_s and \Sigma_s are the model parameters, which can be estimated from training data as follows:

\mu_s = \frac{1}{n} \sum_{j=1}^{n} c_j   (2)

\Sigma_s = \frac{1}{n-1} \sum_{j=1}^{n} (c_j - \mu_s)(c_j - \mu_s)^T   (3)

where n is the number of training samples. Finally, the Gaussian Mixture Model is defined by

P(c|skin) = \sum_{i=1}^{k} \lambda_i p_i(c|skin)   (4)

where k denotes the number of mixture components and \lambda_i denotes the weight of each Gaussian component, satisfying \sum_{i=1}^{k} \lambda_i = 1. P(c|skin) can be used directly as a measure of how "skin-like" a color is. In this paper we set k to 5, and the Gaussian Mixture Model parameters are estimated with the well-known Expectation Maximization (EM) algorithm.

After extracting skin-like regions from the background, we apply morphological operations to the extracted regions in order to remove noise. At this stage, some other objects with skin-like color are also extracted as candidate hand regions. Fig. 1 shows the result of skin-color region detection.

Fig. 1. Detection of skin-color regions. (a) Original hand image; (b) Extracted skin-color regions.
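As an illustration of how this skin-color model can be realized, the sketch below trains a five-component GMM and thresholds the resulting likelihood. The use of scikit-learn and OpenCV, the restriction to the Cb/Cr channels, and the log-probability threshold are our own assumptions rather than details given in the paper.

```python
import cv2
import numpy as np
from sklearn.mixture import GaussianMixture

def train_skin_gmm(skin_pixels_bgr, k=5):
    """Fit the k-component skin-color GMM of Eq. (4) on training pixels.

    skin_pixels_bgr: (n, 3) uint8 array of BGR skin samples from training images.
    """
    ycrcb = cv2.cvtColor(skin_pixels_bgr.reshape(-1, 1, 3), cv2.COLOR_BGR2YCrCb)
    cbcr = ycrcb.reshape(-1, 3)[:, 1:3]          # keep only the chrominance channels
    # fit() runs the EM algorithm internally, as described in the text.
    return GaussianMixture(n_components=k, covariance_type='full').fit(cbcr)

def skin_mask(image_bgr, gmm, log_prob_thresh=-8.0):
    """Classify each pixel of an image as skin / non-skin with the trained GMM.

    log_prob_thresh is an illustrative threshold; the paper does not give one.
    """
    ycrcb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2YCrCb)
    cbcr = ycrcb.reshape(-1, 3)[:, 1:3].astype(np.float64)
    log_p = gmm.score_samples(cbcr)              # log P(c | skin)
    mask = (log_p > log_prob_thresh).reshape(image_bgr.shape[:2]).astype(np.uint8)
    # Morphological opening/closing to remove noise, as described above.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    return cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
```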
B. Hand Segmentation from Depth Image
As mentioned before, the hand is assumed to be the closest skin-colored object in front of the camera. Therefore, the hand can be identified among all the candidates by a depth clipping technique and a region growing algorithm.

Note that the hand region in the depth image is continuously distributed and extends within a limited 3D space, so the points with minimum depth are picked as seeds. By applying the region growing algorithm, the hand region is segmented from the background. Fig. 2 illustrates the process of hand segmentation.
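The following sketch illustrates one way to realize this depth-based step with a breadth-first region grow from the minimum-depth seed; the depth tolerance used for clipping and the 4-connectivity are our assumptions, as the paper does not specify them.

```python
from collections import deque
import numpy as np

def segment_hand(depth, skin_mask, depth_tol=80):
    """Sketch of the depth-clipping / region-growing hand segmentation.

    depth:     depth image (e.g. in millimetres), 0 where invalid.
    skin_mask: binary mask of skin-colored candidate regions.
    depth_tol: assumed clipping range (same units as depth) around the seed depth.
    """
    valid = (depth > 0) & (skin_mask > 0)
    if not valid.any():
        return np.zeros_like(skin_mask, dtype=np.uint8)
    # Seed: the closest skin-colored point in front of the camera.
    seed = np.unravel_index(np.argmin(np.where(valid, depth, np.inf)), depth.shape)
    seed_depth = float(depth[seed])

    hand = np.zeros(depth.shape, dtype=np.uint8)
    hand[seed] = 1
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < depth.shape[0] and 0 <= nx < depth.shape[1]
                    and not hand[ny, nx] and depth[ny, nx] > 0
                    and abs(float(depth[ny, nx]) - seed_depth) < depth_tol):
                hand[ny, nx] = 1
                queue.append((ny, nx))
    # Intersect with the skin-color candidates, as in Fig. 2(e).
    return (hand.astype(bool) & (skin_mask > 0)).astype(np.uint8)
```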
Unfortunately, the hand region extracted in the above step sometimes contains not only the hand but also a small part of the wrist. In order to get a clean hand segmentation, we follow the work by Zhenyao Mo et al. [18] and separate the wrist from the hand with a boundary curve B_w, as shown in Fig. 3. The definition of B_w can be found in [18].
Fig. 2. Hand region extraction. (a) Original hand image; (b) Depth image of (a); (c) Extracted region after skin-color filtering and noise removal; (d) Extracted region after applying depth clipping and the region growing algorithm; (e) Hand region: intersection of (c) and (d).

Fig. 3. The wrist and hand are divided by a boundary curve B_w.
III. FINGERTIP TRACKING
Fingertip tracking is the key part of a multi-touch system. In this section, we describe a fast and robust fingertip tracking algorithm. Unlike most existing approaches, which are usually based on template matching, curvature analysis or color markers, our approach utilizes the geometric information of the hand shape to determine the positions of the fingertips. Owing to its simplicity and effectiveness, our method is expected to deliver faster and more stable fingertip tracking.
A. Palm Center Localization
In order to find the positions of the fingertips, the center and the size of the palm should be estimated first. Most existing works apply repeated morphological operations until the remaining region is small enough to indicate the location of the palm center. This is usually time-consuming, and it is difficult to decide when to stop because there is no explicit stopping rule: most works stop according to the size of the hand region, but that size changes with the depth and shape of the hand. In addition, the ambiguous definition of a "small enough" size also makes the "best" palm center ambiguous. In this section, we propose a projection-based method for palm center extraction which is simpler and more effective.
We notice that the palm is, more or less, a rectangle-like region. Based on this observation, a projection-based algorithm is proposed to extract the palm region. The basic idea is to project the hand region in all directions. If a projection line passes through only one block of the hand region, that block is a candidate palm region and should be preserved. After projections along all directions, the intersection of all candidates forms the final palm region. The algorithm is formulated in Algorithm 1.

Algorithm 1 Palm Region Extraction
Input:
    Binary image of the hand I_H
    Projection angle interval Δθ = 45°
    A predefined threshold λ
Output:
    Binary image of the palm I_P
1: Initialize I_P with 0
2: Construct the set of projection angles Θ = {θ_i = iΔθ, i = 0, ±1, ..., ±(N−1), N}, where N = ⌊180°/Δθ⌋
3: Define a point set L and set it to empty
4: Project the hand image in all directions:
    angle_count = 0
    w = width of I_H, h = height of I_H
    FOR each θ_k ∈ Θ
        angle_count = angle_count + 1
        FOR each p_H(i, j) ∈ I_H, 0 ≤ i < w, 0 ≤ j < h
            IF p_H(i, j) ≠ 0
                point_count = 0
                WHILE p_H(i + cos θ_k, j + sin θ_k) ≠ 0
                    put p_H(i, j) into L
                    i = i + cos θ_k, j = j + sin θ_k
                    point_count = point_count + 1
                END WHILE
                IF point_count ≥ λ
                    FOR each point p_L(i_l, j_l) ∈ L, l = 1, 2, ..., point_count
                        IF p_P(i_l, j_l) = angle_count − 1, p_P(i_l, j_l) ∈ I_P
                            set p_P(i_l, j_l) = angle_count
                    empty L
                END IF
5: Output the palm image I_P:
    FOR each p_P(i, j) ∈ I_P, 0 ≤ i < w, 0 ≤ j < h
        IF p_P(i, j) ≠ angle_count
            set p_P(i, j) = 0
Normally it is unnecessary to project the image in all directions, because the extracted region changes little once the projection angle interval is smaller than a certain value. Here we project every 45° from 0° to ±180°, chosen empirically. Compared with shape-model-based methods, our algorithm is faster and simpler. Fig. 4 shows an example of palm region and palm center extraction.
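As a companion to Algorithm 1, the following Python sketch performs the same run-length test: for each projection direction it marks the pixels lying on a sufficiently long run of hand pixels, and it keeps only the pixels preserved in every direction. The function name, the default parameter values, and the simplification to half-turn directions (opposite directions give the same run test) are our own choices, not part of the paper.

```python
import numpy as np

def extract_palm_region(hand_mask, step_deg=45, min_run=20):
    """Sketch of the projection-based palm extraction (Algorithm 1).

    hand_mask: binary (0/1) hand image I_H.
    step_deg:  projection angle interval (45 degrees in the paper).
    min_run:   run-length threshold corresponding to lambda (value assumed).
    """
    h, w = hand_mask.shape
    # Opposite directions yield the same run test, so half a turn suffices.
    angles = np.deg2rad(np.arange(0, 180, step_deg))
    votes = np.zeros(hand_mask.shape, dtype=np.int32)
    ys, xs = np.nonzero(hand_mask)

    for theta in angles:
        dx, dy = np.cos(theta), np.sin(theta)
        marked = np.zeros(hand_mask.shape, dtype=bool)
        for y0, x0 in zip(ys, xs):
            if marked[y0, x0]:
                continue  # already covered by a sufficiently long run
            # Walk along the projection line while it stays inside the hand.
            run, x, y = [], float(x0), float(y0)
            while (0 <= int(round(y)) < h and 0 <= int(round(x)) < w
                   and hand_mask[int(round(y)), int(round(x))]):
                run.append((int(round(y)), int(round(x))))
                x, y = x + dx, y + dy
            if len(run) >= min_run:
                for ry, rx in run:
                    marked[ry, rx] = True
        votes += marked

    # Keep only the pixels preserved by the projections in every direction.
    return (votes == len(angles)).astype(np.uint8)
```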
The palm center C_0 is defined as the point in the palm region which has the maximum distance from the closest palm boundary [18]:

C_0 = \arg\max_{P \in R_{palm}} \left\{ \min_{P_B \in B} d_2(P, P_B) \right\}   (5)

where P denotes a point inside the palm region R_{palm}, P_B denotes a point on the boundary B of the palm region, and d_2 is the 2D Euclidean distance between two points. The size of the palm, R, is defined as the distance between C_0 and the closest boundary point:

R = \min_{P \in B} d_2(C_0, P)   (6)

Fig. 4. Palm region and palm center extraction. (a) Original hand image; (b) Extracted hand region; (c) Preserved hand region when the projection angle θ = 0°; (d) Preserved hand region when the projection angle θ = 90°; (e) Extracted palm region: intersection of all preserved hand regions as the projection angle varies from 0° to ±180° at an interval of 45°. The red point is the palm center.
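In practice, Eqs. (5) and (6) can be evaluated with a Euclidean distance transform of the palm mask; the short OpenCV-based sketch below is one possible realization (the function name and the choice of cv2.distanceTransform are ours, not prescribed by the paper).

```python
import cv2

def palm_center_and_radius(palm_mask):
    """Compute the palm center C0 (Eq. 5) and the palm size R (Eq. 6).

    palm_mask: 8-bit binary image of the extracted palm region (nonzero = palm).
    Returns ((cx, cy), R) in pixel coordinates.
    """
    # Distance from each palm pixel to the closest background (boundary) pixel.
    dist = cv2.distanceTransform(palm_mask, cv2.DIST_L2, 5)
    # C0 is the pixel maximizing that distance; the maximum itself equals R.
    _, R, _, c0 = cv2.minMaxLoc(dist)
    return c0, R
```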
B. Fingertip Localization

The fingertip is considered to be the point on each finger's contour with the maximum distance to the palm center. Here we use another two-step method to locate the fingertips: the contour of each finger is extracted first, and then the contour point with the maximum distance to the palm center is picked as the fingertip. Finger contours can be regarded as subsets of the hand contour. We extract the hand contour with the border-following algorithm proposed by Suzuki and Abe [23]; the contour enclosing the maximum area is considered to be the hand border. Then the distances between the contour points and the palm center C_0 are calculated. If a distance is larger than a predefined threshold, the contour point is put into a candidate set of fingertip points:

F = \{ P \in B \mid d_2(P, C_0) > \alpha R \}   (7)

where P denotes a contour point, d_2(P, C_0) denotes the distance between the palm center C_0 and P, and \alpha is a scale factor, set to 1.2 empirically. F is the candidate set of fingertip points.

Once the candidate set is generated, a simple but effective approach is used to identify the fingertips in it. First, we need to determine the contour points belonging to each finger. This could be done by tracing all the contour points and comparing them with the points in the candidate set, but to reduce the computing time we use a more efficient solution: assign an index to each point in the candidate set, and then sort the set by that index. A function \varphi is defined for calculating the index:

\theta_{P_f} = \varphi(P_f, C_0), \quad P_f \in F   (8)

where P_f is a point in the candidate set F and \theta_{P_f} is the index of P_f. The function \varphi has a clear physical meaning: it is the angle of inclination of the line P_f C_0 with respect to the negative x-axis.

The candidate set is sorted by \theta_{P_f} in ascending order (clockwise). Then the distances between successive points are calculated to determine the start and end points of the subsets of the candidate set (the contour points of each finger):

D_{P_i} = d_2(P_i, P_{i+1}), \quad P_i, P_{i+1} \in F   (9)

where D_{P_i} denotes the distance between successive points P_i and P_{i+1}. If D_{P_i} is greater than a predefined threshold \delta (set to 2.5 in our system), P_i is considered the start or end point of a subset. In this way, all points in the candidate set are divided into several subsets which are disconnected on the hand contour, and each of them corresponds to one fingertip contour. Finally, we compute the distance from each point in a subset to the palm center; the point with the maximum distance is identified as a fingertip. The process of fingertip localization is illustrated in Fig. 5.

Fig. 5. Fingertip detection. (a) Extracted hand region; (b) Hand contour; (c) Candidate set of fingertip points on the hand contour; (d) Calculation of the index \theta_{P_f} for each point in the candidate set. The set is sorted by the index; the indexes are illustrated with a color coding scheme from pure green (the minimum) to pure red (the maximum), and the points are sorted in ascending order (clockwise); (e) Extracted fingertips (green points) and the start and end points of each fingertip region (red points).
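A compact sketch of this procedure, assuming OpenCV (version 4.x for the findContours signature) and NumPy, could look as follows; the parameter defaults mirror the values quoted above, while the function name and data layout are our own.

```python
import cv2
import numpy as np

def locate_fingertips(hand_mask, palm_center, palm_radius,
                      alpha=1.2, gap_thresh=2.5):
    """Sketch of the fingertip localization described in Section III-B.

    alpha and gap_thresh correspond to the scale factor and the
    successive-point distance threshold delta used in the paper.
    """
    cx, cy = palm_center
    # Hand contour via border following (Suzuki & Abe); keep the largest one.
    contours, _ = cv2.findContours(hand_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    if not contours:
        return []
    contour = max(contours, key=cv2.contourArea).reshape(-1, 2)

    # Candidate set F: contour points far enough from the palm center (Eq. 7).
    d = np.hypot(contour[:, 0] - cx, contour[:, 1] - cy)
    cand = contour[d > alpha * palm_radius]
    if len(cand) == 0:
        return []

    # Sort candidates by the inclination angle of the line P_f C0 (Eq. 8).
    ang = np.arctan2(cand[:, 1] - cy, cand[:, 0] - cx)
    cand = cand[np.argsort(ang)]

    # Split into per-finger subsets where successive points are far apart (Eq. 9).
    gaps = np.hypot(np.diff(cand[:, 0]), np.diff(cand[:, 1]))
    cuts = np.where(gaps > gap_thresh)[0] + 1
    fingertips = []
    for subset in np.split(cand, cuts):
        if len(subset) == 0:
            continue
        # The fingertip is the subset point farthest from the palm center.
        ds = np.hypot(subset[:, 0] - cx, subset[:, 1] - cy)
        fingertips.append(tuple(subset[int(np.argmax(ds))]))
    return fingertips
```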
IV. THE MULTI-TOUCH SYSTEM
A 3D virtual multi-touch interaction system is now under development in our lab. Conventional multi-touch systems usually need a specially made touchpad to capture fingertip movements. Our vision-based multi-touch system, however, requires no pressure-sensing devices, and the user does not need to touch a surface physically. In this system, a virtual thin-film-like touch surface is laid at a certain distance away from the screen. The movements of the user's fingertips are tracked by the stereovision system in real time, and the user can interact with the computer by pulling, pushing, spinning or twisting the virtual elastic surface in 3D space. Daniel R. Schlegel et al. built a similar interaction system named AirTouch [20]; however, a marked glove is required to track the fingertips, and only a single fingertip touch on a 2D virtual screen can be recognized. Our system can identify multi-finger touches without any markers, and allows the user to interact with the computer in 3D space as well. The system is expected to greatly improve the richness of the user experience.
Our 3D virtual multi-touch system is developed on the Windows 7 platform. The stereovision system captures the depth image at a frame rate of 20 Hz. We use the TUIO protocol (a protocol for table-top tangible user interfaces) to package the multi-touch inputs with timestamps and send them to a TUIO client [24]. TUIO is an open framework which defines a common protocol and APIs for multi-touch surfaces; it allows the transmission of an abstract description of interactive surfaces. In our implementation, the fingertip touch events are handled by the fingertip tracking system and then sent to a TUIO client through the TUIO server. The architecture of the 3D virtual multi-touch system is presented in Fig. 6.
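For illustration only, a minimal TUIO sender built on the python-osc package might look like the sketch below; the paper does not describe its server code, so the class, the port, and the use of the /tuio/2Dcur profile are assumptions based on the TUIO 1.1 specification.

```python
from pythonosc.udp_client import SimpleUDPClient

class TuioCursorSender:
    """Minimal sketch: publish fingertip positions as TUIO /tuio/2Dcur messages."""

    def __init__(self, host="127.0.0.1", port=3333):
        self.client = SimpleUDPClient(host, port)  # TUIO commonly uses UDP port 3333
        self.frame = 0

    def send_fingertips(self, fingertips):
        """fingertips: list of (session_id, x, y), with x and y normalized to [0, 1]."""
        self.frame += 1
        alive = [sid for sid, _, _ in fingertips]
        self.client.send_message("/tuio/2Dcur", ["alive", *alive])
        for sid, x, y in fingertips:
            # set message: session id, position, velocity and acceleration (zeroed here)
            self.client.send_message(
                "/tuio/2Dcur", ["set", sid, float(x), float(y), 0.0, 0.0, 0.0])
        self.client.send_message("/tuio/2Dcur", ["fseq", self.frame])
```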
Fig. 6. The architecture of the 3D vision-based virtual multi-touch system.

V. EXPERIMENTS AND DISCUSSION

The stereovision-based fingertip tracking system was developed on a Xeon 3.07 GHz workstation. The application runs at an average frame rate of 20 Hz, which is fast enough for real-time interaction.

The tracking system consists of three modules. The core module is fingertip tracking, which encapsulates the tracking algorithm and the system input/output. The second module is the TUIO server; it is responsible for adding timestamps to the multi-touch input sequences and packaging them according to the TUIO protocol. The third module is the multi-touch client; we use the open-source program TUIO Smoke [24] as the TUIO client. The distance from the virtual touch surface to the screen is set to 0.5 meter by default during system initialization.

Two experiments were designed to examine the effectiveness and accuracy of the fingertip tracking algorithm. In the first experiment, the user is asked to touch the virtual touch screen with one, two, three and four fingers respectively. Four video sequences, each lasting 3 seconds, are recorded for testing the fingertip localization algorithm. The extracted fingertip positions are sent to the TUIO client through the TUIO server. The tracking results are shown in Fig. 7. In each row, the first frame of the test video sequence, the tracked trajectories of the fingertips on the virtual touch screen, and the responses of the multi-touch client are illustrated from left to right.

Fig. 7. Illustration of the tracking results when touching the virtual screen. From (a) to (d): tracking results for touches with one to four fingers. In each row, from left to right: the first frame of the test video sequence, the tracked trajectories of the fingertips on the virtual touch screen, and the responses of the multi-touch client (different colors represent different fingertip trajectories).

The second experiment aims to test the accuracy of the tracking algorithm. For each test video sequence, we manually identify the fingertip positions as the ground truth frame by frame, and then compare them with the positions detected by the tracking system. Altogether 360 frames are processed and the accuracy rates are calculated for all sequences. Even in the presence of a complicated background, lighting changes and image noise, the total correct detection rate still reaches 91.1%, as shown in Table I.

TABLE I
THE STATISTICAL RESULTS OF OUR FINGERTIP TRACKING ALGORITHM

Test video sequence            Number of frames    Recognition rate
Sequence 1 (one finger)        90                  96.7% (87/90)
Sequence 2 (two fingers)       90                  92.2% (83/90)
Sequence 3 (three fingers)     90                  88.9% (80/90)
Sequence 4 (four fingers)      90                  86.7% (78/90)
Total                          360                 91.1% (328/360)
VI. CONCLUSIONS AND FUTURE WORKS
In this paper, we have proposed a fast and robust method to track fingertips. Unlike existing approaches, this vision-based method needs neither pressure-sensing devices nor extra markers for fingertip localization. With the help of a stereovision system, the 3D positions of the fingertips are recovered by an efficient algorithm based on skin color detection and geometric model analysis. The accuracy and effectiveness of the algorithm have been verified by two experiments performed on four video sequences with one to four moving fingers under complex backgrounds, changing lighting and image noise. Although the geometry model used in our algorithm sometimes suffers from inaccuracies in the structural representation of the hand and fingers, it works well in most cases. In fact, the tracking algorithm is efficient enough for real-time HCI tasks.
We expect to integrate this algorithm into the vision-based 3D virtual multi-touch interaction system now under development in our lab. The system can track the 3D positions of fingertips and recognize finger actions (pull, push, spin, twist, etc.) over a virtual thin-film-like touch surface. We believe this technology can greatly improve the user experience and has broad prospects in various HCI applications.
Future work includes improving the geometry model. In addition, a regular web camera would be preferable to a stereovision system because of its lower cost, so it would be interesting to extend the algorithm to work with a single camera.
REFERENCES

[1] R. Chang, F. Wang, and P. You, "A survey on the development of multi-touch technology," in Wearable Computing Systems (APWCS), 2010 Asia-Pacific Conference on. IEEE, 2010, pp. 363-366.
[2] N. Mehta, "A flexible machine interface," M.A.Sc. thesis, Department of Electrical Engineering, University of Toronto, 1982.
[3] J. Han, "Multi-touch interaction wall," in ACM SIGGRAPH 2006 Emerging Technologies. ACM, 2006, p. 25.
[4] S. Hodges, S. Izadi, A. Butler, A. Rrustemi, and B. Buxton, "ThinSight: versatile multi-touch sensing for thin form-factor displays," in Proceedings of the 20th Annual ACM Symposium on User Interface Software and Technology. ACM, 2007, pp. 259-268.
[5] D. Wigdor, C. Forlines, P. Baudisch, J. Barnwell, and C. Shen, "LucidTouch: a see-through mobile device," in Proceedings of the 20th Annual ACM Symposium on User Interface Software and Technology. ACM, 2007, pp. 269-278.
[6] A. Butler, S. Izadi, and S. Hodges, "SideSight: multi-touch interaction around small devices," in Proceedings of the 21st Annual ACM Symposium on User Interface Software and Technology. ACM, 2008, pp. 201-204.
[7] E. Shen, S. Tsai, H. Chu, Y. Hsu, and C. Chen, "Double-side multi-touch input for mobile devices," in Proceedings of the 27th International Conference Extended Abstracts on Human Factors in Computing Systems. ACM, 2009, pp. 4339-4344.
[8] J. Han, "Low-cost multi-touch sensing through frustrated total internal reflection," in Proceedings of the 18th Annual ACM Symposium on User Interface Software and Technology. ACM, 2005, pp. 115-118.
[9] J. Carey, T. Kimberley, S. Lewis, E. Auerbach, L. Dorsey, P. Rundquist, and K. Ugurbil, "Analysis of fMRI and finger tracking training in subjects with chronic stroke," Brain, vol. 125, no. 4, pp. 773-788, 2002.
[10] Á. Cassinelli, S. Perrin, and M. Ishikawa, "Smart laser-scanner for 3D human-machine interface," in CHI '05 Extended Abstracts on Human Factors in Computing Systems. ACM, 2005, pp. 1138-1139.
[11] N. Motamedi, "HD touch: multi-touch and object sensing on a high definition LCD TV," in CHI '08 Extended Abstracts on Human Factors in Computing Systems. ACM, 2008, pp. 3069-3074.
[12] L. Chi, L. Prada Gomez, R. Ryskamp, and S. Mavinkurve, "Wearable heads-up display with integrated finger-tracking input sensor," U.S. Patent 8,203,502, Jun. 19, 2012.
[13] H. Ying, J. Song, X. Ren, and W. Wang, "Fingertip detection and tracking using 2D and 3D information," in Intelligent Control and Automation, 2008. WCICA 2008. 7th World Congress on. IEEE, 2008, pp. 1149-1152.
[14] S. Kim, Y. Park, K. Lim, H. Lee, S. Kim, and S. Lee, "Fingertips detection and tracking based on active shape models and an ellipse," in TENCON 2009 - 2009 IEEE Region 10 Conference. IEEE, 2009, pp. 1-6.
[15] T. Heap and D. Hogg, "Towards 3D hand tracking using a deformable model," in Automatic Face and Gesture Recognition, 1996, Proceedings of the Second International Conference on. IEEE, 1996, pp. 140-145.
[16] Y. Sato, Y. Kobayashi, and H. Koike, "Fast tracking of hands and fingertips in infrared images for augmented desk interface," in Automatic Face and Gesture Recognition, 2000. Proceedings. Fourth IEEE International Conference on. IEEE, 2000, pp. 462-467.
[17] S. Dominguez, T. Keaton, and A. Sayed, "Robust finger tracking for wearable computer interfacing," in Proceedings of the 2001 Workshop on Perceptive User Interfaces. ACM, 2001, pp. 1-5.
[18] Z. Mo and U. Neumann, "Real-time hand pose recognition using low-resolution depth images," in Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on, vol. 2. IEEE, 2006, pp. 1499-1505.
[19] H. Koike, Y. Sato, and Y. Kobayashi, "Integrating paper and digital information on EnhancedDesk: a method for realtime finger tracking on an augmented desk system," ACM Transactions on Computer-Human Interaction, vol. 8, no. 4, pp. 307-322, 2001.
[20] D. Schlegel, A. Chen, C. Xiong, J. Delmerico, and J. Corso, "AirTouch: interacting with computer systems at a distance," in Applications of Computer Vision (WACV), 2011 IEEE Workshop on. IEEE, 2011, pp. 1-8.
[21] R. Hsu, M. Abdel-Mottaleb, and A. Jain, "Face detection in color images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 5, pp. 696-706, 2002.
[22] V. Vezhnevets, V. Sazonov, and A. Andreeva, "A survey on pixel-based skin color detection techniques," in Proc. Graphicon, vol. 3, Moscow, Russia, 2003.
[23] S. Suzuki and K. Abe, "Topological structural analysis of digitized binary images by border following," Computer Vision, Graphics, and Image Processing, vol. 30, no. 1, pp. 32-46, 1985.
[24] M. Kaltenbrunner, T. Bovermann, R. Bencina, and E. Costanza, "TUIO: a protocol for table-top tangible user interfaces," in Proc. of the 6th International Workshop on Gesture in Human-Computer Interaction and Simulation, 2005.