Gesture Recognition

An Introduction to Digital
Image Processing and
Applications in AI
Part 2: Hand and Face Tracking
Farhad Dadgostar
Applications of Hand, Face and Body Tracking




Gesture Recognition
(Sign language recognition, Human-Computer Interaction)
Virtual Reality
Security and
Surveillance
Movement Analysis
Approaches to Hand and Face Tracking
and Gesture Recognition

Special hardware
(e.g. CyberGlove)


Marker tracking
Vision-based
approach
Main approaches to vision-based face and
hand tracking

Pattern recognition (neural networks, statistical analysis, etc.)
-> Has been successful in face detection, but not in hand detection, because of the different representations of the hand in 2D images

Skin color segmentation (segmentation using different color spaces)
-> More successful in hand tracking and gesture recognition
Pixel-based approach to skin segmentation

Advantages



Very fast (a few computations per pixel)
Good candidate for real-time tracking
Disadvantages


Background noise may make this technique impractical
Cannot distinguish skin from other objects that have a similar color
Methods of implementation

Using a set of training data for indicating a region in the color
space, and using that region for detection.




RGB, RG
ICrCb, CrCb
HSV, HSI, HS, H (Hue, Saturation and Intensity)
IUV
Requires less computation

Recognizing an object (e.g. Hand or Face) and using the color
of the object for detecting skin in the image

e.g. Face detection using Viola-Jones method
Requires more computation
Advantages of pixel based skin detection based on Hue
Factor
1) It is one-dimensional, and therefore requires just 2 thresholds for specifying the skin color region
2) It is robust against intensity changes
3) Different skin colors are located in the same range of the Hue factor
Hue histogram of the skin
Disadvantages of pixel based skin detection based
on Hue Factor

As with other color-based techniques, background noise may cause a high percentage of false detections
The idea

A Global Skin Detector can be implemented using
hue thresholding

The hue thresholds of each person's skin color lie between the thresholds of the global skin detector
Detecting in-motion skin pixels in a video
sequence can be helpful to estimate the local skin
thresholds
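The global skin detector described above can be sketched as a two-threshold test on hue. This is a minimal illustration only: the threshold values, the function names and the image layout (rows of RGB tuples) are assumptions and do not come from the slides.

```python
import colorsys

# Hypothetical global hue thresholds in degrees (assumed values, not from the slides).
GLOBAL_T_LOW = 0.0
GLOBAL_T_HIGH = 50.0


def hue_degrees(r, g, b):
    """Return the hue of an 8-bit RGB pixel in degrees, in [0, 360)."""
    h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0


def is_skin(pixel, t_low=GLOBAL_T_LOW, t_high=GLOBAL_T_HIGH):
    """Classify one (r, g, b) pixel using only two hue thresholds."""
    return t_low <= hue_degrees(*pixel) <= t_high


def skin_mask(image, t_low=GLOBAL_T_LOW, t_high=GLOBAL_T_HIGH):
    """Apply the detector to an image given as rows of (r, g, b) tuples."""
    return [[is_skin(p, t_low, t_high) for p in row] for row in image]


# A skin-like tone passes, a saturated blue does not.
print(skin_mask([[(224, 172, 105), (0, 0, 255)]]))
```

Note that colorsys reports a hue of 0 for gray pixels, so with these assumed thresholds gray backgrounds are accepted as skin, which is one concrete form of the false detection problem listed above.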

The algorithm
Step 1: Selecting Candidate Skin Pixels

Extracting the candidate pixels using the
Global Skin Detector (the output has a high
percentage of correct detections and may also have a
high percentage of false detections)
Step 2: Detecting In-motion Skin Pixels

Extracting in-motion pixels that
potentially belong to the skin using
frame subtraction
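Step 2 can be sketched as a per-pixel difference between two consecutive frames, combined with the candidate mask from Step 1. The grayscale input, the difference threshold of 15 and the function names are assumptions made for this illustration.

```python
def in_motion_mask(prev_gray, curr_gray, threshold=15):
    """Mark pixels whose intensity changed by more than `threshold` between two frames.

    Frames are lists of rows of grayscale values; `threshold` is an assumed value.
    """
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_gray, curr_gray)
    ]


def in_motion_skin_mask(motion, candidates):
    """Keep only the moving pixels that the global skin detector also accepted."""
    return [
        [m and c for m, c in zip(m_row, c_row)]
        for m_row, c_row in zip(motion, candidates)
    ]


prev_frame = [[10, 10], [10, 10]]
curr_frame = [[10, 200], [12, 90]]
candidates = [[True, True], [True, False]]  # output of Step 1 (assumed)

motion = in_motion_mask(prev_frame, curr_frame)
print(in_motion_skin_mask(motion, candidates))
```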
Step 3: Retraining

Retraining the local skin detector (which
initially has the same parameters as the global
skin detector), based on the data
extracted in the previous step.
$$H_{n+1} = (1 - A)\,H_n + A\,H_M$$

$H_n$ is the training histogram of the local skin detector
$H_M$ is the histogram of the in-motion pixels detected by the
global skin detector
$A$ is the merging factor (a small value, around 0.05)
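The retraining rule above is a running blend of two histograms. A minimal sketch follows; the histogram normalisation, the bin count and the function names are assumptions, and only the blend itself is taken from the slide.

```python
def hue_histogram(hues, bins=360):
    """Normalised histogram of hue values given in degrees, in [0, 360)."""
    counts = [0.0] * bins
    for h in hues:
        counts[int(h * bins / 360.0) % bins] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def retrain(h_n, h_m, a=0.05):
    """One retraining step: H_{n+1} = (1 - A) * H_n + A * H_M."""
    return [(1.0 - a) * hn + a * hm for hn, hm in zip(h_n, h_m)]


# Toy two-bin example: the local histogram drifts slowly towards the in-motion one.
h_local = [1.0, 0.0]
h_motion = [0.0, 1.0]
print(retrain(h_local, h_motion, a=0.05))
```

With a small A, a single noisy frame changes the local detector only slightly, while a consistent skin color in the moving pixels accumulates over many frames.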
Step 4: Filtering skin pixels

$$f(I) = \begin{cases} \text{true} & \text{if } T_L(H_n) \le I \le T_U(H_n) \\ \text{false} & \text{otherwise} \end{cases}$$

$T_L$ is the lower hue threshold
$T_U$ is the upper hue threshold
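The filtering step can be sketched as follows. The slides do not say how the lower and upper thresholds are derived from the local histogram, so the rule used here (keep the span of bins that reach a fraction of the peak) is purely an assumption, as are the function names. Hue values are treated as histogram bin indices.

```python
def thresholds_from_histogram(hist, fraction=0.1):
    """Assumed rule: span of the bins whose value is at least `fraction` of the peak."""
    peak = max(hist)
    kept = [i for i, v in enumerate(hist) if peak > 0 and v >= fraction * peak]
    if not kept:
        raise ValueError("empty histogram")
    return kept[0], kept[-1]


def filter_skin(hue_image, t_low, t_high):
    """f(I) is true when T_L <= I <= T_U, for every hue value I in the image."""
    return [[t_low <= h <= t_high for h in row] for row in hue_image]


local_hist = [0.0, 0.0, 0.1, 0.5, 0.4, 0.01, 0.0]
t_low, t_high = thresholds_from_histogram(local_hist)
print(t_low, t_high)
print(filter_skin([[1, 3], [4, 5]], t_low, t_high))
```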


Adaptive skin detector (overview)
Behavior of the algorithm

The local skin detector adapts itself to the color of the skin
in the image sequence, and improves with time
Frame 1
Frame 1400
Frame 2700
Changes in correct and false detection
[Figure: Correct detection over time. y-axis: correct detection accuracy; x-axis: frame no.]
[Figure: False detection over time. y-axis: false detection; x-axis: frame no.]
Simple boundary detection
Presence of noise
Finding the biggest blob
3D representation of the counting
Window size 21x21
Window size 60x60
Number of required computations for the
simple method



Image Size 640x480
Window size 51x51
Frame rate 25fps
640x480x51x51x25 ≈ 20,000,000,000 !
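The estimate above can be checked with a few lines of arithmetic. The function name is hypothetical; the figures are the ones on the slide.

```python
def window_operations_per_second(width, height, window, fps):
    """Operations needed when every pixel visits a full window x window neighbourhood."""
    return width * height * window * window * fps


ops = window_operations_per_second(640, 480, 51, 25)
print(f"{ops:,}")  # 19,975,680,000, i.e. roughly 20 billion per second
```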
The Mean-Shift Algorithm
Centre of Gravity (General)
Centre of Gravity (2D)
Zeroth Moment
The Mean-Shift Algorithm (Summary)





1) Consider a search window on an arbitrary point of the space
2) Assume that the value of the function at each point of the search window represents the mass of that point
3) Compute the centre of gravity of the search window
4) Shift the search window so that its centre matches the position of the centre of gravity
5) Repeat from step 2 until convergence
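The steps above can be sketched for a 2D image of weights (for example a skin mask, where each pixel value is its mass). The square window, the rounding of the centre to whole pixels, the iteration limit and the function names are assumptions made for this illustration.

```python
def centre_of_gravity(weights, cx, cy, half):
    """Centre of gravity of the square window of half-size `half` centred on (cx, cy)."""
    m00 = m10 = m01 = 0.0
    rows, cols = len(weights), len(weights[0])
    for y in range(max(0, cy - half), min(rows, cy + half + 1)):
        for x in range(max(0, cx - half), min(cols, cx + half + 1)):
            w = weights[y][x]
            m00 += w      # zeroth moment (total mass)
            m10 += x * w  # first moment in x
            m01 += y * w  # first moment in y
    if m00 == 0:
        return float(cx), float(cy), 0.0  # empty window: stay in place
    return m10 / m00, m01 / m00, m00


def mean_shift(weights, cx, cy, half, max_iter=50):
    """Move the window to its centre of gravity until it stops moving."""
    for _ in range(max_iter):
        gx, gy, _m00 = centre_of_gravity(weights, cx, cy, half)
        nx, ny = int(round(gx)), int(round(gy))
        if (nx, ny) == (cx, cy):
            break  # converged
        cx, cy = nx, ny
    return cx, cy


# 9x9 image with a 3x3 blob of mass centred on (6, 6).
image = [[1.0 if 5 <= x <= 7 and 5 <= y <= 7 else 0.0 for x in range(9)] for y in range(9)]
print(mean_shift(image, 5, 5, half=2))
```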
Tracking the face
Choosing Kernel’s size
The CAM-Shift Algorithm

Choosing the Kernel’s size
h / w = 1.2
w = sqrt(k * M00)
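The two relations above give the kernel size directly from the zeroth moment. In this sketch the value of k is a placeholder, since the slides do not give it, and the function name is hypothetical.

```python
import math


def camshift_window_size(m00, k=1.0, aspect=1.2):
    """Kernel size from the zeroth moment: w = sqrt(k * M00), h = aspect * w.

    `k` is a tuning constant whose value is not given on the slide (1.0 is assumed).
    """
    w = math.sqrt(k * m00)
    return w, aspect * w


print(camshift_window_size(400.0))
```

Because M00 grows with the number of skin pixels in the window, the kernel grows as the face approaches the camera and shrinks as it moves away.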
Boundary detection problem in CAM-Shift Algorithm
Boundary detection
Enlarge
Shrink
No Change

Boundary
information
Kernel resizing: Edge-density linear
ni = density of the boundary of the kernel
if (ni > UpperThreshold) then
    enlarge the kernel by 1
else if (ni < LowerThreshold) then
    shrink the kernel by 1
else
    // no resize
end if
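The resizing rule above can be sketched in Python. The slides do not define how the boundary density is measured, so this sketch assumes it is the fraction of skin pixels on the rectangular border of the kernel; the threshold values and function names are also assumptions.

```python
def boundary_density(mask, left, top, width, height):
    """Fraction of pixels on the kernel's rectangular border that are set in `mask`."""
    right, bottom = left + width - 1, top + height - 1
    border = set()
    for x in range(left, right + 1):
        border.update({(x, top), (x, bottom)})
    for y in range(top, bottom + 1):
        border.update({(left, y), (right, y)})
    hits = sum(
        1
        for x, y in border
        if 0 <= y < len(mask) and 0 <= x < len(mask[0]) and mask[y][x]
    )
    return hits / len(border)


def resize_step(density, lower, upper):
    """Return +1 to enlarge the kernel, -1 to shrink it, 0 for no change."""
    if density > upper:
        return 1
    if density < lower:
        return -1
    return 0


mask = [[True] * 5 for _ in range(5)]
ni = boundary_density(mask, 1, 1, 3, 3)
print(ni, resize_step(ni, lower=0.2, upper=0.6))
```

A border that is mostly skin suggests the kernel sits inside the object and should grow, while a border with almost no skin suggests it is larger than the object and should shrink.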

Kernel resizing: Edge-density Fuzzy
Convergence speed of these three
approaches
Boundary detection in noisy environment
Face Tracking
Questions?
For more information:
Email: F.Dadgostar@massey.ac.nz
Website: www.massey.ac.nz/~fdadgost