An Introduction to Digital Image Processing and Applications in AI
Part 2: Hand and Face Tracking
Farhad Dadgostar

Applications of hand, face and body tracking
- Gesture recognition (sign language recognition, human-computer interaction)
- Virtual reality
- Security and surveillance
- Movement analysis

Approaches to hand and face tracking and gesture recognition
- Special hardware (e.g. CyberGlove)
- Marker tracking
- Vision-based approach

Main approaches to vision-based face and hand tracking
- Pattern recognition (neural networks, statistical analysis, etc.)
  - Has been successful in face detection, but not in hand detection, because the hand takes many different forms in 2D images
- Skin color segmentation (segmentation using different color spaces)
  - More successful in hand tracking and gesture recognition

Pixel-based approach to skin segmentation
Advantages
- Very fast (a few computations per pixel)
- Good candidate for real-time tracking
Disadvantages
- Background noise may make the technique impractical
- Cannot distinguish skin from other objects whose color features are similar to skin

Methods of implementation
- Using a set of training data to mark out a region of the color space, then using that region for detection
  - RGB, RG
  - ICrCb, CrCb
  - HSV, HSI, HS, H (Hue, Saturation and Intensity)
  - IUV
  - Requires less computation
- Recognizing an object (e.g. hand or face) and using the color of that object to detect skin in the image
  - e.g. face detection using the Viola-Jones method
  - Requires more computation

Advantages of pixel-based skin detection based on the hue factor
1) It is one-dimensional, so only 2 thresholds are needed to specify the skin color region
2) It is robust against intensity changes
3) Different skin colors fall in the same range of the hue factor
[Figure: Hue histogram of the skin]

Disadvantages of pixel-based skin detection based on the hue factor
- As with other color-based techniques, background noise may cause a high percentage of false detections

The idea
- A global skin detector can be implemented using hue thresholding
- The hue thresholds of each person's skin color lie between the thresholds of the global skin detector
- Detecting in-motion skin pixels in a video sequence can help to estimate the local skin thresholds

The algorithm, Step 1: Selecting candidate skin pixels
- Extract the candidate pixels using the global skin detector (the output has a high percentage of correct detections and may also have a high percentage of false detections)

The algorithm, Step 2: Detecting in-motion skin pixels
- Extract the in-motion pixels that potentially belong to skin using frame subtraction

The algorithm, Step 3: Retraining
- Retrain the local skin detector (which initially has the same parameters as the global skin detector) on the data extracted in the previous step.
\[ H_{n+1} = (1 - A)\, H_n + A\, H_M \]
- H_n is the training histogram of the local skin detector
- H_M is the histogram of the in-motion pixels detected by the global skin detector
- A is the merging factor (a small value, around 0.05)

The algorithm, Step 4: Filtering skin pixels
\[ f(I) = \begin{cases} \text{true} & \text{if } T_L(H_n) \le I \le T_U(H_n) \\ \text{false} & \text{otherwise} \end{cases} \]
- T_L is the lower hue threshold
- T_U is the upper hue threshold

[Figure: Adaptive skin detector (overview)]

Behavior of the algorithm
- The local skin detector adapts itself to the skin color in the image sequence and improves over time
[Figure: detection results at Frame 1, Frame 1400 and Frame 2700]

Changes in correct and false detection
[Figure: Correct detection over time. x-axis: frame no. (25 to 625); y-axis: correct detection accuracy (0 to 1)]
[Figure: False detection over time. x-axis: frame no. (25 to 625); y-axis: false detection (0 to 0.16)]

[Figure: Simple boundary detection]
[Figure: Presence of noise]

Finding the biggest blob
[Figure: 3D representation of the counting, for window size 21x21 and window size 60x60]

Number of required computations for the simple method
- Image size: 640x480
- Window size: 51x51
- Frame rate: 25 fps
- 640 x 480 x 51 x 51 x 25 ≈ 20,000,000,000 operations per second!
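Steps 1 to 4 of the adaptive skin detector above can be sketched in code as follows. This is a minimal, dependency-free sketch under stated assumptions, not the author's implementation: the function names, the 180-bin hue quantisation, the frame-difference threshold of 15 and, in particular, the way T_L and T_U are derived from the histogram (cutting off a small tail on each side) are all illustrative choices that the slides do not specify.

```python
HUE_BINS = 180  # assumption: hue quantised to integers 0..179


def hue_histogram(hues, bins=HUE_BINS):
    """Normalised histogram of integer hue values (sums to 1 if non-empty)."""
    counts = [0.0] * bins
    for h in hues:
        if 0 <= h < bins:
            counts[h] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def retrain(h_n, h_m, a=0.05):
    """Step 3: H_{n+1} = (1 - A) * H_n + A * H_M, applied bin by bin."""
    return [(1.0 - a) * x + a * y for x, y in zip(h_n, h_m)]


def is_skin(hue, t_lower, t_upper):
    """Steps 1 and 4: true if T_L <= hue <= T_U, false otherwise."""
    return t_lower <= hue <= t_upper


def in_motion(prev_gray, curr_gray, diff_threshold=15):
    """Step 2: frame subtraction for one pixel (threshold value is an assumption)."""
    return abs(curr_gray - prev_gray) > diff_threshold


def thresholds_from_histogram(hist, tail=0.05):
    """ASSUMPTION (not from the slides): pick T_L and T_U so that a fraction
    `tail` of the histogram mass is cut off on each side."""
    total = sum(hist)
    if total == 0:
        return 0, len(hist) - 1
    cumulative = 0.0
    t_lower, t_upper = 0, len(hist) - 1
    found_lower = False
    for i, value in enumerate(hist):
        cumulative += value / total
        if not found_lower and cumulative >= tail:
            t_lower, found_lower = i, True
        if cumulative >= 1.0 - tail:
            t_upper = i
            break
    return t_lower, t_upper


def update_local_detector(h_n, pixels, global_lower, global_upper, a=0.05):
    """One frame of Steps 1 to 3.

    `pixels` is an iterable of (hue, prev_gray, curr_gray) tuples.
    Returns the retrained histogram, or h_n unchanged if nothing moved.
    """
    samples = [hue for hue, prev, curr in pixels
               if is_skin(hue, global_lower, global_upper) and in_motion(prev, curr)]
    if not samples:
        return h_n
    return retrain(h_n, hue_histogram(samples), a)
```

Plain lists keep the sketch self-contained; a practical version would more likely operate on whole image arrays. In use, `update_local_detector` would be called once per frame, and `thresholds_from_histogram` would then supply the T_L and T_U passed to `is_skin` for Step 4.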
The Mean-Shift Algorithm
[Equations: Centre of gravity (general); Centre of gravity (2D); Zeroth moment]

The Mean-Shift Algorithm (Summary)
1) Place a search window at an arbitrary point of the space
2) Treat the value of the function at each point of the search window as the mass of that point
3) Compute the centre of gravity of the search window
4) Shift the search window so that its centre coincides with the centre of gravity
5) Repeat from step 2 until convergence

[Figure: Tracking the face]
[Figure: Choosing the kernel's size]

The CAM-Shift Algorithm: Choosing the kernel's size
- h / w = 1.2
- w = sqrt(k * M00)

Boundary detection problem in the CAM-Shift Algorithm
Boundary detection
- Enlarge
- Shrink
- No change
[Figure: Boundary information]

Kernel resizing: Edge-density linear
    ni = density of the boundary of the kernel
    if (ni > UpperThreshold) then
        enlarge the kernel by 1
    elseif (ni < LowerThreshold) then
        shrink the kernel by 1
    else
        // no resize
    endif

Kernel resizing: Edge-density fuzzy
[Figure: Convergence speed of these three approaches]
[Figure: Boundary detection in noisy environment]
[Figure: Face tracking]

Question?
For more information:
Email: F.Dadgostar@massey.ac.nz
Website: www.massey.ac.nz/~fdadgost
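The mean-shift summary, the CAM-Shift kernel sizing rule and the linear edge-density resizing rule above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the image is assumed to be a 2D list of non-negative weights (for example a skin mask), the window centre is rounded to whole pixels, and the slides give no value for k, so it is left as a parameter. The centre of gravity is computed in the usual way from the zeroth moment M00 and the first moments M10 and M01.

```python
import math


def window_moments(image, cx, cy, half_w, half_h):
    """Zeroth and first moments (M00, M10, M01) of the window centred on (cx, cy).

    Each pixel value is treated as the mass of that point (step 2).
    """
    rows, cols = len(image), len(image[0])
    m00 = m10 = m01 = 0.0
    for y in range(max(0, cy - half_h), min(rows, cy + half_h + 1)):
        for x in range(max(0, cx - half_w), min(cols, cx + half_w + 1)):
            mass = image[y][x]
            m00 += mass
            m10 += x * mass
            m01 += y * mass
    return m00, m10, m01


def mean_shift(image, cx, cy, half_w, half_h, max_iter=50):
    """Steps 1 to 5: move the window to its centre of gravity until it stops moving."""
    for _ in range(max_iter):
        m00, m10, m01 = window_moments(image, cx, cy, half_w, half_h)
        if m00 == 0:
            break  # empty window: nothing to follow
        new_cx, new_cy = round(m10 / m00), round(m01 / m00)  # step 3
        if (new_cx, new_cy) == (cx, cy):
            break  # converged
        cx, cy = new_cx, new_cy  # step 4
    return cx, cy


def camshift_kernel_size(m00, k):
    """Kernel size from the slides: w = sqrt(k * M00), h = 1.2 * w."""
    w = math.sqrt(k * m00)
    return w, 1.2 * w


def resize_kernel(size, edge_density, lower_threshold, upper_threshold):
    """Linear edge-density rule: enlarge, shrink or keep the kernel size."""
    if edge_density > upper_threshold:
        return size + 1
    if edge_density < lower_threshold:
        return max(1, size - 1)  # assumption: never shrink below 1
    return size
```

For example, on a 20x20 image that is zero everywhere except a 3x3 block of ones centred at x = 12, y = 8, calling `mean_shift(image, 9, 6, 4, 4)` moves the window from (9, 6) onto the block.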