Precise Object Tracking under Deformation

Prepared by: Eng. Mohamed Hassan, EAEA
Supervised by: Prof. Dr. Hussien Konber, Al Azhar University; Prof. Dr. Mohamoud Ashour, EAEA; Dr. Ashraf Aboshosha, EAEA
Submitted to: Communication & Electronics Dept., Al Azhar University

Outline
Key subjects of this framework include:
- Motivation
- Visual tracking applications
- Block diagram of the object tracking system
- Image deformation types
- Object extraction
- Morphological operations
- Geometrical modeling and pose estimation
- Conclusion and future work

Motivation
The main objectives of this research work are to:
- Overcome the imprecision in object tracking caused by different deformation sources such as noise, change of illumination, blurring, scaling and rotation.
- Develop a three-dimensional (3D) geometrical model that determines the current pose of an object and predicts its future location based on an FIR model.
- Present a robust ranging technique that tracks a visual target as a substitute for traditional, expensive ranging sensors.

Visual Tracking Applications
Precise object tracking is an essential issue in several applications, such as:
- Robot vision
- Automated surveillance (civil and military)
- Medical applications
- Satellite and space systems
- Traffic systems
- Security, etc.

Block Diagram of the Object Tracking System
A video camera feeds a frame grabber, or a USB camera feeds the USB bus, of a PC; image acquisition is followed by image processing, whose output is the tracked target.

Image Deformation Types
- Noise
- Scaling and rotation
- Blurring
- Change of illumination

Image Deformation: Noise
Definition: noise is any measurement that is not part of the phenomenon of interest. Images are affected by different types of noise:
- Gaussian noise
- Salt-and-pepper noise
- Poisson noise
- Speckle noise

Image Denoising Techniques
The following digital filters have been employed for denoising:
- Linear filters (average filter, Gaussian filter and unsharp filter)
- Nonlinear filters (median filter and adaptive filter)
- Coiflet wavelets
- The proposed filter

Spatial Filters
Spatial filtering denotes filtering operations that are performed directly on the pixels of an image. The process consists simply of moving the filter mask from point to point in the image; at each point (x, y) the response of the filter is calculated using a predefined relationship.

Linear Spatial Filters
A 3x3 mask of coefficients w(s, t), with s, t in {-1, 0, 1}, is centred over the image pixels f(x-1, y-1) through f(x+1, y+1). The result is the sum of products of the mask coefficients with the corresponding pixels directly under the mask:

\[ R(x, y) = \sum_{s=-1}^{1} \sum_{t=-1}^{1} w(s, t)\, f(x + s,\, y + t) \]

Nonlinear Spatial Filters
Nonlinear spatial filters also operate on neighborhoods, and the mechanics of sliding a mask past an image are the same as just outlined. The filtering operation, however, is based conditionally on the values of the pixels in the neighborhood under consideration. Order-statistics filters are nonlinear spatial filters whose response is based on ordering (ranking) the pixels in the neighborhood.

Wavelet Transform
The wavelet transform is a multiresolution analysis tool that decomposes a signal into different frequency sub-bands. Owing to its excellent localization, it has rapidly become an indispensable signal- and image-processing tool for a variety of applications. Wavelet denoising attempts to remove the noise present in the signal while preserving the signal characteristics, regardless of its frequency content.
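To make the denoising stage concrete, the following is a minimal sketch of a median-then-Coiflet cascade in the spirit of the proposed filter introduced on the next slide. It assumes PyWavelets (pywt) and SciPy are available; the wavelet name ('coif3'), decomposition depth, universal soft threshold and 3x3 window are illustrative assumptions, not the exact settings of this work.

```python
import numpy as np
import pywt
from scipy.ndimage import median_filter

def cascaded_denoise(image, wavelet="coif3", level=2, window=3):
    """Median filter followed by Coiflet wavelet shrinkage (illustrative sketch)."""
    # Stage 1: the median filter suppresses impulsive (salt-and-pepper) noise.
    smoothed = median_filter(image.astype(float), size=window)

    # Stage 2: wavelet shrinkage attenuates the remaining Gaussian-like noise.
    coeffs = pywt.wavedec2(smoothed, wavelet, level=level)
    # Estimate the noise level from the finest diagonal sub-band (robust MAD rule).
    sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    # Universal threshold; soft thresholding tends to preserve edges better.
    thresh = sigma * np.sqrt(2.0 * np.log(smoothed.size))
    new_coeffs = [coeffs[0]] + [
        tuple(pywt.threshold(band, thresh, mode="soft") for band in detail)
        for detail in coeffs[1:]
    ]
    return pywt.waverec2(new_coeffs, wavelet)
```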
Wavelet Transform (cont.)
Figure 1. The two-dimensional FWT: the analysis filter bank.
Figure 2. Two scales of the two-dimensional decomposition.

Proposed Denoising Filter
The proposed filter is a cascaded spatial filter based on a median filter followed by Coiflet wavelets. Its edge-preserving nature makes it useful in cases where edge blurring is undesirable, and hence in real object tracking. In the comparisons below it yields the best scores for every tested noise type.

Figure 3. Cascaded spatial filter based on the median filter and Coiflet wavelets: input image -> median filter -> Coiflet wavelets -> output image.

Image Similarity Measures
To validate the efficiency of the preceding digital filters, the following similarity measures have been applied.

2D cross-correlation:

\[ r = \frac{\sum_{i=1}^{n} (x_i - m_x)(y_i - m_y)}{\sqrt{\sum_{i=1}^{n} (x_i - m_x)^2}\; \sqrt{\sum_{i=1}^{n} (y_i - m_y)^2}} \]

Peak signal-to-noise ratio (PSNR), in dB:

\[ \mathrm{PSNR} = 20 \log_{10}\!\left( \frac{\mathrm{Max}_I}{\sqrt{\mathrm{MSE}}} \right), \qquad \mathrm{MSE} = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \big[ I(i,j) - K(i,j) \big]^2 \]

Table 1. 2D cross-correlation similarity measure

Noise type        Unsharp   Average   Gaussian   Median   Adaptive   Proposed
Salt & pepper     0.9234    0.9890    0.6983     0.9809   0.7804     0.9984
Gaussian          0.5651    0.9861    0.9446     0.9701   0.9701     0.9876
Poisson           0.8270    0.9920    0.9900     0.9910   0.9913     0.9961
Speckle           0.6349    0.9879    0.7737     0.8341   0.8547     0.9871

Table 2. PSNR similarity measure (dB)

Noise type        Unsharp   Average   Gaussian   Median   Adaptive   Proposed
Salt & pepper     18.59     25.49     26.16      36.00    22.97      49.48
Gaussian          9.94      27.37     23.80      26.42    26.79      32.80
Poisson           14.74     28.71     30.21      31.92    32.80      43.16
Speckle           10.86     26.73     25.38      26.71    27.59      37.67

Scaling & Rotation
Definition: scaling and rotation are affine transformations, under which straight lines remain straight and parallel lines remain parallel. The linear transformation and the Radon transformation have been used to recover an image from a rotated and scaled original.

Figure 4. Original image, scaled image, and scaled-and-rotated image.

Linear Transformation
Figure 5. Control point selection.
Figure 6. Original image, scaled-and-rotated image, and the image recovered by using the linear transformation.

Radon Transformation
The Radon transform maps a two-dimensional image containing lines into a domain of possible line parameters, where each line in the image gives a peak positioned at the corresponding line parameters. Projections can be computed along any angle θ by the general equation of the Radon transformation:

\[ R_{\theta}(x') = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, \delta(x \cos\theta + y \sin\theta - x')\, dx\, dy \]

where δ(·) is the delta function, x' is the perpendicular distance of the beam from the origin, and θ is the angle of incidence of the beams. One way to realize rotation recovery with this transform is sketched below.
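This work recovers rotation via the Radon transform; the sketch below shows one common way to do so, assuming scikit-image is available. It takes the Radon transform of an edge map (as in Figure 7) and picks the projection angle with the largest variance, which marks the dominant line orientation. The peak-variance criterion is an assumption of this sketch, not necessarily the exact criterion used in the thesis.

```python
import numpy as np
from skimage.feature import canny
from skimage.transform import radon

def estimate_rotation(image):
    """Estimate the dominant line orientation of a grayscale image
    via the Radon transform of its edge map."""
    edges = canny(image)                      # edge detection, as in Figure 7
    theta = np.arange(-90.0, 90.0)            # projections from -90 to +89 degrees
    sinogram = radon(edges.astype(float), theta=theta, circle=False)
    # A straight line concentrates into a sharp peak when projected along
    # its own direction, so the projection with the largest variance
    # corresponds to the rotation angle.
    return theta[np.argmax(np.var(sinogram, axis=0))]
```

The recovered angle can then be undone with any standard image-rotation routine to restore the original orientation.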
Radon Transformation (cont.)
Figure 7. Canny edge detection and edge linking: original image, edge detection, edge linking.
Figure 8. Radon transform projections along 180 degrees, from -90 to +89.
Figure 9. Original image, rotated image, and the image recovered by using the Radon transform.

Blurring
Blurring is a degradation of an image that can be caused by motion. There are two types of blurring:
- Known blurring: the length and the angle of the blur are known.
- Unknown blurring: the length and the angle of the blur are unknown.

Deblurring Techniques
A blurred or degraded image can be approximately described by the equation

\[ g = Hf + n \]

where g is the blurred image, H is the distortion operator, f is the original image and n is additive noise. The following deblurring methods have been studied:
- Deblurring using the Wiener filter
- Deblurring using a regularized filter
- Deblurring using the Lucy-Richardson algorithm
- Deblurring using the blind deconvolution algorithm

Figure 10. Deblurring using the blind deconvolution algorithm.

Figure 11. Capability of object tracking under blurring with a known blur function (a: blurred image, b: person detection under motion deformation) and after deblurring (c: deblurred image, d: person detection in the deblurred image).

Figure 12. 2D cross-correlation with the deblurring forms: the blurred image, the image deblurred with the correct parameters, the image deblurred with a longer PSF, and the image deblurred with a different angle, each correlated with the original.

Table 3. 2D cross-correlation of the deblurring forms

Condition                                            Correlation
Blurred image vs. original                           0.0614
Deblurred vs. original, correct parameters           0.3523
Deblurred vs. original, longer PSF                   0.0558
Deblurred vs. original, different angle              0.1231

Change of Illumination
Color-model deformation may happen due to a change in illumination. The proposed solution is to select an appropriate color model (RGB, HSV or YCbCr) that overcomes the deformation problem.

RGB Representation
RGB is a representation of additive color mixing; the RGB color model maps to a cube. Weak points of the RGB color model:
- It is affected by changes of illumination.
- It is a non-uniform color model.

HSV Representation
Hue, saturation and value (intensity) are often plotted in cylindrical coordinates, with hue as the angle, saturation as the radius and intensity as the axis. (Illustrations: conical representation of HSV, cylindrical representation of HSV, and the HSV color wheel.)

YCbCr Color Model
Chrominance is defined as the difference between a color and a reference white at the same luminance.

The conversion from RGB to YCbCr:

\[ \begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix} = \begin{bmatrix} 0.257 & 0.504 & 0.098 \\ -0.148 & -0.291 & 0.439 \\ 0.439 & -0.368 & -0.071 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} + \begin{bmatrix} 16 \\ 128 \\ 128 \end{bmatrix} \]

The conversion from YCbCr to RGB:

\[ \begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 1.164 & 0.000 & 1.598 \\ 1.164 & -0.329 & -0.813 \\ 1.164 & 2.017 & 0.000 \end{bmatrix} \begin{bmatrix} Y - 16 \\ C_b - 128 \\ C_r - 128 \end{bmatrix} \]

Advantages of YCbCr
The main advantages of this model are:
- The luminance component (Y) of YCbCr is independent of the color.
- The skin-color cluster is more compact in YCbCr than in other color spaces.
- YCbCr has the smallest overlap between skin and non-skin data under various illumination conditions.
- YCbCr is broadly utilized in video compression standards.
- YCbCr is a family of color spaces used in video systems; it is one of the two primary color spaces used to represent digital component video (the other is RGB).
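The conversion matrix above translates directly into code. The sketch below transcribes the slide's RGB-to-YCbCr matrix in NumPy and shows how a chrominance-only threshold yields an illumination-insensitive target mask; the (Cb, Cr) bounds are placeholder values for illustration, since a real color table would be learned from samples of the tracked object.

```python
import numpy as np

# RGB -> YCbCr conversion matrix and offset, as given on the slide above.
M = np.array([[ 0.257,  0.504,  0.098],
              [-0.148, -0.291,  0.439],
              [ 0.439, -0.368, -0.071]])
OFFSET = np.array([16.0, 128.0, 128.0])

def rgb_to_ycbcr(rgb):
    """Convert an (H, W, 3) RGB image with 0-255 channels to YCbCr."""
    return rgb.astype(float) @ M.T + OFFSET

def extract_target(rgb, cb_range=(100, 130), cr_range=(135, 175)):
    """Chrominance-only thresholding: Y is ignored, so the mask stays
    largely insensitive to illumination changes. The bounds here are
    hypothetical placeholders, not the thesis's learned color table."""
    ycbcr = rgb_to_ycbcr(rgb)
    cb, cr = ycbcr[..., 1], ycbcr[..., 2]
    return ((cb >= cb_range[0]) & (cb <= cb_range[1]) &
            (cr >= cr_range[0]) & (cr <= cr_range[1]))
```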
Object Extraction
To track a visual target we have to rely on a segmentation technique such as:
- Thresholding
- Clustering
- Region growing
- Edge-based segmentation
- Physical model-based segmentation
- Frame subtraction
- Fast block matching
Throughout this framework a color-table thresholding segmentation technique has been applied to extract the visual target.

Homogeneous Object Extraction
Figure 13. Comparison of homogeneous object extraction: original image, tracked object sample, and extraction results in RGB, HSV and YCbCr.

Inhomogeneous Object Extraction
Figure 14. Comparison of inhomogeneous object extraction: original image, tracked object sample, and extraction results in RGB, HSV and YCbCr.

Morphological Operations
The most basic morphological operations are dilation and erosion.
Dilation adds pixels to the boundaries of objects in an image:
- Expands/enlarges objects in the image
- Fills gaps or bays of insufficient width
- Fills sufficiently small holes
- Connects objects separated by a distance less than the size of the structuring window
Erosion removes pixels on object boundaries: it erodes away the boundaries of regions of foreground pixels (i.e. white pixels, typically), so areas of foreground pixels shrink in size and holes within those areas become larger.

Opening and closing are morphological operations based on dilation and erosion. Opening smoothes the contours of objects, breaks narrow isthmuses and eliminates thin protrusions. Closing also smoothes sections of contours, but fuses narrow breaks, fills gaps in the contour and eliminates small holes. Opening is basically erosion followed by dilation, while closing is dilation followed by erosion.

Figure 15. The effect of the morphological operations: binary object, binary object after dilation (with holes), binary object after removing extra pixels, and binary object after closing.

Figure 16. Center of gravity, ellipse fitting and bounding box of an image.

Geometrical Modeling
Figure 17. Object tracking at different distances.

The relation between the distance D and the number of pixels N of the target projection is

\[ N = a e^{bD}, \qquad a = 30606.621, \quad b = -0.03410108 \]

(a worked numerical sketch of inverting this relation appears after the conclusion).

Figure 18. The relation between range (D) and projection size (N).
Figure 19. The relation between the range and the location of the object in the 3D domain.

Motion Estimation and Prediction Based on FIR
Figure 20. FIR model structure.

\[ y(t) = \sum_{i=1}^{n} a_i\, u(t - i) + e(t) = a^{T} u(t) + e(t) \]

Figure 21. Model output with respect to the system output.
Figure 22. Model output with respect to the system output.
Figure 23. The capability of the model to predict the output when the system input is known.

Conclusion and Future Work
Throughout this framework the following academic tasks have been achieved:
- Developing a novel universal filter for image denoising
- Selecting the Radon transformation as a qualitative means of correcting rotation
- Conducting an intensive comparative study of deblurring under known/unknown blurring
- Employing a color-table thresholding segmentation technique on YCbCr to extract the visual target
- 3D geometrical modeling for estimation and prediction of the target pose
As future work, we are going to implement the applied algorithms on an embedded system to develop a visual RADAR system.
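As the worked example promised in the Geometrical Modeling section: inverting N = a e^{bD} gives D = ln(N / a) / b, so the target range can be read directly from the pixel count of the segmented blob. This is a minimal sketch using the fitted constants reported with Figure 18; pixel_count is assumed to come from the segmentation and morphology stages, and the result is in the units of the original calibration.

```python
import math

# Fitted constants of the range model N = A * exp(B * D) (Figure 18).
A = 30606.621
B = -0.03410108

def range_from_projection(pixel_count):
    """Invert N = A * exp(B * D) to recover the target range D
    from the pixel count N of the segmented target blob."""
    return math.log(pixel_count / A) / B

# Example: a blob of 5000 pixels gives
# D = ln(5000 / 30606.621) / (-0.0341...) ~ 53.1 (calibration units).
print(range_from_projection(5000))
```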