2006 IEEE Intelligent Transportation Systems Conference, Toronto, Canada, September 17-20, 2006

A Low-Cost Occlusion Handling Using a Novel Feature in Congested Traffic Images

M. Sadeghi and M. Fathy

Maryam Sadeghi, B.Sc. Student, Computer Engineering Department, Iran University of Science and Technology, Narmak, Tehran, Iran (e-mail: ma_sadeghi@comp.iust.ac.ir). Mahmoud Fathy, Associate Professor, Computer Engineering Department, Iran University of Science and Technology, Narmak, Tehran, Iran (e-mail: mahfathy@iust.ac.ir).

Abstract—Occlusion detection is critical for robust and reliable vision-based systems for traffic flow analysis. In this paper we present a novel occlusion detection approach whose goal is to detect vehicles in congested traffic scenes while handling occlusion. We also propose a new feature for vehicle detection that avoids merging two or more objects into one and improves the accuracy of object localization: the strong shadow that exists under any vehicle. Many sequences of real images (noisy, cluttered, and containing prominent shadows and occlusions), taken from both outdoor highway and city road scenes, have been used to evaluate our method.

I. INTRODUCTION

INTELLIGENT traffic surveillance systems are assuming an increasingly important role in highway monitoring and city road management systems. Their purpose, amongst other things, is to provide statistical data on traffic activity, such as vehicle density, and to signal potentially abnormal situations. This paper addresses the problem of vehicle segmentation in traffic images in the presence of vehicle occlusion.

Some related works analyze surveillance images using background subtraction methods [1], [2], and methods for occlusion handling in video streams are presented in [3]-[5]. Occlusion detection can be performed using an extended Kalman filter that predicts the position and size of object bounding regions; any discrepancy between the predicted and measured areas can be used to classify the type and extent of an occlusion [4], [5]. Early attempts to solve the occlusion problem involved simple thresholding, while later methods applied energy minimization [6] or Markov random field (MRF) models [7]. To find the number of motion classes, K-means clustering [8] and mixture models [9] were used. More recently, motion segmentation methods based on active contours [10] have been proposed. The original active-contour formulation [11] suffers from stability problems and fixed topology. These issues can be resolved by embedding the contour into a higher-dimensional function of which it is a zero-level set [12]. An alternative is to consider all intensities in a region, as proposed by Mansouri and Konrad [13], Jehan-Besson et al. [14], and Debreuve et al. [15]. Early works using multiple frames include motion detection using 3-D MRF models [16] and "video-cube" segmentation based on marker selection and volume growing [17]. Object tracking as spatio-temporal boundary detection has been proposed by El-Feghali et al. [18], [19]. This multiple-image framework has led to interesting space-time image sequence segmentation methods developed by Mansouri et al. [20], [21] and, independently, by the authors of [22]. A straightforward blob tracking algorithm is employed in [23].
The advantage of part-based methods is shown in [24], and the predictive Trajectory Merge-and-Split (PTMS) algorithm [25] has been developed to detect partial or complete occlusions during object motion.

Traditionally, shadow detection techniques have been employed to remove shadows from the background and foreground; in our work, however, shadow is a key feature that plays a very important positive role in vehicle detection and occlusion handling. Shadows provide relevant information about the scene represented in an image or a video sequence. Until now, shadow has been an obstacle to good object detection, causing erroneous segmentation of objects in the scene; we instead use a special type of shadow, the strong shadow that exists under a vehicle, as a key feature for vehicle detection and occlusion handling.

The problem of shadow detection has been increasingly addressed over the past years. Shadow detection techniques can be classified into two groups: model-based and property-based techniques. Model-based techniques are designed for specific applications, such as aerial image understanding [26] and video surveillance [27]. Early techniques exploit luminance information by analyzing edges [28] and texture [29]. Luminance, chrominance and gradient density information is used in [30], and color information is also used in [31]. A physics-based approach to distinguish material changes from shadow boundaries in chromatic still images is presented in [32], whose authors later proposed shadow-aware object-based video processing in [37]. A classification of color edges by means of photometric invariant features into shadow-geometry edges, highlight edges, and material changes is proposed in [33]. In [34], cast shadows are segmented using invariant color features. In our method, contrary to previous algorithms, shadow is a useful feature for detecting vehicles: we use the photometric characteristics of the darkest pixels of strong shadows in traffic images for occlusion handling.

II. THE PROPOSED APPROACH

During occlusion, visual features of the occluded objects are not observed and the objects cannot be tracked. We propose a two-step approach to handle occlusions. The first step detects the novel feature used for vehicle detection, and the second step converts the image to binary format and applies morphological techniques and suitable filters.

A. Novel Feature

Our approach is directly based on the physical characteristics of shadow pixels. The pixels that are observable under a vehicle and between its wheels are proposed as a novel feature. This strong shadow is the darkest part of the image, and other shadows do not influence our algorithm because we use an intelligent low-cost approach to determine the useful shadow. Fig. 1 shows the shadow feature present in any vehicle image taken by a traffic monitoring camera, and Fig. 2 highlights this feature in real traffic images.

Fig. 1. (a) and (b): two different views of our proposed shadow feature.

Fig. 2. The shadow feature is observable in real traffic images.

B. Theoretical Background

What is a shadow? A shadow occurs when an object partially or totally occludes direct light from a source of illumination. Shadows can be divided into two classes: self and cast shadows. A self shadow occurs on the portion of an object that is not illuminated by direct light. A cast shadow is the area projected by the object in the direction of the direct light. In the following, the relationship between shadow and lit regions is formalized in order to derive relevant shadow properties.

Photometric and spectral properties of shadows. To describe the spectral appearance of a surface in shadow, let us consider the physics of color generation. The appearance of a surface is the result of the interaction among illumination, surface reflectance properties, and the responses of a chromatic mechanism; in a color camera, this chromatic mechanism is composed of three color filters.
To model the physical interaction between illumination and the surfaces of objects, let us consider the Dichromatic Reflection Model [34], [35]. The radiance [34]-[36] of the light, Lr(λ, p), reflected at a given point p on a surface in 3D space, given some illumination and viewing geometry, is formulated as [34]

Lr(λ, p) = La(λ) + Lb(λ, p) + Ls(λ, p),    (1)

where La(λ), Lb(λ, p), and Ls(λ, p) are the ambient reflection term, the body reflection term, and the surface reflection term, respectively, and λ is the wavelength. The ambient illumination term is assumed to account for all the light indirectly reflected among surfaces in the environment and does not vary with geometry. If there is no direct illumination because an object is obstructing the direct light, then the radiance of the reflected light is

Lr(shadow)(λ) = La(λ),    (2)

where Lr(shadow) represents the intensity of the light reflected at a point in a shadow region.

Let SR(λ), SG(λ) and SB(λ) be the spectral sensitivities of the red, green, and blue sensors of a color camera, respectively. The color components Ci(x, y), i ∈ {R, G, B}, of the reflected intensity reaching the sensors at a point (x, y) in the 2D image plane are

Ci(x, y) = ∫ E(λ, x, y) Sci(λ) dλ,    (3)

where Ci(x, y) are the sensor responses and E(λ, x, y) is the image irradiance [36] at (x, y). The interval of integration is determined by Sci(λ), which is non-zero over a bounded interval of wavelengths λ. Since image irradiance is proportional to scene radiance [34], [36], for a pixel position (x, y) representing a point p in direct light, the sensor measurements are

Ci(x, y)lit = α ∫ Lr(λ, p) Sci(λ) dλ,    (4)

giving a color vector C(x, y)lit = (Rlit, Glit, Blit), where α is the proportionality factor between radiance and irradiance. For a point in shadow the measurements are

Ci(x, y)shadow = α ∫ La(λ) Sci(λ) dλ,    (5)

in which C(x, y)shadow = (Rshadow, Gshadow, Bshadow). It follows that each of the three RGB color components, if positive and not zero, decreases when passing from a lit region to a shadowed one, that is,

Rshadow < Rlit,  Gshadow < Glit,  Bshadow < Blit.    (6)

Ambient light can have different spectral characteristics with respect to direct light [32], [34]; the intensity difference between shadow and the other parts of a surface is one of its photometric characteristics. In traffic scenes, the sun is the light source and vehicles are strong obstacles against it. With the above formulas we can model the under-vehicle region, which does not receive any direct sunlight: the shadow pixels of the under-vehicle region contain only La(λ) and therefore have very low intensity. All other surfaces in a traffic scene, even the surfaces of dark objects such as black or dark-colored vehicles, have the body reflection Lb(λ, p) and the surface reflection Ls(λ, p), so this region has different spectral properties.
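To make the property in (6) concrete, the following minimal MATLAB sketch flags pixels whose R, G and B components are all very low as candidate under-vehicle shadow; the file name 'frame.jpg' and the threshold value 30 are illustrative assumptions, not values from the paper.

% Candidate under-vehicle shadow: pixels whose R, G and B values are all
% very low, since such pixels receive only the ambient term La (Eq. 2).
rgb = imread('frame.jpg');     % H-by-W-by-3 uint8 traffic frame (assumed)
darkest = all(rgb < 30, 3);    % true where all three channels are below 30
imshow(darkest);               % white blobs mark candidate shadow regions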
Other shadows in the traffic scene also include the surface reflection Ls(λ, p) and have higher intensity than the under-vehicle shadow; we show later in the paper that these shadows are eliminated properly.

The size and direction of shadows vary with the movement of the light source (the sun) during the day, and traffic images are affected accordingly. We therefore need a robust approach to vehicle detection and occlusion handling that works effectively regardless of the light source position. The region under a vehicle has a very strong shadow, since vehicles are strong obstacles and this region receives only ambient light. Because of the low dependency of this feature on the light source, the sun's position has almost no effect on it, and this region always has the lowest intensity among the image pixels. This property is shown in Fig. 3.

Fig. 3. Very low dependency of the novel feature on the light source position.

The key element in the detection of this region is the camera viewpoint. The camera is assumed to be calibrated suitably to detect the strong shadow of the vehicle; a view of the camera position is shown in Fig. 4. If the camera's angle with the horizontal line (θ) is 90 degrees, the algorithm cannot work properly, so the camera position and viewing angle should be regulated so that the feature remains observable. For internal camera calibration, we can modify the detection region by regulating the α angle, which is a property of the CCD camera; for external calibration, the height and the θ angle should be regulated. Most of the cameras used for traffic monitoring or control satisfy these conditions.

Fig. 4. Camera viewpoint for feature detection.

C. Colormaps

Because of the very low intensity of the feature region, converting the image format from Index to Grayscale with a specific colormap can reveal this feature. A colormap is equivalent to paint-by-numbers: it is a table that contains a small set of color definitions, these colors are the only ones used in the image being produced, and the application maps the value at each pixel to an allowed color from the colormap. Matlab has several colormaps with 64 levels, each level containing the RGB values that define one color. The Cool colormap of Matlab is shown in Fig. 5.

Fig. 5. Cool colormap of Matlab.

A colormap is a common scalar visualization technique that maps scalar data to colors and provides a convenient method for specifying colors through their RGB components. If the intensity of some pixels is very low, we can find those pixels through colormap conversion. Many default colormaps are available in Matlab, and they can also be customized or created with the Matlab Colormap Editor. A scalar value s in [min, max] serves as an index i into the color lookup table, i.e., the colormap:

i = 0                                 if s < min,
i = n - 1                             if s > max,
i = (n - 1)(s - min) / (max - min)    otherwise,

where n is the length of the colormap (typically n = 256). Our experiments demonstrated that the best way to expose the different spectral properties of the novel feature (the strong under-vehicle shadow) is format conversion with a specific colormap: since the pixels of this part of the image have the lowest intensity in the image, after conversion they map to the minimum peak of the colormap.
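A minimal MATLAB sketch of this Index-to-Grayscale conversion; the input file name and the choice of the built-in Cool colormap are illustrative assumptions:

% Index the RGB frame against a specific 64-level colormap, then convert
% the indexed image to grayscale so that the darkest (under-vehicle
% shadow) pixels map to the minimum peak of the colormap.
rgb  = imread('frame.jpg');
map  = cool(64);              % a built-in 64-level Matlab colormap
idx  = rgb2ind(rgb, map);     % RGB -> Index against this colormap
gray = ind2gray(idx, map);    % Index -> Grayscale conversion
imshow(gray);                 % the under-vehicle shadow stands out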
In Section II-B we showed that

Rshadow < Rlit,  Gshadow < Glit,  Bshadow < Blit.

Therefore the pixels of the strong shadow have the minimum RGB values and lie in the lowest part of the index image table. When we convert a traffic image from Index format to Grayscale format, the minimum values of the index image are mapped to the minimum peak of the colormap, so we can distinguish the region that contains our feature. To demonstrate this fact, some of our experimental results are shown in Fig. 6.

Fig. 6. Influence of format conversion from Index to Grayscale with different Matlab colormaps: (a) four objects created with Matlab under four different lighting styles (none, flat, Gouraud, Phong); (b) Index-to-Grayscale conversion with the Winter colormap; (c) Index-to-Grayscale conversion with the Pink colormap; (d) Index-to-Grayscale conversion with the Hot colormap.

When an RGB image is converted from Index to Grayscale using a specific colormap, the different spectral property of the under-vehicle region becomes precisely observable. We executed this process on the traffic scenes in Fig. 7.

Fig. 7. (a) Original RGB image; (b) Grayscale image using the Jet colormap (Index → Grayscale).

After determining this strong shadow, the resulting image is converted into a binary image using a proper threshold. Then a noise suppression algorithm is applied to the binary image and the blobs are counted. Due to perspective, the shadows at the far end of the image are too small or fragmented. This problem can be resolved by filling the holes, removing the narrow protrusions, and applying morphological operations such as dilation, erosion, closing and opening. Alternatively, instead of converting the image from Index format to Grayscale format, we can directly convert RGB into Grayscale (Fig. 8).

Fig. 8. (a) Original RGB image; (b) Grayscale image using the Jet colormap (RGB → Grayscale).

As is obvious in Fig. 7 and Fig. 8, the under-vehicle shadow can be used as an effective feature in vehicle detection; it is also useful in occlusion detection. Our detection is more reliable when using customized colormaps, which can be modified or created with the Matlab Colormap Editor. The final result of the algorithm is illustrated in Fig. 9, where we used a customized colormap modified from the VGA colormap.

Fig. 9. (a) Original RGB image; (b) Grayscale image using the customized colormap; (c) final result of the algorithm.
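The thresholding and morphological cleanup described above can be sketched in MATLAB as follows; the threshold, structuring-element radii, and minimum blob area are illustrative assumptions rather than the paper's reported settings, and gray is the Grayscale image produced by the conversion sketched earlier:

% Grayscale -> Binary, then noise suppression and blob counting.
bw = ~im2bw(gray, 0.15);              % keep only the darkest pixels
bw = imfill(bw, 'holes');             % fill holes inside fragmented shadows
bw = imopen(bw, strel('disk', 2));    % opening removes narrow protrusions
bw = imclose(bw, strel('disk', 5));   % closing merges pieces of one shadow
bw = bwareaopen(bw, 50);              % discard blobs smaller than 50 pixels
[labels, numVehicles] = bwlabel(bw);  % count the remaining blobs (vehicles)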
D. Experimental Results

To demonstrate the effectiveness of our algorithm, we have applied it to several real images. The input images are masked to determine the detection region. There may be situations where two vehicles in close proximity are first segmented correctly but are later segmented as one blob, due to partial occlusion or errors in the background image. There may also be cases where parts of a large vehicle (for example, a container carrier) are represented by separate blobs. These and similar situations cause inaccuracies in the measured traffic parameters. Therefore, in order to obtain more accurate detection, we first increase the image contrast and then apply morphological operations. Long sequences of real images (noisy, cluttered, and containing prominent shadows and occlusions), taken from both outdoor highway and city road scenes, have been considered.

The first image (Fig. 10) shows a single vehicle on a road, whereas Fig. 11 shows two occluded vehicles that our approach detects. In Fig. 12 there is a group of occluded vehicles in congested traffic, all detected accurately. Finally, Fig. 13 shows a bus on a road, detected like the other vehicles.

Fig. 10. (a) Original image; (b) masked image; (c) Grayscale converted from the Index image using the Cool colormap; (d) detected vehicle.

Fig. 11. (a) Two occluded vehicles; (b) Grayscale converted from the Index image using the Flag colormap; (c) detected vehicles.

Fig. 12. (a) Congested traffic scene; (b) Grayscale converted from the Index image using the Cool colormap; (c) binary image; (d) detected vehicles.

Fig. 13. (a) Bus image; (b) masked image; (c) increased contrast; (d) Grayscale converted from the Index image using the Cool colormap; (e) Grayscale converted from the Index image using a customized colormap; (f) binary detected vehicles.

III. CONCLUSION

In this paper we presented a novel occlusion detection approach. We proposed a new feature for detecting vehicles and improving the accuracy of object localization: the strong shadow that exists under any vehicle. Our approach to detecting occluded vehicles has the following simple and efficient steps (a consolidated sketch is given at the end of this section):
1. Preprocessing (increasing the image contrast)
2. Converting the image from Index to Grayscale format
3. Converting the image from Grayscale to Binary format
4. Applying suitable morphological operations
This approach has several advantages. It is very simple and low-cost and can assist other detection algorithms with occlusion handling. Its accuracy on real images, including noisy and cluttered images and images containing prominent shadows and occlusions, is very good.
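A consolidated MATLAB sketch of these four steps, under the same assumptions as the earlier fragments (file name, colormap, threshold, and morphological parameters are illustrative, not the authors' reported settings):

% Steps 1-4 of the proposed pipeline in one script.
rgb  = imread('frame.jpg');
rgb  = imadjust(rgb, stretchlim(rgb), []);  % 1. preprocessing: contrast increase
map  = cool(64);
idx  = rgb2ind(rgb, map);                   % 2. index against a colormap,
gray = ind2gray(idx, map);                  %    then Index -> Grayscale
bw   = ~im2bw(gray, 0.15);                  % 3. Grayscale -> Binary (darkest pixels)
bw   = imfill(bw, 'holes');                 % 4. morphological cleanup
bw   = imclose(imopen(bw, strel('disk', 2)), strel('disk', 5));
bw   = bwareaopen(bw, 50);
[labels, numVehicles] = bwlabel(bw);        % one blob per detected vehicle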
IV. FUTURE WORKS

Future work involves making the algorithm more robust to occlusion and improving its performance. The current results are encouraging, and further work is being done to make the system more practical. The algorithm can also be adapted to the non-static cameras used in intelligent vehicles. Coupled with other detection and classification algorithms, this system can help solve the occlusion problem and thus support accurate traffic monitoring and control.

REFERENCES

[1] D. Gutchess, M. Trajkovics, E. Cohen-Solal, D. Lyons, and A. K. Jain, "A background model initialization algorithm for video surveillance", in Proc. IEEE ICCV 2001, pt. 1, 2001, pp. 733-740.
[2] O. Javed, K. Shafique, and M. Shah, "A hierarchical approach to robust background subtraction using color and gradient information", in Proc. Workshop on Motion and Video Computing, 2002, pp. 22-27.
[3] R. Cucchiara, C. Grana, M. Piccardi, and A. Prati, "Detecting objects, shadows and ghosts in video streams by exploiting colour and motion information", in Proc. 11th Int. Conf. on Image Analysis and Processing (ICIAP), 2001.
[4] H. Veeraraghavan, O. Masoud, and N. Papanikolopoulos, "Computer vision algorithms for intersection monitoring", IEEE Trans. Intell. Transport. Syst., vol. 4, 2003, pp. 78-89.
[5] Y. Jung, K. Lee, and Y. Ho, "Content-based event retrieval using semantic scene interpretation for automated traffic surveillance", IEEE Trans. Intell. Transport. Syst., vol. 2, 2001, pp. 151-163.
[6] E. Memin and P. Perez, "Dense estimation and object-based segmentation of the optical flow with robust techniques", IEEE Trans. Image Process., vol. 7, no. 5, May 1998, pp. 703-719.
[7] M. Chang, A. Tekalp, and M. Sezan, "Simultaneous motion estimation and segmentation", IEEE Trans. Image Process., vol. 6, Sept. 1997, pp. 1326-1333.
[8] J. Wang and E. Adelson, "Representing moving images with layers", IEEE Trans. Image Process., vol. 3, no. 5, Sept. 1994, pp. 625-638.
[9] H. Sawhney and S. Ayer, "Compact representations of videos through dominant and multiple motion estimation", IEEE Trans. Pattern Anal. Machine Intell., vol. 18, Aug. 1996, pp. 814-830.
[10] M. Ristivojević and J. Konrad, "Joint space-time motion-based video segmentation and occlusion detection using multiphase level sets", in IS&T/SPIE Symposium on Electronic Imaging, Visual Communications and Image Processing, San Jose, CA, USA, Jan. 18-22, 2004.
[11] M. Kass, A. Witkin, and D. Terzopoulos, "Snakes: Active contour models", International Journal of Computer Vision, vol. 1, no. 4, Jan. 1988, pp. 321-331.
[12] J. Sethian, Level Set Methods, Cambridge University Press, 1996.
[13] A.-R. Mansouri and J. Konrad, "Motion segmentation with level sets", in Proc. IEEE Int. Conf. Image Processing, vol. 2, Oct. 1999, pp. 126-130.
[14] S. Jehan-Besson, M. Barlaud, and G. Aubert, "Detection and tracking of moving objects using a new level set based method", in Proc. Int. Conf. Pattern Recognition, Sept. 2000, pp. 1112-1117.
[15] E. Debreuve, M. Barlaud, G. Aubert, I. Laurette, and J. Darcourt, "Space-time segmentation using level set active contours applied to myocardial gated SPECT", IEEE Trans. Med. Imag., vol. 20, July 2001, pp. 643-659.
[16] F. Luthon, A. Caplier, and M. Liévin, "Spatiotemporal MRF approach with application to motion detection and lip segmentation in video sequences", Signal Process., vol. 76, 1999, pp. 61-80.
[17] F.-M. Porikli and Y. Wang, "An unsupervised multiresolution object extraction algorithm using video-cube", in Proc. IEEE Int. Conf. Image Processing, 2001, pp. 359-362.
[18] R. El-Feghali, A. Mitiche, and A.-R. Mansouri, "Tracking as motion boundary detection in spatio-temporal space", in Int. Conf. Imaging Science, Systems, and Technology, June 2001, pp. 600-604.
[19] A. Mitiche, R. El-Feghali, and A.-R. Mansouri, "Tracking moving objects as spatio-temporal boundary detection", in Proc. IEEE Southwest Symp. on Image Anal. Interp., Apr. 2002, pp. 206-210.
[20] A.-R. Mansouri, A. Mitiche, and R. El-Feghali, "Spatio-temporal motion segmentation via level set partial differential equations", in Proc. IEEE Southwest Symp. on Image Anal. Interp., Apr. 2002, pp. 243-247.
[21] A.-R. Mansouri and A. Mitiche, "Spatial/joint space-time motion segmentation of image sequences by level set pursuit", in Proc. IEEE Int. Conf. Image Processing, vol. 2, Sept. 2002, pp. 265-268.
[22] J. Konrad and M. Ristivojevic, "Joint space-time image sequence segmentation based on volume competition and level sets", in Proc. IEEE Int. Conf. Image Processing, vol. 1, Oct. 2002, pp. 573-576.
[23] S. Pundlik and S. Zadgaonkar, "Vehicle boundary detection and tracking", Dept. of Electrical and Computer Engineering, Clemson University, Dec. 2004.
[24] E. Nowak and F. Jurie, "Vehicle categorization: Parts for speed and accuracy", Laboratoire GRAVIR / UMR 5527 du CNRS - INRIA Rhone-Alpes - UJF - INPG, Societe Bertin - Technologies, Aix-en-Provence, 2005.
[25] J. Melo, A. Naftel, A. Bernardino, and J. Santos-Victor, "Viewpoint independent detection of vehicle trajectories and lane geometry from uncalibrated traffic surveillance cameras", in ICIAR Int'l Conference on Image Analysis and Recognition, Porto, Portugal, Sep. 29-Oct. 1, 2004.
[26] A. Huertas and R. Nevatia, "Detecting buildings in aerial images", Comput. Vis. Graph. Image Process., vol. 41, 1988, pp. 131-152.
[27] A. Yoneyama, C. H. Yeh, and C.-C. J. Kuo, "Moving cast shadow elimination for robust vehicle extraction based on 2D joint vehicle/shadow models", in Proc. IEEE Conf. on Advanced Video and Signal Based Surveillance (AVSS 2003), Miami, FL, USA, July 2003.
[28] J. M. Scanlan, D. M. Chabries, and R. Christiansen, "A shadow detection and removal algorithm for 2-D images", in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 1990, pp. 2057-2060.
[29] M. Adjouadj, "Image analysis of shadows, depressions, and upright objects in the interpretation of real world scenes", in Proc. IEEE Int. Conf. on Pattern Recognition (ICPR), 1986, pp. 834-838.
[30] G. S. K. Fung, N. H. C. Yung, G. K. H. Pang, and A. H. S. Lai, "Effective moving cast shadows detection for monocular color image sequences", in Proc. 11th Int. Conf. on Image Analysis and Processing (ICIAP), 2001, pp. 404-409.
[31] S. Nadimi and B. Bhanu, "Moving shadow detection using a physics-based approach", in Proc. IEEE Int. Conf. Pattern Recognition, vol. 2, 2002, pp. 701-704.
[32] R. Gershon, A. D. Jepson, and J. K. Tsotsos, "Ambient illumination and the determination of material changes", Journal of the Optical Society of America A, vol. 3, no. 10, Oct. 1986, pp. 1700-1707.
[33] T. Gevers and H. Stokman, "Classifying color edges in video into shadow-geometry, highlight, or material transitions", IEEE Trans. Multimedia, vol. 5, no. 2, June 2003, pp. 237-243.
[34] E. Salvador, A. Cavallaro, and T. Ebrahimi, "Cast shadow segmentation using invariant color features", Computer Vision and Image Understanding, vol. 95, no. 2, Aug. 2004, pp. 238-259.
[35] S. A. Shafer, "Using color to separate reflection components", Color Res. Appl., vol. 10, no. 4, 1985, pp. 210-218.
[36] D. A. Forsyth and J. Ponce, Computer Vision: A Modern Approach, Prentice Hall, New York, 2003.
[37] A. Cavallaro, E. Salvador, and T. Ebrahimi, "Shadow-aware object-based video processing", IEE Proc.-Vis. Image Signal Process., vol. 152, no. 4, Aug. 2005, pp. 398-406.