Color Classification Using Adaptive Dichromatic Model
Proceedings of the 2006 IEEE International Conference on Robotics and Automation, Orlando, Florida, May 2006
Student ID : M9920103
Student : Kun-Hong Lee
Teacher : Ming-Yuan Shieh

Outline
• ABSTRACT
• INTRODUCTION
• DESCRIPTION OF THE METHOD
  – DICHROMATIC REFLECTANCE MODEL
  – COLOR CLASSIFICATION AND ADAPTATION
  – ALGORITHM IMPLEMENTATION
• EXPERIMENTAL RESULTS
• CONCLUSION AND FUTURE WORK
• REFERENCES

ABSTRACT
• Color-based vision applications face the challenge that colors vary with illumination. In this paper we present a color classification algorithm that adapts to continuously varying lighting.
• Motivated by the dichromatic color reflectance model, we use a Gaussian Mixture Model (GMM) of two components to model the distribution of a color class in the YUV color space.
• The GMM is derived from the classified color pixels using the standard Expectation-Maximization (EM) algorithm, and the color model is iteratively updated over time.
• The novel contribution of this work is the theoretical analysis – supported by experiments – that a GMM of two components is an accurate and complete representation of the color distribution of a dichromatic surface.

INTRODUCTION
• Color has been used as an important feature in machine vision applications because of its advantages over other image features. However, color faces the inevitable problem of being affected by illumination.
• To address this concern, adaptive color modeling has attracted intense research interest, and its use in color tracking applications shows promising results [14], [13], [8]. In this approach, color is generally represented by either a non-parametric model or a parametric model. A commonly used non-parametric model is the color histogram [11], [8], which is appropriate in problems with a large data set and a coarsely quantized color space [7]. The Gaussian is a commonly used parametric color model in many color-based computer vision applications, such as face and skin tracking or detection [12], [15]. Generally, a single Gaussian model is applicable to objects with a uniform color surface of approximately Lambertian property and negligible specular reflection. When an object surface does not satisfy these conditions, a Gaussian mixture can be used instead [14]. Previous works using GMMs, however, focused on the model order selection problem, i.e., how to choose the number of components of the GMM in order to accurately model the underlying distribution. In this paper we present a real-time algorithm for color classification using adaptive color modeling. We use a Gaussian mixture model (GMM) of two components in the YUV space to model the distribution of a color class, according to the reflectance property of dichromatic materials.

DESCRIPTION OF THE METHOD
DICHROMATIC REFLECTANCE MODEL
• Most previous works in adaptive color modeling are based on the assumption that the object surfaces are Lambertian.
• Under this assumption, the underlying probability distribution of the color model can be reasonably represented by a single Gaussian. This is a rather simplified model of the material's reflectance property, as in practice many materials violate this assumption. The dichromatic reflectance model describes the light reflected from a dielectric material surface, L(λ, i, e, g), as the mixture of two portions, the surface reflection Ls(λ, i, e, g) and the body reflection Lb(λ, i, e, g):
L(λ, i, e, g) = Ls(λ, i, e, g) + Lb(λ, i, e, g)    (1)

where i, e, and g are the angles of the incident light, the emitted light, and the phase, and λ is the wavelength.
• If we assume there is only a single light source and no inter-reflection between the objects in the scene, we can separate the surface reflectance component Ls(λ, i, e, g) into the product of a spectral power distribution cs(λ) and a scalar ms(i, e, g).
• Likewise, we can separate the body reflectance component Lb(λ, i, e, g) into the product of a spectral power distribution cb(λ) and a scalar mb(i, e, g). Equation (1) can then be written as the dichromatic reflection model equation:

L(λ, i, e, g) = ms(i, e, g)cs(λ) + mb(i, e, g)cb(λ)    (2)

• Thus the light reflected from an object point can be described as a mixture of two distinct spectral power distributions, cs(λ) and cb(λ), each scaled according to the geometric reflection property of the surface and body reflection.
• The body part models conventional matte surfaces, which exhibit the Lambertian characteristic.
• The surface part models highlights, which have the same spectral power distribution as the illuminant. Klinker et al. show that for convex-shaped objects with dichromatic reflectance, the distribution of RGB values maps out a T shape, with the bar of the T corresponding to body reflectance and the stem of the T corresponding to surface reflectance [4]. This is consistent with our experimental data shown in Figure 1.
• The dichromatic model has proved useful for a variety of computer vision tasks including color constancy [5], shape recovery [2], and segmentation [1].

DESCRIPTION OF THE METHOD
COLOR CLASSIFICATION AND ADAPTATION
• A. Color Modeling
• In our system, we choose a GMM to represent the color model, based on the dichromatic reflectance model. The dichromatic reflectance model explains a color distribution as the combination of two clusters: a diffuse cluster from the body reflectance and a specular cluster from the surface reflectance.
• Suppose we have a color class O whose color distribution is modeled as a Gaussian mixture model of two Gaussian components. Given a pixel ξ, the conditional density for ξ to belong to O is

p(ξ | O) = Σj=1..2 P(j) p(ξ | j)

where P(j) is the prior probability that pixel ξ is generated by component j, and Σj P(j) = 1. If we assume that each Gaussian component has a mean μj and a covariance matrix Σj, then p(ξ | j) is the normal density N(ξ; μj, Σj).
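To make the two-component model concrete, the following minimal Python sketch evaluates p(ξ | O) for a YUV pixel. The function names and the numeric parameter values are ours, chosen only for illustration; they are not from the paper.

import numpy as np

def gaussian_pdf(x, mu, cov):
    # Multivariate normal density N(x; mu, cov).
    d = len(mu)
    diff = x - mu
    inv = np.linalg.inv(cov)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * diff @ inv @ diff) / norm

def mixture_density(pixel, mus, covs, priors):
    # p(xi | O) = sum_j P(j) * N(xi; mu_j, Sigma_j) for the two-component GMM.
    return sum(p * gaussian_pdf(pixel, m, c)
               for p, m, c in zip(priors, mus, covs))

# Illustrative parameters: a diffuse (body) and a specular (surface) cluster in YUV.
mus = [np.array([80.0, 110.0, 150.0]),    # diffuse cluster mean
       np.array([200.0, 120.0, 140.0])]   # specular cluster mean
covs = [np.diag([120.0, 25.0, 25.0]), np.diag([300.0, 15.0, 15.0])]
priors = [0.85, 0.15]                     # P(j), summing to 1
print(mixture_density(np.array([90.0, 108.0, 148.0]), mus, covs, priors))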
B. Color Space
Our system uses the YUV color space for color classification. YUV space is commonly used in computer vision systems to deal with mild lighting variation. The Y channel, which represents luminance, captures most of the variation due to luminance changes in the environment, while the UV plane, which represents chromaticity, is insensitive to luminance change. Our initial empirical data suggested that a color classifier defined on the UV plane is capable of accurately classifying colors despite the mild illumination variation on the field under a stable lighting condition. Figure 1 shows the color distribution of a blue color patch in RGB and UV space respectively, after being positioned at different locations on the field, while the room lighting is kept the same.

Fig. 1. The color distribution of a blue color patch in RGB (left column) and UV space (right column). From top to bottom the color patch is located at different positions on the field, experiencing spatial lighting variation.

• Notice that in RGB space the color distribution spreads over a wide range because of the illumination variation at different parts of the field, while the color distribution in UV space is comparatively stable.
• Although a color classifier defined on the UV plane can be reasonably accurate under mild lighting variations, and above all shows robustness against illumination intensity change, it is not suitable for more general situations, especially when the specular component becomes a significant portion of the color distribution.
• In this case, a classifier defined on the UV plane will over-classify similar background colors. As we can see from Figure 7, the specular component does not evolve along the intensity channel.
• Therefore the projection of this component onto the UV plane yields an over-generalized classifier. For this situation, a classifier in YUV space is more accurate.

Fig. 7. The evolution of the distribution of the green color in RGB space (left column) and YUV space (right column). From top to bottom, illumination intensity decreases by changing the room lighting from light to dim. The red dots and the green dots represent the pixels belonging to one of the two Gaussians in the GMM.

• C. Color Adaptation
• The adaptation scheme used in our system is a simple exponentially decaying function. Once a RoboCup soccer game starts, color blobs are located in each image frame using the current color models.
• After the noisy blobs are filtered out by applying some spatial constraints, the selected blobs are used as labeled samples to measure the GMM at the current time step. Suppose at time step k a set of sample pixels {ξi, i = 1...n} is collected from the image frame, and a measured GMM is built for this color using these pixels. We also have the color model estimated at time step k − 1. We then need to combine the previously estimated model and the currently measured model to derive the model estimated at time step k. In practice, for the sake of space efficiency, we approximate a 3D GMM by its three 2D projections, and convert each 2D GMM to a probability distribution table in the corresponding projection plane; i.e., for each color we have three probability tables: PUV, PYU, and PYV. Each table is of dimension 256 × 256, and each entry contains the probability of the corresponding color value under the GMM. Given a decay factor α in [0, 1] and table indexes i = 1...256, j = 1...256, the exponentially decaying mechanism updates each entry of the probability tables as:

Pk(i, j) = α Pk−1(i, j) + (1 − α) Pmk(i, j)

where Pk−1 is the previously estimated table and Pmk is the table measured at time step k.
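The table update is simple enough to state in a few lines of Python. This is a sketch of the update rule as reconstructed above; the value of the decay factor α is illustrative, as the paper does not report one.

import numpy as np

def update_table(p_prev, p_measured, alpha=0.9):
    # Exponentially decaying update of one 256 x 256 probability table:
    #   P_k(i, j) = alpha * P_{k-1}(i, j) + (1 - alpha) * Pm_k(i, j)
    # alpha = 0.9 is an illustrative value, not taken from the paper.
    return alpha * p_prev + (1.0 - alpha) * p_measured

# Each color class keeps three projection tables: P_UV, P_YU, P_YV.
p_uv = np.zeros((256, 256))                # previously estimated table
p_uv_measured = np.random.rand(256, 256)   # stand-in for the measured GMM projection
p_uv = update_table(p_uv, p_uv_measured)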
DESCRIPTION OF THE METHOD
ALGORITHM IMPLEMENTATION
Our adaptive color classification algorithm is implemented in the color vision system of a RoboCup team at our university. The vision system uses an IEEE 1394 (FireWire) Dragonfly camera and an ordinary Linux PC. The camera is made by Point Grey Research Inc. and features a single 1/3" progressive scan CCD. The PC has an off-the-shelf FireWire capture card and an AMD Athlon XP 1700+ processor with 512 MB of RAM. We use the camera to sample 640 × 480 24-bit RGB color images at 30 fps into the PC. The system works in two steps: the first is off-line color calibration, and the second is on-line adaptive color classification. Both are described in detail in this section.

A. Off-line Color Calibration
The off-line color calibration step, shown in Figure 2, extracts pixels of a target color class from a displayed image of live video, and uses these sample pixels to build a color model that incorporates the spatial variation of illumination on the field. This step is supervised by a human operator, and the finalized model is used to initialize the system.

Fig. 2. Flowchart of the off-line color calibration procedure used to initialize the color classes of interest, each in terms of a two-component Gaussian mixture expressed in the form of a look-up table (LUT).

To start the calibration step, the user selects pixels of the color classes of interest (typically 3 to 4) from the color blobs in an image using an interactive tool. The locations of the selected pixels initialize a color tracker, which follows the movement of the object from which the selected pixels were sampled. The tracker selects the image region that includes the object as a tracking window; within the tracking window, the selected pixels are used as seeds in a region-growing process to include neighboring pixels of the same color. The pixels selected by region growing form blobs, and the locations of the blobs are used as a cue to update the position of the tracking window in the subsequent image frame. For each image, the tracker collects all the pixels selected by region growing inside the tracking window as sample pixels of the color of interest. As the robot moves over the field, the set of sample pixels accounts for the spatial variation of illumination. This data collection process stops once the robot has passed through representative locations on the field, especially the places with significantly different illumination. The sample pixels collected during this step are then used to derive the two-component GMM using a standard EM algorithm, as sketched below. The derived GMM is used by the vision system as the color model for classifying this color. Multiple color classes (e.g., yellow, green, and pink) can be modeled in parallel with this process, since each robot top typically carries at least three colors in our application.
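As a sketch of this calibration step, the two-component GMM can be fitted with any standard EM implementation; here we use scikit-learn's GaussianMixture, which is our choice rather than the paper's (the paper only specifies a standard EM algorithm), and random stand-in data in place of real calibration samples.

import numpy as np
from sklearn.mixture import GaussianMixture

# 'samples' stands in for the N x 3 array of YUV pixels collected while the
# robot traverses representative locations on the field.
samples = np.random.rand(5000, 3) * 255.0

# Standard EM fit of a two-component GMM, one mixture per color class.
gmm = GaussianMixture(n_components=2, covariance_type="full").fit(samples)

print(gmm.weights_)       # P(j): priors of the diffuse and specular components
print(gmm.means_)         # component means mu_j
print(gmm.covariances_)   # component covariances Sigma_j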
B. On-line Adaptive Color Classification
When a game starts, the pre-calibrated color models are used by the vision system for color classification. Meanwhile, an adaptation procedure, shown in Figure 3, is executed to update the color models in real time in order to adapt to dynamic lighting. Specifically, color classification is performed on all the pixels of an image using the current look-up tables (LUTs).

Fig. 3. Flowchart of the on-line color adaptation procedure, which uses a linear combination of the old and new color samples to update the Gaussian mixture of each color class to account for temporal variation of lighting.

This process labels each pixel with a color class, either one of the colors of interest or the background. Region growing is then applied, using each pixel labeled with a color class of interest as a seed, to include the neighboring pixels within a threshold color distance. The purpose of the region-growing step is to reduce false negatives, i.e., pixels that belong to the color class but are not picked up by the classifier, possibly due to spatial or temporal variation of illumination. By incorporating these false negatives into the set of measured data, we obtain a more accurate measurement of the current color model with which to update the model.

After region growing, the labeled pixels of each color class are grouped into color blobs wherever they are spatially connected. Usually some of the blobs consist of false positives due to background noise or over-classification by an inaccurate color model. The noisy blobs are filtered out by applying a set of geometric constraints based on the expected height and width of each blob, the density of the color pixels within each blob, etc. The filtered blobs then form the candidate markers with which the higher-level RoboCup software locates and identifies the robots and the ball. For each of the identified robots and the ball, the color blobs are used as labeled samples to derive the GMM as the measured color model at the current time step. For each blob, only the pixels in the center region within a certain radius are used to build the GMM. This is based on our observation that the boundary pixels of each blob are often affected by the color region surrounding the blob due to camera noise; these pixels are likely to steer a color model toward a false target. The evaluation of the GMM from the collected labeled pixels is done once every 20 frames. It is unnecessary to re-evaluate the GMM for every frame unless the illumination condition goes through a rapid change, which rarely happens in our application. The first benefit of re-evaluating the GMM only after a time interval is efficiency. Another benefit is that, since the robots are moving around the field, a number of frames provide a more complete set of sample pixels with which to capture the spatial variation of illumination.
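A minimal sketch of per-pixel classification with the three projection tables is given below. How the three table scores are combined is not specified in the paper; combining them by product, and the threshold value, are our assumptions.

import numpy as np

def classify_pixel(y, u, v, p_uv, p_yu, p_yv, thresh=1e-4):
    # Look up the pixel in the three 256 x 256 projection tables and combine
    # the scores. Combining by product (and the threshold) is our assumption;
    # the paper only states that each 2D GMM is converted to a probability
    # table in the corresponding projection plane.
    score = p_uv[u, v] * p_yu[y, u] * p_yv[y, v]
    return score > thresh

# Example: tables for one color class (random stand-ins for real LUTs).
p_uv = np.random.rand(256, 256)
p_yu = np.random.rand(256, 256)
p_yv = np.random.rand(256, 256)
print(classify_pixel(120, 90, 160, p_uv, p_yu, p_yv))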
EXPERIMENTAL RESULTS
A series of comparison experiments was carried out to examine the robustness of our algorithm under dynamic lighting. Lighting in the experiments varied spatially in a natural way, as a result of the uneven distribution of light fixtures in the laboratory where our RoboCup field was set up. In addition, lighting was made to vary temporally by controlling the dimmer in the laboratory. A photographic meter (Minolta Autometer VF) was used to determine the illumination, which ranged from 40 lux at the darkest to 320 lux at the brightest. The x-axis of all figures in this section represents the frame index, which is linearly correlated with this illumination level.

The first experiment compares the classification accuracy of the adaptive and the non-adaptive algorithm under dynamic illumination. Figure 4 compares the numbers of correctly classified pixels – for two color classes, blue and pink – using the adaptive and the non-adaptive algorithm. It is clear that the adaptive algorithm is capable of handling illumination change over a broad range, while the non-adaptive algorithm fails very quickly after the illumination changes.

Fig. 4. Comparison of the correctly classified pixels between adaptive and non-adaptive color classification under gradually changing lighting. The x-axis represents the image frame number, which is correlated with the room brightness from 70 lux to 256 lux.

With a total of four robots in the field of view of the camera, the number of correctly identified robots was also compared, in Figure 5, for the adaptive and static color models. The non-adaptive algorithm failed quickly as soon as the room lighting became different from that under which the color models were originally calibrated.

Fig. 5. Comparison of the correctly identified robots between adaptive and non-adaptive color classification under gradually increasing lighting. The total number of robots is 4. The x-axis is interpreted in the same way as in Figure 4.

Figure 6 shows the results of using both the diffuse and specular components versus using only the diffuse component, with the number of correctly identified robots as the performance metric. Classification using both the diffuse and the specular components significantly outperforms classification using only the diffuse component. The two components were separated based on their Y values, according to the observation that the diffuse component has a lower luminance than the specular component.

Fig. 6. Comparison of the correctly identified robots by adaptive color classification in YUV space under gradually changing lighting, between using only the diffuse component and using both the diffuse and specular components.

CONCLUSION AND FUTURE WORK
In this paper we have presented a study of the color classification problem, proposed an adaptive color classification algorithm, and implemented the algorithm in a RoboCup vision system, which has been successfully used for several years in RoboCup competition. The experimental results show that our two-component GMM can handle dynamic lighting conditions more reliably than a static algorithm and than the single-component Gaussian color model commonly used in previous studies, demonstrating that the GMM is a good approximation of the color distribution of a dichromatic surface. The algorithm can be expected to serve as an accurate model for many general color classification applications. Also, the simplicity of our algorithm ensures that it can be implemented effectively in real time. For further details of the work reported in this paper, please refer to [6].

REFERENCES
[1] M. H. Brill. Image segmentation by object color: a unifying framework and connection to color constancy. J. Opt. Soc. Am. A, 7:2041–2047, 1990.
[2] M. S. Drew and L. L. Kontsevich. Closed-form attitude determination under spectrally varying illumination. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, pages 985–990, 1994.
[3] A. Elgammal, R. Duraiswami, and L. Davis. Efficient non-parametric adaptive color modeling using fast Gauss transform. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, volume 2, pages 563–570, Dec 2001.
[4] G. J. Klinker, S. A. Shafer, and T. Kanade. The measurement of highlights in color images. International Journal of Computer Vision, 2(1):7–32, 1988.
[5] H. Lee, E. J. Breneman, and C. P. Schulte. Modelling light reflection for computer color vision. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(4):402–409, 1990.
[6] Xiaohu Lu. Adaptive colour classification for RoboCup with Gaussian mixture model. Master's thesis, University of Alberta, 2005.
[7] S. J. McKenna, S. Jabri, Z. Duric, and A. Rosenfeld. Tracking groups of people. Computer Vision and Image Understanding, 80:42–56, 2000.
[8] K. Nummiaro, E. B. Koller-Meier, and L. Van Gool. Object tracking with an adaptive color-based particle filter. In Symposium for Pattern Recognition of the DAGM, pages 355–360, 2002.
[9] S. Tominaga and B. A. Wandell. Standard surface-reflectance model and illuminant estimation. Journal of the Optical Society of America A, 6(4):576–584, 1989.
[10] S. A. Shafer. Using color to separate reflection components. Color Research and Application, 10(4):210–218, 1985.
[11] M. J. Swain and D. H. Ballard. Color indexing. International Journal of Computer Vision, 7(1):11–32, 1991.
[12] C. Wren, A. Azarbayejani, T. Darrell, and A. Pentland. Pfinder: Real-time tracking of the human body. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1995.
[13] Y. Wu and T. S. Huang. Color tracking by transductive learning. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, pages 133–138, 2000.
[14] Y. Raja, S. J. McKenna, and S. Gong. Colour model selection and adaptation in dynamic scenes. In European Conference on Computer Vision (ECCV), pages 460–474, June 1998.
[15] J. Yang, W. Lu, and A. Waibel. Skin-color modelling and adaptation. In Proc. Asian Conference on Computer Vision, pages 687–694, 1998.