9.913 Pattern Recognition for Vision Class 8-2 – An Application of Clustering Bernd Heisele Fall 2003 Overview •Problem •Background •Clustering for Tracking •Examples •Literature •Homework Fall 2002 Pattern Recognition for Vision Problem Detect objects on the road: Cars, trucks, motorbikes, pedestrians. Fall 2002 Pattern Recognition for Vision Image Motion Determine the image motion (vector field) v ( x, y ) = ( u ( x, y ), v( x, y ) ) . T Fall 2002 Pattern Recognition for Vision Object Segmentation using Image Motion Motion-based segmentation Fall 2002 Pattern Recognition for Vision Image Motion—Equations for Rigid Motion u = −Ω X xy + ΩY (1 + x 2 ) − ΩZ y + (VX − VZ x ) / Z v = −Ω X (1 + y 2 ) + ΩY xy + ΩZ x + (VY − VZ y ) / Z Fall 2002 Pattern Recognition for Vision Image Motion—Estimation Optical Flow Image intensity over time is f ( x, y, t ) The intensity of a point over time is given by: g (t ) = f ( x(t ), y (t ), t ), where x(t ), y (t ) is the trajectory of the point in the image plane. Assume the intensity of the point does not change over time: dg (t ) ∂f ∂x ∂f ∂y ∂f =0⇒ + + =0 ∂x ∂t ∂y ∂t ∂t dt ∂x ∂y , = (u , v), (u , v)∇f + ft = 0 ∂t ∂t Fall 2002 Pattern Recognition for Vision Image Motion—Estimation Optical Flow Gradient equation of optical flow: ∂f ∂f ∂f u+ v+ =0 ∂x ∂y ∂t ∆f ( x) ∆f (t ) =− , u ∆t ∆x f u ∆f ( x ) ∆f (t ) u=− ∆x ∆t ∆f (t ) ∆f ( x) ∆x ∆x′ = u ∆t x Fall 2002 Pattern Recognition for Vision Image Motion—Estimation problems Aperture Problem t2 Aperture t1 t1 y Aperture x t2 y x No change over time, optical flow=? Fall 2002 Pattern Recognition for Vision Object Segmentation Problem Objects: To determine the image motion we have to know what the objects are, i.e. which points belong together. However, we can’t extract objects without motion. Fall 2002 Pattern Recognition for Vision An Idea Are there methods that help us to determine the image motion (avoid the aperture problem)? Neighbored pixels of similar color belong to the same object ⇒ color segmentation. Objects usually consist of several regions of different color ⇒ color segmentation alone does not solve the problem, we still need motion. Fall 2002 Pattern Recognition for Vision Color Segmentation & Motion Color segmentation of single images Find regions (connected component Analysis) Fall 2002 Motion-based segmentation: Objects Track regions Pattern Recognition for Vision Color Segmentation & Motion, why it did not work Color regions are not stable over time, they merge & break apart which makes tracking extremely difficult. Need consistent color segmentation over time! Fall 2002 Pattern Recognition for Vision Color Cluster Flow Cluster Centers Segmented Objects Motion Segmentation Clustering by parallel K-means Image n Trajectories of Cluster Centers Each pixel is a data point in (R, G, B, x, y) Fall 2002 Pattern Recognition for Vision Color/Position Clustering Original Image Clustered Image Boundaries of Clusters Each pixel is represented by a its color/positoin features (R, G, B, wx, wy), where w is a constant. Clustering is applied to group pixels with similar color and position. Fall 2002 Pattern Recognition for Vision Color versus Position Cluserting in (R, G, B, wx, wy) For large w the clusters are compact (Voronoi tesselation). For small w the color dominates and the clusters lose their spatial connectivity. Fall 2002 Pattern Recognition for Vision Clustering of Consecutive Images with k-means y We know the cluster centers of the previous image and use them for clustering the current image ⇒ consistent segmentation over time Assign each data point to closest center Recompute center as average of data points The trajectories (displacemets of cluster centers) are for free. x Fall 2002 Pattern Recognition for Vision Parallel k-means Clustering 1. Partitioning step in iteration k : { 2 } 2 Cq (k ) = s n | s n − rq (k − 1) ≤ s n − ri (k − 1) ∀i ≠ q Cq : cluster q, s n : data point n, rq : prototype of cluster q 2. Computing prototypes in iteration k : 1 rq (k ) = si ∑ S q (k ) si ∈Cq ( k ) S q : data points in cluster q k -means leads to a local minimum of the quantization error 1 mse = N Fall 2002 N ∑s i =1 2 i − rq (i ) , rq (i ) = arg min si − rn 2 1≤ n ≤Q Pattern Recognition for Vision Fast parallel k-means Clustering Q − 1 circles around each rm with diameters d n = rm − rn check if si are inside the circles. 3 × Speed-up for one iteration Fall 2002 Pattern Recognition for Vision Initial Clustering Clustering of the first image of a sequence K-means is not a good choice for the first image because we don’t know a good initialization of the cluster centers. • many iterations required (slow) • could lead to a ‘bad’ local minimum (large mse) We need a fast algorithm which doesn’t require initialization. Divisive clustering (tree-based clustering) Fall 2002 Pattern Recognition for Vision Initial Clustering No prior knowledge about color clusters. y Select cluster with highest variance Split by hyperplane vertical to direction of highest variance. x Fall 2002 Pattern Recognition for Vision Initial Clustering Binary Tree Root cluster contains all points. -Use mse bound or max. number of clusters as stop criteria. -The cluster centers of the leafs initialize k-means of the next image. Fall 2002 Pattern Recognition for Vision Examples First Image of Sequence Fall 2002 Last Image of Sequence Trajectories of Cluster Centroids Pattern Recognition for Vision Not Perfect Clusters ‘jump’ when objects appear or leave the scene. (Number of clusters if fixed) Over-segmentation and spurious motion in homogenous regions Fall 2002 Pattern Recognition for Vision Literature B. Heisele, U. Kressel, and W. Ritter. Tracking non-rigid, moving objects based on color cluster flow. Proc. Computer Vision and Pattern Recognition (CVPR), pp. 253-257, San Juan, 1997. Clustering Classics: J. MacQueen. Some methods for classification and analysis of multivariate observations. Proc. 5th Berkeley Symp. Mathematics, Statistics and Probablility, pp. 281-297, 1967. Y. Linde, A. Buzo, and R. Gray. An algorithm for vector quantizer design. IEEE Transactions on Communications, COM-28/1, pp. 84-95, 1980. Fall 2002 Pattern Recognition for Vision Homework No homework! Fall 2002 Pattern Recognition for Vision