Human Identity Recognition in Aerial Images
Omar Oreifej, Ramin Mehran, Mubarak Shah
CVPR 2010, June
Computer Vision Lab of UCF

Outline
• Introduction
• Challenges
• Problem Definition
• Weighted Region Matching (WRM)
  – Pre-processing steps
    • Human Detection
    • Blob Extraction
    • Alignment
  – Measuring the Distance Between Blobs
  – Determining the Voter’s Weight
• Experiments and Results

Introduction
• Identity recognition from aerial platforms is a daunting task.
  – Features vary widely across poses.
  – Details vanish in low-quality images.
• In tracking, objects are usually assumed to have small displacements between observations.
  – Mean Shift [4]
  – Kalman filter-based tracking
  – With long temporal gaps, all assumptions of the continuous motion models become weak.

Challenges
• Low-quality images
• High pose variation
• Possibility of high-density crowds
• We employ robust region-based appearance matching.

Problem Definition
• A user is able to identify a target person over a short period of time.
• People are assumed to keep their clothing and general appearance over that period.
• We define the problem as a voter-candidate race.

Weighted Region Matching (WRM)
[Equation (1)]
• where P(v_i) is the voter’s prior.
• Equation (1) can be rewritten in a form similar to a mixture of Gaussians:
[Equation (2)]
• where τ is a constant parameter.
• Evaluating it requires us to:
  – Provide a robust representation of the distance between every voter-candidate pair.
  – Specify the weight of every voter.

Human Detection
• We train an SVM classifier based on the HOG descriptor [6].
• 6000 positive images:
  – humans at different scales and poses
• 6000 negative examples:
  – the background and non-human objects
• Train on a subset of 9000 images.
• Validate on the rest of the dataset.

Blob Extraction
• The background regions contained in the bounding boxes do not provide any information about a specific person.
• Segmentation method: kernel density estimator [12, 15]
  $\hat{f}(x) = \sum_i \alpha_i K(x - x_i)$
• The estimator recovers the pdf directly from the data, without any assumptions about the underlying distributions.

Alignment
• Goal: eliminate the variations caused by camera orientation and human pose.
• Edge detection is noisy.
• A coarse alignment:
  – Eight-point head, shoulders and torso (HST) model
  – The model captures the basic orientation of the upper part of the body.
• Find the best fit of the HST model over the human blobs.
  – We train an Active Appearance Model (AAM).
• We use the fitted HST points to compute an affine transformation to a desired pose.
• Align all the blobs to the mean pose generated by the AAM training set.

Measuring the Distance Between Blobs
• Treat each blob as a group of small regions, each described by a set of features.
• The features comprise:
  – Histograms of the HSV channels
  – The HOG descriptor
• We apply PCA to the feature space and extract the top 30 eigenvectors.
• Use the Earth Mover’s Distance (EMD) [16, 14] to compute the minimum cost of matching multiple regions, with each region represented as a distribution in the feature space.
[Figure: histograms of two distributions, P and Q; x-axis: bin, y-axis: number of pixels]
• Total cost in the example: 1·1 + 2·2 = 5, EMD = 5/3
• For two distributions, P = {p_i} and Q = {q_i}:
[Equation: EMD between P and Q]

Determining the Voter’s Weight
• We rank the collection of input images according to the value of the information they carry.
• Given the set of regions from all voters, R = {r_k}:
  – We assign a weight to every region such that the most consistent regions are given higher weights.
  – Use the PageRank algorithm [3].

PageRank
VisualRank: Applying PageRank to Large-Scale Image Search, 余償鑫
• Concept
  – Vote
  – Based on a random walk algorithm
• PR(A) = PR(B) + PR(C) + PR(D)
[Figure: example graph with nodes A, B, C, and D]
• In the graph G, we connect every region from voter i to the K nearest neighbor regions of voter j, where i != j.
• The final weight for a region r_k combines its PageRank score (PR) with the region size.
• The voter’s weight w_i = normalized sum of the weights of its regions.

Matching
• Substituting the distances and the weights into Equation (2), we compute for every candidate the probability that it belongs to the target.
• The best match should be the candidate with the highest probability.

Experiments and Results
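
Sketch: Region and Voter Weights
The weighting step described under Determining the Voter’s Weight and PageRank can be illustrated with a minimal sketch. It is not the authors’ implementation, and several details are assumptions not stated in the slides: the function names (`region_graph`, `pagerank`, `voter_weights`) are hypothetical, each region is linked to its K nearest regions pooled over all other voters (one possible reading of the graph construction), PageRank uses a damping factor of 0.85, and the region weight is taken as PageRank score times region size.

```python
import numpy as np

def region_graph(features, voter_ids, k=1):
    """Adjacency matrix: each region links to its k nearest regions
    among the regions of all OTHER voters (assumed reading)."""
    n = len(features)
    # Pairwise Euclidean distances between region feature vectors.
    dist = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=2)
    adj = np.zeros((n, n))
    for r in range(n):
        others = np.flatnonzero(voter_ids != voter_ids[r])
        nearest = others[np.argsort(dist[r, others])[:k]]
        adj[r, nearest] = 1.0
    return adj

def pagerank(adj, damping=0.85, iters=100):
    """Plain power-iteration PageRank on a directed adjacency matrix."""
    n = adj.shape[0]
    out = adj.sum(axis=1, keepdims=True)
    # Row-stochastic transitions; a row with no out-links jumps uniformly.
    trans = np.where(out > 0, adj / np.maximum(out, 1e-12), 1.0 / n)
    rank = np.full(n, 1.0 / n)
    for _ in range(iters):
        rank = (1.0 - damping) / n + damping * (trans.T @ rank)
    return rank

def voter_weights(features, sizes, voter_ids, k=1):
    """Voter weight = normalized sum of the weights of its regions."""
    rank = pagerank(region_graph(features, voter_ids, k))
    region_w = rank * sizes  # ASSUMPTION: weight = PageRank * region size
    voters = np.unique(voter_ids)
    totals = np.array([region_w[voter_ids == v].sum() for v in voters])
    return dict(zip(voters.tolist(), (totals / totals.sum()).tolist()))

# Toy data (made up): voters 0 and 1 look alike, voter 2 is an outlier.
features = np.array([[0.0, 0.0], [1.0, 1.0],
                     [0.1, 0.0], [1.0, 1.1],
                     [5.0, 5.0], [6.0, 6.0]])
sizes = np.ones(6)
voter_ids = np.array([0, 0, 1, 1, 2, 2])
weights = voter_weights(features, sizes, voter_ids, k=1)
print(weights)
```

In this toy input no region links to the regions of voter 2, so they keep only the baseline rank and that voter receives the smallest weight, which is the behavior the slides ask for: the most consistent regions get the higher weights.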