Object Recognition in the Dynamic Link Architecture
Yang Ran
CMPS 828J
2016/6/27

Outline
Background and Introduction
System Overview
General algorithm in detail
Implementations of the algorithm
Experiment results
Further readings and conclusion

Background
1. Problem: To recognize human faces from single images out of a large gallery.
2. Challenges: Distortions in terms of position, size, expression, and pose.
3. Existing methods: appearance-based vs. shape-based; 2D vs. 3D.

Background: Notations
1. Image: face image
2. Model: face gallery
3. Graph: a concise face description
4. Jet: a local description of the gray-value distribution based on the Gabor transform

System Overview
1. Faces are represented as rectangular graphs by layers of neurons.
2. Each neuron represents a node and has a jet attached.

Assumptions
The image domain and the model domain are bi-directionally connected by dynamic links.
These connections are plastic on a fast time scale, changing radically during a single recognition event.
The strength of a connection between any two nodes in the image and a model is controlled by the jet similarity between them, which roughly corresponds to the number of features that are common to the two nodes.

Key Factors
The basic representation is a labeled graph formed by edges and vertices, with wavelet responses bundled in jets.
Edge labels: distance information
Vertex/node labels: wavelet responses
The graph should be able to deform to adapt to the variations of human faces.

Preprocessing by Gabor Wavelets
Gabor wavelets are biologically motivated convolution kernels in the shape of plane waves restricted by a Gaussian envelope function.

More on Gabor Wavelets
Why use them?
They are a good approximation to the sensitivity profiles of neurons found in the visual cortex of higher vertebrates.
Cells come in pairs with even and odd symmetry, like the real and imaginary parts of a Gabor filter.

Jets Generation
The set of convolution coefficients for kernels of different orientations and frequencies at one image pixel is called a jet.
A jet describes a small patch of gray values around a given pixel.
The wavelet transform W is sampled at five logarithmically spaced frequency levels and eight directions, indexed by u and v.

Jets Generation (cont'd)
The magnitudes of (WI)(k_uv, x) form a feature vector located at x, which is referred to as a jet.
Jet similarity is evaluated during Elastic Graph Matching.

Edge Labels
Derived from the neuron version, edges encode neighborhood relationships.
They represent the topology of the vertices.
Edge labels are compared with a quadratic comparison function.

Example
Graph representation of a face

Elastic Graph Matching
Elastic matching of a model graph M to a target graph I amounts to a search for a set of vertex positions which simultaneously optimizes the matching of vertex labels and edge labels under a combined cost function.

Elastic Graph Matching (cont'd)
A heuristic algorithm is used to come close to the optimum within a reasonable time.
Step 1: Find the approximate face position so that the image can be scaled and cut to a standard size.
Step 2: Extract a graph from the target face image.
Step 3: Match with the cost function.
    Refine position and size with λ = infinity (no edge deformation allowed).
    Then allow local distortion.

Experiments
Database
Technical Aspects
Results
Conclusions

Database
As a face database we used galleries of 111 different persons. For most persons there is one neutral frontal view, one frontal view with a different facial expression, and two views rotated in depth by 15 and 30 degrees respectively.

Technical Aspects
The CPU time needed for the recognition of one face against a gallery of 111 models is approximately 10-15 minutes on a Sun SPARCstation 10-512 with a 50 MHz processor.
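
The jet and cost-function slides above can be made more concrete with a short sketch. The Python code below builds a bank of Gabor kernels (five frequencies, eight directions), extracts a jet of response magnitudes at one pixel, and evaluates a normalized dot-product jet similarity together with a combined cost that weights a quadratic edge term by λ. Everything specific here is an assumption for illustration and is not taken from the slides: the function names, the kernel normalization, σ = 2π, k_max = π/2, the spacing factor √2, and the 65-pixel window. The similarity and cost follow the common formulation of elastic graph matching and may differ in detail from reference 1.

```python
import numpy as np

def wave_vectors(n_freq=5, n_dir=8, k_max=np.pi / 2, spacing=np.sqrt(2)):
    """Wave vectors (kx, ky): log-spaced magnitudes, evenly spaced directions.
    k_max and spacing are assumed values, not given in the slides."""
    vectors = []
    for nu in range(n_freq):
        k = k_max / spacing**nu
        for mu in range(n_dir):
            phi = mu * np.pi / n_dir
            vectors.append((k * np.cos(phi), k * np.sin(phi)))
    return vectors

def gabor_kernel(k_vec, sigma=2 * np.pi, size=65):
    """Complex plane wave restricted by a Gaussian envelope (assumed normalization)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    k2 = k_vec[0] ** 2 + k_vec[1] ** 2
    envelope = (k2 / sigma**2) * np.exp(-k2 * (x**2 + y**2) / (2 * sigma**2))
    # Subtracting exp(-sigma^2 / 2) reduces the kernel's response to constant gray levels.
    wave = np.exp(1j * (k_vec[0] * x + k_vec[1] * y)) - np.exp(-sigma**2 / 2)
    return envelope * wave

def jet_at(image, row, col, kernels):
    """Jet = magnitudes of the convolution of the image with each kernel at (row, col)."""
    half = kernels[0].shape[0] // 2
    padded = np.pad(np.asarray(image, dtype=float), half, mode="edge")
    patch = padded[row:row + 2 * half + 1, col:col + 2 * half + 1]
    # For these kernels psi(-d) == conj(psi(d)), so this sum is the convolution value.
    return np.array([abs(np.sum(patch * np.conj(k))) for k in kernels])

def jet_similarity(j1, j2):
    """Normalized dot product of two jets (1.0 for identical non-zero jets)."""
    denom = np.linalg.norm(j1) * np.linalg.norm(j2)
    return float(np.dot(j1, j2) / denom) if denom > 0 else 0.0

def graph_cost(image_pos, model_pos, image_jets, model_jets, edges, lam):
    """Combined cost: lam * (quadratic edge mismatch) - (sum of jet similarities)."""
    edge_term = 0.0
    for i, j in edges:
        d_image = image_pos[i] - image_pos[j]
        d_model = model_pos[i] - model_pos[j]
        edge_term += float(np.sum((d_image - d_model) ** 2))
    vertex_term = sum(jet_similarity(a, b) for a, b in zip(image_jets, model_jets))
    return lam * edge_term - vertex_term

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((48, 48))          # stand-in for a face image
    kernels = [gabor_kernel(k) for k in wave_vectors()]
    jet = jet_at(image, 24, 24, kernels)
    print(len(jet), jet_similarity(jet, jet))
```

In this sketch a very large `lam` makes any change in edge vectors prohibitively expensive, so only rigid shifts of the whole graph remain cheap; a smaller `lam` lets individual vertices move to improve their jet similarity, which corresponds to the local distortion stage.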
Results: Office Items

Comparison of Two Galleries

More Results

More Results (cont'd)

Recognition Results Against Galleries
Recognition results against a gallery of 20, 50, and 111 neutral frontal views

Conclusion
Close to the natural model: only a small number of examples is needed for face recognition.
Gabor wavelet representations are robust to moderate lighting changes, shifts, and deformations.
Elastic Graph Matching in the Dynamic Link Architecture is robust in face recognition.

Conclusion
1. Having only a few images per person in the gallery does not provide sufficient information to handle 3D rotation.
2. Rectangular grid vs. feature points

References
1. M. Lades, J.C. Vorbrüggen, J. Buhmann, J. Lange, C. von der Malsburg, R.P. Würtz, W. Konen. Distortion Invariant Object Recognition in the Dynamic Link Architecture. IEEE Transactions on Computers, 42(3):300-311, 1993.
2. Laurenz Wiskott, Jean-Marc Fellous, Norbert Krüger, et al. Face Recognition by Elastic Bunch Graph Matching. Proc. 7th Intern. Conf. on Computer Analysis of Images and Patterns, CAIP'97, Kiel.