Image-based Pattern Recognition Principles
Presenter: Ke-Jie Liao
Advisor: Jian-Jiun Ding, Ph.D., Professor
DISP Lab, GICE, National Taiwan University, Taipei, Taiwan, ROC

Outline
- Introduction
- 2D Matched Filter
- Image Registration
- Bayes Statistical Classifier
- Neural Networks
- Syntactic Recognition
- Face Recognition

Introduction
Fig. 1: Basic components of a pattern recognition system [8]

Introduction
- Data acquisition and sensing
- Pre-processing
  - Removal of noise in the data.
  - Isolation of the patterns of interest from the background.
- Feature extraction
  - Finding a new representation in terms of features that is better suited for further processing.

Introduction
- Model learning and estimation
  - Learning a mapping between features and pattern groups.
- Classification
  - Using the learned models to assign a pattern to a predefined category.
- Post-processing
  - Evaluation of confidence in the decisions.
  - Exploitation of context to improve performance.

Table 1: Examples of pattern recognition applications [8]

2D Matched Filter
- Functionality
  - Reducing the effect of noise.
  - Computing the similarity of two objects (template matching for images).
- Functional block: the input image I(m,n) is passed through a 2D matched filter with impulse response H^*(-m,-n), derived from the template image H(m,n), to produce the output image Y(m,n).

2D Matched Filter
- Mathematical expression
  - Without normalization (convolution with the reversed, conjugated template):
    Y(m,n) = I(m,n) * H^*(-m,-n)
  - With normalization:
    Y(m,n) = \frac{\sum_{m_1}\sum_{n_1} I(m+m_1, n+n_1)\, H^*(m_1, n_1)}{\sqrt{\sum_{m_1}\sum_{n_1} |H(m_1, n_1)|^2}\, \sqrt{\sum_{m_1}\sum_{n_1} |I(m+m_1, n+n_1)|^2}}

2D Matched Filter : Template Matching
- The input image I(m,n) is matched against the template image H(m,n) by the 2D matched filter; example output images are shown without and with normalization, including the case of a rotated template.

2D Matched Filter : Template Matching
- Drawbacks
  - Poor discriminative ability with respect to template shape (the structural relations of the patterns are ignored).
  - Changes in rotation and magnification of the template object require testing an enormous number of templates.
  - Template matching is therefore usually limited to smaller local features, which are more invariant to size and shape variations of an object.

Image Registration
- What is image registration?
  - Aligning images correctly so that subsequent systems perform better.
- Sources of misregistration between images
  - Translational differences
  - Scale differences
  - Rotational differences

Image Registration : Detecting the Translational Parameter
- Spatial-domain approach: the normalized 2D matched filter; the position of the highest output value is the best translational position.
- Frequency-domain approach: the phase correlation method.

Image Registration : Detecting the Translational Parameter
- Phase correlation method (a NumPy sketch of this idea follows at the end of this section)
  f_2(x, y) = f_1(x - x_0, y - y_0)
  Fourier transform:
  F_2(w_x, w_y) = F_1(w_x, w_y)\, e^{-i(w_x x_0 + w_y y_0)}
  Cross-power spectrum:
  G(w_x, w_y) = \frac{F_1^*(w_x, w_y)\, F_2(w_x, w_y)}{|F_1(w_x, w_y)\, F_2(w_x, w_y)|} = e^{-i(w_x x_0 + w_y y_0)}
  Inverse Fourier transform:
  g(x, y) = \delta(x - x_0, y - y_0)
  The location of the impulse gives the translational displacement.

Image Registration : Detecting the Scale and Rotational Parameters
- Detecting the rotational parameter
  f_2(x, y) = f_1(x\cos\theta_0 + y\sin\theta_0 - x_0,\; -x\sin\theta_0 + y\cos\theta_0 - y_0)
  Fourier transform:
  F_2(w_x, w_y) = F_1(w_x\cos\theta_0 + w_y\sin\theta_0,\; -w_x\sin\theta_0 + w_y\cos\theta_0)\, e^{-i(w_x x_0 + w_y y_0)}
  Taking the magnitude of both sides and representing it in polar form:
  M_2(\rho, \theta) = M_1(\rho, \theta - \theta_0)

Image Registration : Detecting the Scale and Rotational Parameters
- Detecting the scale parameter
  f_2(x, y) = f_1(ax, by)
  Fourier transform:
  F_2(w_x, w_y) = \frac{1}{|ab|} F_1\!\left(\frac{w_x}{a}, \frac{w_y}{b}\right)
  Converting the frequency variables to a logarithmic scale:
  F_2(\log w_x, \log w_y) = \frac{1}{|ab|} F_1(\log w_x - \log a,\; \log w_y - \log b)
  so scaling becomes a translation in the log-frequency domain.
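Image Registration : Phase Correlation (Code Sketch)
The following is a minimal NumPy sketch of the phase-correlation idea from this section: it forms the magnitude-normalized cross-power spectrum of two translated images and reads the displacement off the location of the resulting impulse. The function name estimate_shift and the synthetic random test image are illustrative assumptions, not part of the original slides.

    import numpy as np

    def estimate_shift(img1, img2, eps=1e-12):
        """Estimate the translation of img2 relative to img1 by phase correlation."""
        F1 = np.fft.fft2(img1)
        F2 = np.fft.fft2(img2)
        # Normalized cross-power spectrum: keep only the phase, discard the magnitude.
        G = np.conj(F1) * F2
        G /= np.abs(G) + eps
        # The inverse transform is (ideally) an impulse at the displacement.
        corr = np.fft.ifft2(G).real
        dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
        # Interpret indices past the midpoint as negative shifts (circular wrap-around).
        if dy > img1.shape[0] // 2:
            dy -= img1.shape[0]
        if dx > img1.shape[1] // 2:
            dx -= img1.shape[1]
        return dy, dx

    # Self-test with a synthetic image shifted by (5, -3).
    rng = np.random.default_rng(0)
    img = rng.random((64, 64))
    shifted = np.roll(img, shift=(5, -3), axis=(0, 1))
    print(estimate_shift(img, shifted))   # expected: (5, -3)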
Bayes Statistical Classifiers
- Consideration: the randomness of patterns.
- Decision criterion: pattern x is assigned to class w_i if
  \sum_{k=1}^{W} L_{ki}\, p(x|w_k) P(w_k) < \sum_{q=1}^{W} L_{qj}\, p(x|w_q) P(w_q) \quad \text{for all } j \neq i
  where
  L_{ij}: loss incurred when a pattern from class w_i is misclassified as class w_j
  p(x|w_i): probability density function of a pattern x given that it comes from class w_i
  P(w_i): probability of occurrence of class w_i

Bayes Statistical Classifiers
- Decision criterion given the symmetric 0-1 loss function L_{ij} = 1 - \delta_{ij} (zero loss for correct decisions, unit loss for errors)
  - Posterior-probability decision rule: assign x to class w_i if
    p(x|w_i) P(w_i) > p(x|w_j) P(w_j) \quad \text{for all } j \neq i
  - Decision functions: d_j(x) = p(x|w_j) P(w_j) \propto P(w_j|x)
  - Pattern x is assigned to the class j whose d_j(x) yields the largest value.

Bayes Statistical Classifiers
- Advantages
  - Optimal in the sense of minimizing the total average loss due to misclassification.
- Disadvantages
  - Both P(w_j) and p(x|w_j) must be known in advance, so estimation is required.
  - Performance depends heavily on the assumed forms of the distributions P(w_j) and p(x|w_j).

Neural Networks
- What is a neural network?
  - The ideas stem from the operation of human neural networks.
  - Networks of interconnected nonlinear computing elements called neurons.

Neural Networks
- Perceptron: a two-class model.
Fig. 2: Structure of the perceptron

Neural Networks : Multilayer Feedforward Neural Networks
- Basic structure
Fig. 3: Structure of a multilayer feedforward neural network

Neural Networks : Multilayer Feedforward Neural Networks
- Training algorithm: back-propagation
- Sigmoid activation function:
  h_j(I_j) = \frac{1}{1 + e^{-(I_j + \theta_j)/\theta_0}}, \qquad I_j = \sum_{k=1}^{N_k} w_{jk} O_k, \qquad O_k = h_k(I_k)
Fig. 4: Blow-up of a single neuron [1]

Neural Networks : Multilayer Feedforward Neural Networks
1. Initialization
   Assign an arbitrary set of weights throughout the network (not all equal).
2. Iterative step
   a. Compute O_j for each node using a training vector, then generate the error term for each output node,
      \delta_q = (r_q - O_q)\, h_q'(I_q),
      where r_q is the desired response.
   b. Backward pass: an appropriate error signal is passed back to each node and the corresponding weight changes are made.

Neural Networks
- Decision surface complexity
Table 2: Decision surface complexity of multilayer feedforward neural networks [1]

Syntactic Recognition
- Concerned with the structural relations among pattern components.
- Patterns are represented as combinations of primitives.
Fig. 5: Conceptual diagram of syntactic recognition

Syntactic Recognition : String Case
- The inputs to the automata are unknown sentences generated by the corresponding grammars.
- The grammar G = (N, Σ, P, S)
  - N is a finite set of variables called nonterminals,
  - Σ is a finite set of constants called terminals,
  - P is a set of rewriting rules called productions, and
  - S in N is the starting symbol.

Syntactic Recognition : String Case
- An example
  N = {A, B, S}, Σ = {a, b, c}
  P = {S → aA, A → bA, A → bB, B → c}
  S → aA → abA → abbA → … → abbbbbc
  L(G) = {ab^n c | n ≥ 1}
Fig. 6: An example of a string language [1]

Syntactic Recognition : String Case
- The finite automaton A_f = (Q, Σ, δ, q_0, F)
  - Q is a finite, nonempty set of states,
  - Σ is a finite input alphabet,
  - δ is a mapping from Q × Σ into the collection of all subsets of Q,
  - q_0 is the starting state, and
  - F is a set of final states.

Syntactic Recognition : String Case
- A simple automaton (simulated in the code sketch after this section)
Fig. 7: State machine of the automaton [1]
  A_f = (Q, Σ, δ, q_0, F), Q = {q_0, q_1, q_2}, Σ = {a, b}, F = {q_0}
  δ(q_0, a) = {q_2}   δ(q_0, b) = {q_1}
  δ(q_1, a) = {q_2}   δ(q_1, b) = {q_0}
  δ(q_2, a) = {q_0}   δ(q_2, b) = {q_1}
  Invalid input string: bababbb
  Valid input string: aaabbbb

Syntactic Recognition : String Case
- Conversion between a regular grammar and the corresponding automaton states.
  G = (N, Σ, P, S), A_f = (Q, Σ, δ, q_0, F), with X_0 ≡ S, N = {X_0, …, X_n}, Q = {q_0, q_1, …, q_n, q_{n+1}}
  The mappings in δ are obtained by the following two rules, for a in Σ and each i and j with 0 ≤ i ≤ n, 0 ≤ j ≤ n:
  1. If X_i → aX_j is in P, then δ(q_i, a) contains q_j.
  2. If X_i → a is in P, then δ(q_i, a) contains q_{n+1}.
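Syntactic Recognition : String Case (Code Sketch)
A minimal Python sketch of the string-acceptance test for the automaton of Fig. 7. The transition table is copied from the slide; the dictionary encoding and the function name accepts are assumptions made only for this illustration.

    # Transition table of the automaton in Fig. 7: (state, symbol) -> set of next states.
    DELTA = {
        ('q0', 'a'): {'q2'}, ('q0', 'b'): {'q1'},
        ('q1', 'a'): {'q2'}, ('q1', 'b'): {'q0'},
        ('q2', 'a'): {'q0'}, ('q2', 'b'): {'q1'},
    }
    START, FINAL = 'q0', {'q0'}

    def accepts(string):
        """Return True if the automaton ends in a final state after reading `string`."""
        states = {START}
        for symbol in string:
            states = set().union(*(DELTA.get((s, symbol), set()) for s in states))
            if not states:           # no valid transition left: reject immediately
                return False
        return bool(states & FINAL)

    print(accepts('aaabbbb'))   # True  (the valid string from the slide)
    print(accepts('bababbb'))   # False (the invalid string from the slide)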
Syntactic Recognition : String Case
- The grammars are not known in advance, so we must learn the automata from sample patterns.
- Given an unknown grammar G and a finite set of samples R+:
  h(z, R+, k) = {w | zw ∈ R+, |w| ≤ k},  z ∈ Σ*
  Q = {q | q = h(z, R+, k) for z ∈ Σ*}
  δ(q, a) = {q' ∈ Q | q' = h(za, R+, k), with q = h(z, R+, k)}
  q_0 = h(λ, R+, k)
  F = {q | q ∈ Q, λ ∈ q}

Syntactic Recognition : String Case
- An example of learning the automaton structure from the given sample set R+ = {a, ab, abb}, with k = 1 (λ denotes the empty string).
- Determining h(z, R+, k):
  z = λ:    h(λ, R+, 1) = {w | λw ∈ R+, |w| ≤ 1} = {a} = q_0
  z = a:    h(a, R+, 1) = {w | aw ∈ R+, |w| ≤ 1} = {λ, b} = q_1
  z = ab:   h(ab, R+, 1) = {w | abw ∈ R+, |w| ≤ 1} = {λ, b} = q_1
  z = abb:  h(abb, R+, 1) = {w | abbw ∈ R+, |w| ≤ 1} = {λ} = q_2

Syntactic Recognition : String Case
- Obtaining the mapping function, with Q = {q_0, q_1, q_2, q_3}, where q_3 denotes the empty-set state:
  From h(λ, R+, 1) = q_0 (z = λ):
    δ(q_0, a) = h(λa, R+, 1) = h(a, R+, 1) = q_1
    δ(q_0, b) = h(λb, R+, 1) = h(b, R+, 1) = q_3
  From h(a, R+, 1) = h(ab, R+, 1) = q_1:
    δ(q_1, a) = h(aa, R+, 1) = h(aba, R+, 1) = q_3
    δ(q_1, b) ⊇ h(ab, R+, 1) = q_1 and δ(q_1, b) ⊇ h(abb, R+, 1) = q_2, so δ(q_1, b) = {q_1, q_2}
  δ(q_2, a) = δ(q_2, b) = δ(q_3, a) = δ(q_3, b) = q_3
- Obtaining the final states F: q_1 = {λ, b} and q_2 = {λ} both contain λ, so F = {q_1, q_2}.

Syntactic Recognition : String Case
- State diagram for the finite automaton inferred from the sample set R+ = {a, ab, abb}.

Syntactic Recognition : String Case
Fig. 8: Graphical relation between k and L[A_f(R+, k + 1)]

Face Recognition
- A user-friendly pattern recognition application.
- Weaknesses of face recognition
  - Illumination problems
  - Pose problems (profile versus frontal view)
Fig. 9: Examples of illumination problems [9]

Face Recognition : Eigenspace-Based Approach
- A holistic approach.
- Reduces the high-dimensionality problem and the large computational complexity.
- Vectorization: a face image of size 200×180 is stacked into a 36000×1 vector.

Face Recognition : Standard Eigenspace-Based Approach
- Given a set of training face images, compute the eigenvectors of the distribution of the face images within the entire image space (the PCA method):
  \Psi = \frac{1}{M}\sum_{n=1}^{M} \Gamma_n, \qquad \Phi_i = \Gamma_i - \Psi
  C = \frac{1}{M}\sum_{n=1}^{M} \Phi_n \Phi_n^T = A A^T, \qquad A = [\Phi_1, \Phi_2, \dots, \Phi_M]
  Γ_n: face vectors (length N²); Ψ: mean face vector; C: covariance matrix of the training set (size N² × N²); A: matrix of size N² × M; M: number of training face images.

Face Recognition : Standard Eigenspace-Based Approach
- C is too big! The eigenvalue problem can be reduced from order N² × N² to order M × M by the following analysis:
  A^T A v_i = \mu_i v_i \;\Rightarrow\; A A^T (A v_i) = C (A v_i) = \mu_i (A v_i)
  v_i: eigenvectors of A^T A; μ_i: eigenvalues of both A^T A and C; A v_i: eigenvectors of C.

Face Recognition : Standard Eigenspace-Based Approach
Fig. 10: Training set
Fig. 11: Mean face

Face Recognition : Standard Eigenspace-Based Approach
Fig. 12: Eigenfaces

Face Recognition : Standard Eigenspace-Based Approach
- Operation of the standard eigenspace approach (a NumPy sketch follows at the end of this section): a new face image Γ is projected onto the eigenfaces,
  w_k = u_k^T (\Gamma - \Psi), \qquad \Omega = [w_1, w_2, \dots, w_{M'}]^T
  u_k: eigenvectors (eigenfaces); M': number of retained eigenvectors.
- Two Euclidean distances are compared:
  \varepsilon_k^2 = \|\Omega - \Omega_k\|^2, \qquad \varepsilon^2 = \|\Phi - \Phi_f\|^2
  Ω_k: prototype weight vector of class k; Φ = Γ − Ψ; Φ_f: projection of Φ onto the face space.

Face Recognition : Standard Eigenspace-Based Approach
- Four possibilities
  1. Near the face space and near a face class.
  2. Near the face space but not near a known face class.
  3. Distant from the face space and near a face class.
  4. Distant from the face space and not near a known face class.
Fig. 13: Geometric relationship between ε_k and ε
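Face Recognition : Standard Eigenspace-Based Approach (Code Sketch)
A minimal NumPy sketch of the reduced M×M eigenvalue computation and the projection step described above. The synthetic random training matrix, the choice M' = 10, and the names faces, mean_face, and project are illustrative assumptions standing in for real face data.

    import numpy as np

    # Assume `faces` holds the M vectorized training faces as its columns (size N^2 x M);
    # synthetic random data stands in for real face images here.
    rng = np.random.default_rng(1)
    N2, M = 36000, 20                 # e.g. 200x180 images, 20 training faces
    faces = rng.random((N2, M))

    mean_face = faces.mean(axis=1, keepdims=True)     # Psi
    A = faces - mean_face                             # columns are Phi_i = Gamma_i - Psi

    # Solve the small M x M problem (A^T A) v = mu v instead of the N^2 x N^2 one.
    mu, V = np.linalg.eigh(A.T @ A)
    order = np.argsort(mu)[::-1]                      # decreasing eigenvalue order
    M_prime = 10                                      # number of retained eigenfaces
    U = A @ V[:, order[:M_prime]]                     # eigenvectors of C = A A^T (eigenfaces)
    U /= np.linalg.norm(U, axis=0)                    # normalize each eigenface

    def project(gamma):
        """Weight vector Omega = [w_1, ..., w_M'] of a vectorized face image gamma."""
        return U.T @ (gamma - mean_face)

    # Distance to the face space (epsilon in the slides) for one probe image.
    probe = faces[:, [0]]
    omega = project(probe)
    reconstruction = U @ omega + mean_face
    eps = np.linalg.norm(probe - reconstruction)
    print(omega.ravel()[:3], eps)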
Face Recognition : FLD Eigenspace-Based Approach
- FLD (Fisher Linear Discriminant): another method that searches for the projection axes.
- FLD searches for projection axes on which the face images of different classes are far from each other (similarly to PCA), while at the same time the images of the same class are close to each other.

Face Recognition : FLD Eigenspace-Based Approach
Fig. 14: Comparison between PCA and FLD [9]

Face Recognition : FLD Eigenspace-Based Approach
- Mathematical expression: select the projection unit vector u such that φ(u) is maximized,
  \varphi(u) = \frac{u^T S_b u}{u^T S_w u}
  S_b = \sum_{i=1}^{N_C} P(C_i)\,(m^{(i)} - m)(m^{(i)} - m)^T
  S_w = \sum_{i=1}^{N_C} P(C_i)\, E\big[(x^{(i)} - m^{(i)})(x^{(i)} - m^{(i)})^T\big]
  S_b: measures the separation between the individual class means with respect to the global mean face.
  S_w: measures the spread of the vectors of each class with respect to their own class mean.
- Using a Lagrange multiplier with the constraint u^T S_w u = 1 leads to the generalized eigenvalue problem
  S_b w_k = \lambda_k S_w w_k
  (a small numerical sketch of this problem is given in the appendix after the references).

Conclusions
- Template matching is simple to implement, but the template size must be small to keep the computational delay low.
- Statistical methods depend heavily on the assumed distributions.
- Neural networks can adaptively refine the classifier, and the decision surface can in principle be made arbitrarily complex.
- Syntactic methods encode patterns in a structural sense, but an additional process to define the primitives is required.

Future Works
- Frequency-domain or wavelet-domain approaches.
- Applying image compression methods to face recognition.
- Video-based face recognition.
- Adding color information to face recognition.

References
[1] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 2nd ed., Prentice-Hall, 2002, pp. 693-750.
[2] W. K. Pratt, Digital Image Processing: PIKS Inside, 3rd ed., John Wiley & Sons, 2001, pp. 613-637.
[3] M. Turk and A. Pentland, "Eigenfaces for recognition," Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991.
[4] K.-K. Sung and T. Poggio, "Example-based learning for view-based human face detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 1, pp. 39-51, January 1998.
[5] W. Zhao, R. Chellappa, A. Rosenfeld, and P. J. Phillips, "Face recognition: a literature survey," ACM Computing Surveys, vol. 35, no. 4, pp. 399-458, 2003.
[6] J. Ruiz-del-Solar and P. Navarrete, "Eigenspace-based face recognition: a comparative study of different approaches," IEEE Transactions on Systems, Man, and Cybernetics—Part C: Applications and Reviews, vol. 35, no. 3, pp. 315-325, August 2005.
[7] P. Sinha, B. Balas, Y. Ostrovsky, and R. Russell, "Face recognition by humans: nineteen results all computer vision researchers should know about," Proceedings of the IEEE, vol. 94, no. 11, pp. 1948-1962, November 2006.
[8] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed., John Wiley & Sons, 2001.
[9] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, "Eigenfaces vs. Fisherfaces: recognition using class specific linear projection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711-720, July 1997.
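Appendix : FLD Generalized Eigenvalue Problem (Code Sketch)
A small numerical sketch of solving S_b w = λ S_w w from the FLD section, assuming SciPy is available. The two synthetic Gaussian classes and all variable names are illustrative assumptions and do not come from the slides.

    import numpy as np
    from scipy.linalg import eigh

    rng = np.random.default_rng(2)

    # Two synthetic classes of 2-D feature vectors (one sample per row).
    X1 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(100, 2))
    X2 = rng.normal(loc=[3.0, 1.0], scale=0.5, size=(100, 2))
    classes = [X1, X2]

    m = np.vstack(classes).mean(axis=0)                           # global mean
    priors = [len(Xc) / sum(len(c) for c in classes) for Xc in classes]

    # Between-class and within-class scatter matrices (Sb and Sw in the slides).
    Sb = sum(P * np.outer(Xc.mean(axis=0) - m, Xc.mean(axis=0) - m)
             for P, Xc in zip(priors, classes))
    Sw = sum(P * np.cov(Xc, rowvar=False, bias=True)
             for P, Xc in zip(priors, classes))

    # Generalized eigenvalue problem Sb w = lambda Sw w (eigh returns ascending eigenvalues).
    lam, W = eigh(Sb, Sw)
    w_fld = W[:, -1]                                              # axis with the largest ratio
    print("Fisher axis:", w_fld, "eigenvalue:", lam[-1])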