Course: T0283 - Computer Vision. Year: 2010.
Lecture 09: Pattern Recognition and Classification I

Learning Objectives
After carefully listening to this lecture, students will be able to do the following:
explain some object recognition and classification procedures in Computer Vision, such as k-NN, neural networks, and SVM;
demonstrate the use of feature vectors and decision boundary functions in the classification process.

Pattern Recognition and Classification
The goal of pattern recognition is to classify objects of interest into one of a number of categories or classes.
Uses: many systems must recognize things and make decisions automatically.
Everyday applications: speech recognition, fingerprint identification, DNA sequence identification, OCR (optical character recognition).
Medical applications: diagnosis from symptoms, locating structures in images, detection of abnormalities, etc.

Classifiers for Recognition
Examine each window of an image and classify the object class within each window based on a training set of images.
Approach: supervised learning (off-line process).
Feature extraction: discriminative features; features invariant with respect to translation, rotation and scaling.
Classifier: partitions the feature space into different regions.
Learning: use the features to determine the classifier. There are many different procedures for training classifiers and choosing models.

An Example: A Classification Problem
Categorize images of fish, say "sea bass" vs. "salmon".
Use features such as length, width, lightness, fin shape and number, mouth position, etc.
Steps: 1. Preprocessing (e.g., background subtraction) 2. Feature extraction 3. Classification.
(Example from Duda & Hart.)

Problem Analysis
Set up a camera and take some sample images to extract features: length, lightness, width, number and shape of fins, position of the mouth, etc.
This is the set of all suggested features to explore for use in our classifier!

Pre-Processing
Use a segmentation operation to isolate the fish from one another and from the background.
Information from a single fish is sent to a feature extractor, whose purpose is to reduce the data by measuring certain features.
The features are passed to a classifier.

Classification
Select the length of the fish as a possible feature for discrimination.

Decision Boundary
The length is a poor feature alone!
Select the lightness as a possible feature instead.

Threshold Decision Boundary and Cost Relationship
Move our decision boundary toward smaller values of lightness in order to minimize the cost (reduce the number of sea bass that are classified as salmon!).
Adopt the lightness and add the width of the fish, giving the feature vector x^T = [x1, x2], where x1 is lightness and x2 is width.
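To make the threshold-and-cost idea concrete, here is a minimal Python sketch (not from the lecture; the sample lightness values and the cost weights are invented for illustration). It scores candidate lightness thresholds with a cost that penalises sea bass classified as salmon more heavily, which pushes the chosen threshold toward smaller lightness values, as described above.

```python
import numpy as np

def classify_by_lightness(lightness, threshold):
    """Label a fish 'salmon' if its lightness falls below the threshold, else 'sea bass'."""
    return "salmon" if lightness < threshold else "sea bass"

def total_cost(threshold, salmon, sea_bass,
               cost_bass_as_salmon=2.0, cost_salmon_as_bass=1.0):
    """Weighted misclassification cost of a lightness threshold.

    Sea bass mistaken for salmon is penalised more heavily, so the
    cheapest threshold sits at smaller lightness values.
    """
    bass_errors = np.sum(sea_bass < threshold)    # sea bass classified as salmon
    salmon_errors = np.sum(salmon >= threshold)   # salmon classified as sea bass
    return cost_bass_as_salmon * bass_errors + cost_salmon_as_bass * salmon_errors

# Invented lightness samples for the two classes (salmon tend to be darker).
salmon = np.array([2.1, 2.8, 3.0, 3.5, 4.2])
sea_bass = np.array([3.8, 4.5, 5.0, 5.6, 6.1])

candidates = np.linspace(2.0, 6.0, 81)
costs = [total_cost(t, salmon, sea_bass) for t in candidates]
best = candidates[np.argmin(costs)]
print(best)                              # about 3.55 for these toy numbers
print(classify_by_lightness(3.0, best))  # -> salmon
```

Raising cost_bass_as_salmon moves the cheapest threshold further left, which is exactly the trade-off the slide describes.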
We might add other features that are not correlated with the ones we already have.
A precaution should be taken not to reduce the performance by adding such "noisy features".
Ideally, the best decision boundary should be the one which provides optimal performance, such as in the following figure.
However, our satisfaction is premature, because the central aim of designing a classifier is to correctly classify novel input: the issue of generalization!

Bayes Risk
Some errors may be inevitable: the minimum risk (shaded area) is called the Bayes risk.
(Figure: class-conditional probability density functions; the area under each curve sums to 1.)

Histogram-Based Classifiers
Use a histogram to represent the class-conditional densities (i.e. p(x | class 1), p(x | class 2), etc.).
Advantage: the estimates converge towards the correct values with enough data.
Disadvantage: the histogram becomes big in high dimensions and so requires too much data, but maybe we can assume feature independence?

Application: Skin Colour Histograms
Skin has a very small range of (intensity-independent) colours, and little texture.
Compute a colour measure, check whether the colour is in this range, and check whether there is little texture (median filter).
Get the class-conditional densities (histograms) and the priors from data (by counting).
The classifier then labels a pixel as skin when p(x | skin) P(skin) > p(x | not skin) P(not skin).

Skin Colour Models
(Figure: skin chrominance points; smoothed, [0, 1]-normalized histogram.)

Skin Colour Classification
(Figure courtesy of G. Loy.)

Results
(Figure from "Statistical color models with application to skin detection," M.J. Jones and J. Rehg, Proc. Computer Vision and Pattern Recognition, 1999. Copyright 1999, IEEE.)

Nearest Neighbor Classifier
A useful non-linear classifier.
Retain the whole training set and assign a new example the class of the closest training vector.
This requires a distance metric d(X1, X2); a common choice is the Euclidean distance d(X1, X2) = ||X1 - X2||.
Assign the label of the nearest training data point to each test data point.
(Figure from Duda et al.: Voronoi partitioning of the feature space for 2-category 2-D and 3-D data.)

K-Nearest Neighbors
Rather than choosing the single closest point, find the k closest (with k odd).
If kA of them are from class A and kB from class B, choose class A if kA > kB.
This is more robust than the single nearest neighbor.
For a new point, find the k closest points from the training data; the labels of the k points "vote" to classify it, as sketched in the code below.
This avoids a fixed scale choice and uses the data itself, which can be very important in practice.
A simple method that works well if the distance measure correctly weights the various dimensions.
(Figures from Duda et al.: k-NN classification with k = 5, and an example density estimate.)
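A minimal k-NN sketch in Python (NumPy assumed; the function name, toy fish features, and labels are illustrative, not from the lecture). It computes the Euclidean distance to every training vector, takes the k closest, and returns the majority label:

```python
import numpy as np
from collections import Counter

def knn_classify(x, train_X, train_y, k=5):
    """Classify x by majority vote among its k nearest training samples.

    train_X : (N, d) array of training feature vectors
    train_y : length-N array of class labels
    """
    # Euclidean distance d(x, Xi) = ||x - Xi|| to every training vector
    dists = np.linalg.norm(train_X - x, axis=1)
    nearest = np.argsort(dists)[:k]           # indices of the k closest points
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]         # majority class label

# Toy fish data with feature vector x = [lightness, width]
train_X = np.array([[2.0, 10.0], [2.5, 11.0], [6.0, 15.0], [6.5, 14.0], [5.5, 16.0]])
train_y = np.array(["salmon", "salmon", "sea bass", "sea bass", "sea bass"])
print(knn_classify(np.array([6.2, 15.5]), train_X, train_y, k=3))  # -> sea bass
```

With k = 1 this reduces to the nearest-neighbor rule above; choosing k odd avoids ties in the two-class case.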
Neural Networks
Compose layered classifiers.
Use a weighted sum of the elements at the previous layer to compute the results at the next layer.
Apply a smooth threshold function from each layer to the next (this introduces non-linearity).
Initialize the network with small random weights, then learn all the weights by gradient descent (i.e., perform small adjustments to improve the results).

Training
Adjust the parameters to minimize the error on the training set.
Perform gradient descent, making small changes in the direction of the derivative of the error with respect to each parameter.
Stop when the error is low and hasn't changed much.
The network itself is designed by hand to suit the problem, so only the weights are learned.
(Figure: architecture of the complete system: another neural net estimates the orientation of the face, which is then rectified; the search runs over scales to find bigger/smaller faces. From "Rotation invariant neural-network based face detection," H.A. Rowley, S. Baluja and T. Kanade, Proc. Computer Vision and Pattern Recognition, 1998. Copyright 1998, IEEE.)

Support Vector Machines
Try to obtain the decision boundary directly.
This is potentially easier, because we need to encode only the geometry of the boundary, not any irrelevant wiggles in the posterior.
Not all points affect the decision boundary.

Support Vectors
(Figure showing the support vectors.)

Decision Boundary: A Computation Example
The boundary between classes i and j is where g_ij(X) = g_i(X) - g_j(X) = 0; the classifier decides class i where g_ij(X) > 0 and class j where g_ij(X) < 0.
Discriminant function: g_i(X) = X^T r_i - (1/2) r_i^T r_i, where r_i is the mean vector of class i.

Decision Boundaries
The mean vectors for the three classes are given by C1 = (1, 2)^T, C2 = (3, 6)^T, and C3 = (7, 2)^T.
Substituting r1 = (1, 2)^T into the discriminant function g_i(X) = X^T r_i - (1/2) r_i^T r_i gives
g_1(X) = x1 + 2 x2 - 2.5,
and with r2 = (3, 6)^T we get
g_2(X) = 3 x1 + 6 x2 - 22.5.
The decision boundary function g_12 is
g_12(X) = g_1(X) - g_2(X) = -2 x1 - 4 x2 + 20 = 0, which simplifies to x1 + 2 x2 - 10 = 0.
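As a check on the worked example, the sketch below (Python with NumPy; the helper names are mine, not the lecture's) evaluates the same discriminants g_i(X) = X^T r_i - (1/2) r_i^T r_i for the three class means and confirms that a point on the line x1 + 2 x2 - 10 = 0 scores equally under g_1 and g_2.

```python
import numpy as np

# Class mean vectors C1, C2, C3 from the example
means = np.array([[1.0, 2.0],
                  [3.0, 6.0],
                  [7.0, 2.0]])

def discriminant(X, r):
    """g_i(X) = X^T r_i - 0.5 * r_i^T r_i (minimum-distance discriminant)."""
    return X @ r - 0.5 * (r @ r)

def classify(X):
    """Assign X to the class whose discriminant value is largest (classes 1..3)."""
    return 1 + int(np.argmax([discriminant(X, r) for r in means]))

# X = (4, 3) lies on x1 + 2*x2 - 10 = 0, so g1 and g2 agree there.
X = np.array([4.0, 3.0])
print(discriminant(X, means[0]), discriminant(X, means[1]))  # 7.5 7.5
print(classify(np.array([2.0, 1.0])))                        # -> 1 (closest to C1)
```

The remaining boundaries g_13 and g_23 follow the same pattern, substituting C3 = (7, 2)^T into the discriminant.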