Object Detection Using the Statistics of Parts
Henry Schneiderman, Takeo Kanade
Presented by: Sameer Shirdhonkar
December 11, 2003

Overview

Main Features of Paper
• Multiple Exhaustive Classifiers
• Parts-based representation: Discretized Wavelet Coefficients
• Estimating probabilities: AdaBoost with Confidence-Weighted Predictions

Classifier Design
• Part: Set of input features which are statistically interdependent, and independent of other features.
• Wavelet Coefficients as Features: Linear-phase 5/3 perfect reconstruction filter bank
  – Invertible transform [but not after quantization]
  – Partially decorrelates natural scenes: fewer features needed
  – Parts can be localized in space, frequency and orientation
  – Multiresolution nature speeds up computation

Classifier Form
• Likelihood Ratio Test [used similarly to SPRT]
• Generalization of the Ideal Classifier Table [Object present/absent for all possible feature values]
• Convert P(Image|Object) and P(Image|Non-Object) to P(Object|Image)
• Change P(Object|Image) to Classifier output (present/absent)

Approximations
• Parts are statistically independent
  – Localized dependence for cars, faces, etc.
• Part values (Wavelet Transform coefficients) are quantized
• Part positions are quantized coarsely

Local Operators
• Locality in position is more important
• Local Operator: Moving combination of Wavelet coefficients

Local Operator Design
• Intra-subband operators (6)
  – Joint localization in space, frequency and orientation
• Inter-orientation operators (4)
  – Localization in space and frequency, different orientations
• Inter-frequency operators (6)
  – Localization in space and orientation, broad frequency content
• Inter-orientation + inter-frequency operator (1)
  – Localization in space, different frequency and orientation

The Hard Part: Collecting Data
• Pre-processing Object Images:
  – Size normalization and spatial alignment
  – Intensity normalization and lighting correction
  – Separate normalizations for left and right parts of face (5 discrete values)
  – Synthesizing data: positional perturbation, overcomplete evaluation of wavelet transform, background substitution, low-pass filtering
• Non-object images: Bootstrapping

Training
• Probabilistic Approximation
  – Filling the histogram bins of Parts
• AdaBoost:
  – Train multiple classifiers ht(x) with weighted training samples.
  – First classifier h1(x): equal weights to all samples.
  – Next classifiers: higher weight to incorrectly classified samples.
  – Final classifier: H(x) = sign(Σt αt ht(x))
  – αt found by binary search
  – The weighted sum of classifiers reduces to a single classifier due to linearity (in log likelihood).

Efficient Exhaustive Search [Does this exist?]
• Algorithm uses exhaustive search across position, size, orientation, alignment and intensity.
• Coarse-to-fine evaluation, similar to SPRT
• Wavelet Transform coefficients can be reused for multiple scales
• Color preprocessing
• Time: 5 s for a 240x256 image (PII 450 MHz)

Results: Face Detection
Sometimes it Works
And Sometimes it Doesn’t

Results: Car Detection

Discussion
Which are the Important Parts?
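
Sketch: Likelihood Ratio Test over Quantized Parts
The classifier form and the histogram-filling step described in the slides can be illustrated with a small sketch. It assumes each candidate window has already been reduced to a tuple of quantized part values, and it treats the parts as statistically independent, so the log likelihood ratio is a sum of per-part table lookups. The function names, the additive smoothing term and the threshold are assumptions made for illustration; they are not taken from the paper.

```python
import math
from collections import Counter

def train_part_tables(object_samples, non_object_samples, num_values, smoothing=1.0):
    """Fill one histogram per part and per class, then store log ratios.

    Each sample is a tuple of quantized part values in range(num_values).
    The smoothing constant is an assumption; it keeps empty bins from
    producing a zero probability.
    """
    num_parts = len(object_samples[0])
    n_obj = len(object_samples)
    n_non = len(non_object_samples)
    tables = []
    for k in range(num_parts):
        obj_counts = Counter(s[k] for s in object_samples)
        non_counts = Counter(s[k] for s in non_object_samples)
        table = []
        for v in range(num_values):
            p_obj = (obj_counts[v] + smoothing) / (n_obj + smoothing * num_values)
            p_non = (non_counts[v] + smoothing) / (n_non + smoothing * num_values)
            table.append(math.log(p_obj / p_non))
        tables.append(table)
    return tables

def log_likelihood_ratio(tables, sample):
    """Sum per-part log ratios (independence assumption between parts)."""
    return sum(table[v] for table, v in zip(tables, sample))

def classify(tables, sample, log_threshold=0.0):
    """Report 'object present' when the log likelihood ratio exceeds the threshold."""
    return log_likelihood_ratio(tables, sample) > log_threshold
```

Working in the log domain turns the product of per-part ratios into a sum, which is also what makes a weighted sum of such classifiers collapse into a single set of tables.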
Conclusion
• Works pretty well
• Training is difficult and needs too much manual intervention
• Slow, due to exhaustive search

How many faces in this picture?
What about this?
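
Sketch: AdaBoost with Confidence-Weighted Predictions
The AdaBoost loop from the Training slide can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes labels in {-1, +1} and weak classifiers whose outputs lie in [-1, 1]. The names find_alpha, adaboost and predict, the train_weak callback, and the search interval [0, 10] for αt are all hypothetical choices.

```python
import math

def find_alpha(weights, margins, lo=0.0, hi=10.0, iters=60):
    """Binary search for the alpha that minimizes
    Z(alpha) = sum_i w_i * exp(-alpha * u_i), with u_i = y_i * h(x_i).

    Z is convex in alpha, so its derivative is monotone and the sign of
    the derivative tells us which half of the interval to keep.
    The bounds lo and hi are assumptions.
    """
    def dz(alpha):
        return -sum(w * u * math.exp(-alpha * u) for w, u in zip(weights, margins))

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if dz(mid) < 0:
            lo = mid   # Z still decreasing: minimum lies to the right
        else:
            hi = mid   # Z increasing: minimum lies to the left
    return 0.5 * (lo + hi)

def adaboost(samples, labels, train_weak, rounds):
    """train_weak(samples, labels, weights) must return a function h(x)."""
    n = len(samples)
    weights = [1.0 / n] * n            # first classifier: equal weights
    ensemble = []
    for _ in range(rounds):
        h = train_weak(samples, labels, weights)
        margins = [y * h(x) for x, y in zip(samples, labels)]
        alpha = find_alpha(weights, margins)
        ensemble.append((alpha, h))
        # Misclassified samples (negative margin) gain weight.
        weights = [w * math.exp(-alpha * u) for w, u in zip(weights, margins)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Final classifier: sign of the weighted sum of weak classifiers."""
    score = sum(alpha * h(x) for alpha, h in ensemble)
    return 1 if score > 0 else -1
```

For weak classifiers that output exactly -1 or +1, the alpha found this way agrees with the closed form 0.5 * ln((1 - e) / e), where e is the weighted error.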