Object detection
Presented by Minh Hoai Nguyen
Date: 28 March 2007

Object detection?
What we want
Image captions: Miss a face! Happy face!

Scanning window
• Train a classifier on a fixed-size window

Outline
• Object Detection Using the Statistics of Parts
  – Schneiderman, H. & Kanade, T. CVPR00, IJCV04
• Robust Real-time Face Detection
  – Viola, P. & Jones, M. CVPR01, IJCV04

Bayes optimal classifier
• Image is defined by n attributes: x1, x2, …, xn
• Decide "Object" when
\[ \frac{P(\mathrm{Object} \mid x_1,\dots,x_n)}{P(\overline{\mathrm{Object}} \mid x_1,\dots,x_n)} > 1 \]
\[ \iff \frac{P(x_1,\dots,x_n \mid \mathrm{Object})\, P(\mathrm{Object})}{P(x_1,\dots,x_n \mid \overline{\mathrm{Object}})\, P(\overline{\mathrm{Object}})} > 1 \]
\[ \iff \frac{P(x_1,\dots,x_n \mid \mathrm{Object})}{P(x_1,\dots,x_n \mid \overline{\mathrm{Object}})} > \frac{P(\overline{\mathrm{Object}})}{P(\mathrm{Object})} \]
• There are too many parameters to learn

Naïve Bayes assumption
• Assume: x1, x2, …, xn are conditionally independent given the class.
\[ P(x_1,\dots,x_n \mid \mathrm{Object}) = \prod_{i=1}^{n} P(x_i \mid \mathrm{Object}) \]
\[ P(x_1,\dots,x_n \mid \overline{\mathrm{Object}}) = \prod_{i=1}^{n} P(x_i \mid \overline{\mathrm{Object}}) \]
• Easier to learn
• Problem: this might be a bad assumption
• Idea:
  – Carefully divide x1, x2, …, xn into groups: P1, P2, …, Pk
  – Assume P1, P2, …, Pk are independent
\[ P(x_1,\dots,x_n \mid \mathrm{Object}) = \prod_{i=1}^{k} P(P_i \mid \mathrm{Object}) \]
\[ P(x_1,\dots,x_n \mid \overline{\mathrm{Object}}) = \prod_{i=1}^{k} P(P_i \mid \overline{\mathrm{Object}}) \]

Independent groups/parts
• How to divide x1, x2, …, xn into independent groups?
• Image pixels are highly correlated.
• Represent the image by wavelets instead.

Wavelet transform
• 10 filter responses for each original pixel.
• Subbands labeled in the figure: HL, LH, HH
• The wavelet transform is fully invertible.
• It partially de-correlates natural imagery
  – More independence, easier to design parts

Designing parts
• Assumption:
  – Each wavelet coefficient depends on only a few others.
  – Group those coefficients into parts.
• Parts:
  – 17 types, manually defined.
  – Each part contains 8 coefficients.

Categories of parts
• Intra-subband
• Inter-frequency
• Inter-orientation
• Inter-frequency/Inter-orientation
(Each category is labeled "Local operator" in the figure.)
Slide credit: Nicholas Chan

Final form of detector
• How to compute these statistics? Count!

Multiple poses?
• Other tricks:
  – Not covered in this talk.
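The "Count!" step and the parts-based likelihood-ratio test above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Schneiderman and Kanade's implementation: each window is assumed to be already reduced to one quantized integer value per part, the tables are smoothed with an add-alpha term (an assumption, not from the slides), and all names (`train_part_tables`, `log_likelihood_ratio`, `is_object`, the toy data) are hypothetical.

```python
import math
from collections import Counter


def train_part_tables(samples, num_parts, num_values, alpha=1.0):
    """Estimate P(part_k = v | class) by counting occurrences.

    samples: list of tuples, one quantized value (0..num_values-1) per part.
    alpha: add-alpha smoothing so unseen values keep nonzero probability
           (an assumption of this sketch).
    """
    counts = [Counter() for _ in range(num_parts)]
    for sample in samples:
        for k, value in enumerate(sample):
            counts[k][value] += 1
    total = len(samples) + alpha * num_values
    return [
        [(counts[k][v] + alpha) / total for v in range(num_values)]
        for k in range(num_parts)
    ]


def log_likelihood_ratio(sample, obj_tables, bg_tables):
    """Sum over parts of log P(part | Object) / P(part | non-Object)."""
    return sum(
        math.log(obj_tables[k][v] / bg_tables[k][v])
        for k, v in enumerate(sample)
    )


def is_object(sample, obj_tables, bg_tables, prior_object=0.5):
    """Apply the test: likelihood ratio > P(non-Object) / P(Object)."""
    threshold = math.log((1.0 - prior_object) / prior_object)
    return log_likelihood_ratio(sample, obj_tables, bg_tables) > threshold


# Toy training data: two parts, each quantized to two values.
faces = [(0, 1), (0, 1), (0, 0), (0, 1)]
background = [(1, 0), (1, 1), (1, 0), (0, 0)]

obj_tables = train_part_tables(faces, num_parts=2, num_values=2)
bg_tables = train_part_tables(background, num_parts=2, num_values=2)

print(is_object((0, 1), obj_tables, bg_tables))
print(is_object((1, 0), obj_tables, bg_tables))
```

Working in log space turns the product over parts into a sum, which avoids numerical underflow when there are many parts.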
Reported results for faces
• Kodak dataset:
  – Test set: 17 images, 46 faces, 36 profile views.

A bigger dataset
• From multiple sources: 208 images, 441 faces, about 347 profiles.

Robust Real-time Face Detection
by Viola, P. & Jones, M.

Cascade of classifiers
• Most places do not have faces!

Simple features
• Box filters
• Approximation of Haar wavelets

Integral image
• Feature evaluation can be done with a few lookups

Learning the cascade
• AdaBoost
  – Weak classifiers are box filters

Learning cascade stages
• Using AdaBoost to train each stage:
  – Adjust the threshold to minimize false negatives.
  – Add features until the target detection and false positive rates are met (determined by CV)

Learned cascade
• First classifier:
  – 2 features
  – 100% detection
  – 40% false detection
• The whole cascade:
  – 38 stages
  – 6000 features in total
  – On a dataset with 507 faces and 75 million sub-windows, faces are detected using an average of 10 feature evaluations per sub-window.

Reported ROC curve

Comparison results

The end
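The integral-image lookup and the early-rejection idea behind the cascade in the Viola and Jones slides can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names, the toy 4x4 image, and the two stand-in stages are all hypothetical, and real stages would be AdaBoost-trained combinations of thresholded box features.

```python
def integral_image(img):
    """Return ii where ii[y][x] is the sum of img over rows < y and columns < x.

    The extra zero row and column remove the need for boundary checks.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, height, width):
    """Sum of any rectangle using four lookups into the integral image."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Left half minus right half of a rectangle (width assumed even)."""
    half = width // 2
    left_sum = box_sum(ii, top, left, height, half)
    right_sum = box_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


def run_cascade(ii, stages):
    """Evaluate stages in order and stop at the first rejection.

    Returns (accepted, number_of_stages_evaluated).
    """
    for evaluated, stage in enumerate(stages, start=1):
        if not stage(ii):
            return False, evaluated
    return True, len(stages)


# Toy 4x4 "window" and two hypothetical stand-in stages.
img = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
ii = integral_image(img)

stages = [
    lambda ii: two_rect_feature(ii, 0, 0, 4, 4) < 0,  # cheap first test
    lambda ii: box_sum(ii, 0, 0, 4, 4) > 200,         # stricter second test
]
print(box_sum(ii, 1, 1, 2, 2))
print(run_cascade(ii, stages))
```

Because most sub-windows are rejected by the early, cheap stages, the average number of feature evaluations per sub-window stays small even when the full cascade holds thousands of features.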