Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection
Joseph J. Lim (MIT), C. Lawrence Zitnick (MSR), Piotr Dollár (MSR)

Overview

Goal: learn and detect a local, contour-based representation for mid-level features.

Sketch Tokens:
• Local edge structures (e.g. straight lines, t-junctions, y-junctions)
• Discovered from human-generated image sketches

We demonstrate our approach on both top-down and bottom-up tasks: contour detection (BSDS 500), object detection (PASCAL2007), and pedestrian detection (INRIA).
• State-of-the-art results on contour detection, while being 200x faster
• Large improvements on object and pedestrian detection

Method

We are given a set of images I and the corresponding set of binary contour images S.

Defining Sketch Tokens

Sketch Tokens are clusters of patches extracted from the binary contour images S.
- Each patch has a fixed size of 35x35 pixels, and its center pixel must lie on a labeled contour.
- 150 clusters are extracted using K-means on DAISY descriptors computed on the binary patches.

Detecting Sketch Tokens

Given a set of Sketch Token classes, our goal is to detect them in color images. Each color patch is assigned a ground-truth class: either one of the Sketch Token classes or the background class. We use a random forest classifier with various features (e.g. CIE-LUV intensity, orientation, and self-similarity).

Object Detection on PASCAL2007

We used Sketch Token responses on images (151 dimensions: 150 Sketch Tokens + 1 background) as additional features for the deformable parts model detector. On average, AP improved by 3.8 points.

INRIA Pedestrian Detection

We added Sketch Token responses to the standard features used in Dollár et al.'s implementation.
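
The two steps under "Defining Sketch Tokens" (patches centered on contour pixels, then K-means) can be illustrated with a minimal pure-Python sketch. It is not the authors' MATLAB code. Several things here are assumptions made for illustration: the function names are hypothetical, patches that do not fit inside the image are skipped, raw flattened patch pixels stand in for the DAISY descriptors, and K-means is initialized from the first k points so the result is deterministic.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary contour image: 1 = labeled contour pixel, 0 = otherwise


def extract_contour_patches(contours: Image, size: int = 35) -> List[Tuple[int, int, Image]]:
    """Return (row, col, patch) for each size x size patch centered on a contour pixel.

    Assumption: `size` is odd, and patches that would cross the image border are skipped.
    """
    half = size // 2
    height, width = len(contours), len(contours[0])
    patches = []
    for r in range(half, height - half):
        for c in range(half, width - half):
            if contours[r][c] == 1:
                rows = contours[r - half:r + half + 1]
                patches.append((r, c, [row[c - half:c + half + 1] for row in rows]))
    return patches


def flatten(patch: Image) -> List[float]:
    """Stand-in descriptor (assumption): raw pixels instead of DAISY."""
    return [float(v) for row in patch for v in row]


def sq_dist(a: List[float], b: List[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[List[float]], k: int, iters: int = 10):
    """Plain Lloyd's K-means; centers start at the first k points (assumption)."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Update step: move each non-empty center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


# Toy usage: a 7x7 contour image with one horizontal line, 3x3 patches.
toy = [[0] * 7 for _ in range(7)]
toy[3] = [1] * 7
toy_patches = extract_contour_patches(toy, size=3)
descriptors = [flatten(p) for _, _, p in toy_patches]
```

With the poster's settings this would be called with `size=35` and `k=150`; each resulting cluster would then play the role of one Sketch Token class.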

Contour Detection (BSDS 500)

Accuracy and speed compared with other methods. Sketch Tokens are 200x faster!

Method          ODS    OIS    AP     Speed
Human           0.80   0.80   -
Canny           0.60   0.64   0.58   1/15s
gPb             0.73   0.76   0.73   240s
SCG             0.74   0.76   0.77   280s
Sketch tokens   0.73   0.75   0.78   1s

PASCAL2007 object detection, AP per class:

Method   plane  bike   bird   boat   bottle  bus    car    cat    chair  cow
HOG      19.7   43.9   2.2    4.8    13.4    36.6   40.2   5.4    10.9   15.7
ST       17.8   41.1   4.8    5.7    11.1    31.9   33.8   5.1    10.8   16.1
HOG+ST   21.9   48.5   6.3    6.4    14.6    41.5   43.3   6.1    15.7   19.2

Method   table  dog    horse  moto   person  plant  sheep  sofa   train  tv
HOG      7.5    2.1    41.9   30.9   23.9    3.4    9.3    14.8   26.9   32.4
ST       7.4    3.1    32.9   27.0   20.9    4.6    8.6    10.4   18.9   26.3
HOG+ST   14.2   3.8    46.1   34.5   30.9    8.1    15.3   18.9   30.3   36.6

INRIA pedestrian detection:

Method       # channels   miss rate
LUV+M+O      10           17.2%
ST           151          19.5%
ST+LUV+M+O   161          14.7%

Conclusion

MATLAB code is available on the website.
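
The reported average gain of 3.8 AP on PASCAL2007 can be checked directly from the per-class HOG and HOG+ST numbers. The short script below is only an arithmetic check; the variable names are ours, and the class order (plane through tv) is assumed to follow the table columns.

```python
# Per-class AP on PASCAL2007, in column order: plane, bike, bird, boat, bottle,
# bus, car, cat, chair, cow, table, dog, horse, moto, person, plant, sheep,
# sofa, train, tv (order assumed from the table layout).
hog = [19.7, 43.9, 2.2, 4.8, 13.4, 36.6, 40.2, 5.4, 10.9, 15.7,
       7.5, 2.1, 41.9, 30.9, 23.9, 3.4, 9.3, 14.8, 26.9, 32.4]
hog_st = [21.9, 48.5, 6.3, 6.4, 14.6, 41.5, 43.3, 6.1, 15.7, 19.2,
          14.2, 3.8, 46.1, 34.5, 30.9, 8.1, 15.3, 18.9, 30.3, 36.6]


def mean(values):
    return sum(values) / len(values)


# Mean AP of each detector and the average gain from adding Sketch Tokens.
mean_hog = mean(hog)
mean_hog_st = mean(hog_st)
gain = mean_hog_st - mean_hog

# HOG+ST is at least as good as HOG alone on every class.
improved_everywhere = all(b >= a for a, b in zip(hog, hog_st))

print(f"mean AP, HOG:    {mean_hog:.1f}")
print(f"mean AP, HOG+ST: {mean_hog_st:.1f}")
print(f"average gain:    {gain:.1f}")  # prints 3.8
```

The means come out to roughly 19.3 and 23.1, so the 3.8 figure is the difference between HOG+ST and the HOG baseline averaged over all 20 classes.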