Bag-of-Words based Image Classification
Joost van de Weijer

What is in the image?
• Is there a suitcase?
• Is there a person?
• Is there a car?
Image classification answers the question of what is in the image.

Inspiration
The PASCAL VOC challenge: a competition on image classification. Participants have to classify 20 classes in over 10,000 images.
http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2010/results/index.html

The Event Data Set
• 7 event classes: basketball, polo, rowing, castells, marathon, sailing, skiing.
• Each class has 50 images, divided into 30 training and 20 test images.

Project I
title: Bag-of-Words based Image Classification.
goal: build an image classification system which can successfully classify sport images.
competition: do so better than the other groups.

Why is this difficult?
• Variations in viewpoint and zoom.
• Variations in pose.
• Inter-class variation.
• Lighting changes.
• Background variation.
• Similar backgrounds, different classes. Maybe the background could help?

From images to frequency histogram
• Compute visual words:
  • detect local regions in a set of images.
  • describe every local region by a descriptor:
    • texture
    • color
  • cluster all descriptors into visual words.
• Given a new image:
  • detect local regions in the image.
  • assign every region to its nearest visual word.
  • compute the visual word-image histogram.
[Figure: each region is assigned to a visual word.]

Bag of Visual Words representation
• Feature detection.
• Normalize patches.
• No spatial relations.
[Figure: visual word distributions pi(w|Miro) and pi(w|Dali).]

The Framework
1. Feature detection
2. Extraction: shape, texture, color
3. Shape Voc: shape words
4. BOW: image representation
5. SVM / distance measures: image classification, image retrieval

Existing Implementation
1. Feature detection: random
2. Extraction: RGB
3. Shape Voc: random
4. BOW: nearest neighbor
5. SVM / distance measures: linear SVM
Result: 50 % classification score.

Existing Implementation: properties of BOW implementation
• You can improve any of the subroutines and analyze the changes based on the classification results.
• Several team members can work on feature detection while others work on feature description.
• The final classification results allow us to compare the results between the groups.

Project I: teaching objectives
You will learn:
• to represent images in a way that is robust to changes of camera, object orientation, and illuminant color.
• what photometric invariance theory is and how to apply it to a real-world problem.
• to understand and use the SIFT descriptor.
• how to discretize image features (colors, shapes, and textures).
• what the strong and weak points of BOW representations for images are.
• how to evaluate retrieval and classification results.

Practical information: Group Size
The project has to be done in groups of 3 students. Each group should decide on the following roles:
• person responsible for the competition.
• person responsible for the presentation.
• person responsible for the report.
If it is hard to work as a group you can partition the tasks:
• feature detection
• feature description
• vocabulary construction
• learning/evaluation
All group members should understand all steps in the final program!

Practical information
All practical information can be found in the student guide (http://cat.cvc.uab.es/~joost/master.html).

Practical information: Important Dates
22 Jan. - 19 Feb. : The project will last 1 month.
22 Jan. : Start project.
29 Jan. : Extra assignment will be handed out. Submission of first results in AP.
5 Feb. : Discussion meeting + submission of second results in AP.
11 Feb. : Publication of final test set.
12 Feb. : Discussion meeting with groups separately.
15 Feb. : Final submission of classification results in AP for all classes.
19 Feb. : Presentation of the project.
22 Feb. : Final submission date for report.

Supervision
There will be project meetings on Tuesday afternoons to discuss progress. For any questions during the three weeks of the project, email joost@cvc.uab.es or come to office O/119 in the CVC. Use “PROJECT I” as the subject of your emails, which makes them easier to manage.

Practical information: Grades
The final grade will be based on:
• participation (15%)
• presentation (25%)
• report (50%)
• competition (10%)
Bugs: There will certainly be several bugs in the code. If you find one, mail me, and I will notify the other groups. Thanks!

Practical information: Competition
Dates:
29 Jan. : Submission of first results in AP (before 15:00).
5 Feb. : Submission of second results in AP (before 15:00).
19/22 Feb. : Your report/final presentation is based on the labeled test set!
Data: labeled train set, labeled test set.

Practical information: Competition
Dates:
11 Feb. : Publication of final test set.
15 Feb. : Final submission of results in AP for all classes.
Data: labeled train set, no labels for test set!

Practical information: Final Report
The final report has to be submitted on the 22nd of February. The report should contain the following chapters:
• Introduction (max 1 page)
• Feature Detection (max 2 pages)
• Feature Description (max 3 pages)
• Visual Vocabulary and BOW representation (max 2 pages)
• Classification (max 2 pages)
• Object Detection (optional: max 2 pages)
• Results (max 2 pages)
• Conclusions (max 1 page)

What to do next?
• Form groups and assign:
  • a person responsible for the competition (send an email to me today or tomorrow).
• Install the programs and play with the code (http://cat.cvc.uab.es/~joost/master.html).
• This week you should already start working on a feature detector.

Good luck!
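
Appendix: a sketch of the BOW steps
The steps listed under "from images to frequency histogram" can be tried out on toy data before installing the course code. The block below is a minimal sketch and not the course implementation: all function names are hypothetical, the images are synthetic random RGB arrays rather than the Event Data Set, regions are sampled at random and described by their mean RGB value (loosely following the random detector and RGB descriptor of the existing implementation), and k-means is assumed as the clustering method, which the slides do not specify.

```python
import random

def random_patches(image, n_patches, size, rng):
    """Step 1 (detection): pick square patches at random positions."""
    h, w = len(image), len(image[0])
    patches = []
    for _ in range(n_patches):
        y, x = rng.randrange(h - size + 1), rng.randrange(w - size + 1)
        patches.append([row[x:x + size] for row in image[y:y + size]])
    return patches

def rgb_descriptor(patch):
    """Step 2 (description): mean (R, G, B) over the pixels of a patch."""
    pixels = [pixel for row in patch for pixel in row]
    return tuple(sum(p[c] for p in pixels) / len(pixels) for c in range(3))

def describe(image, n_patches, size, rng):
    """Detect random patches and describe each one."""
    return [rgb_descriptor(p) for p in random_patches(image, n_patches, size, rng)]

def nearest_word(descriptor, vocabulary):
    """Index of the visual word closest to the descriptor (squared Euclidean)."""
    dists = [sum((a - b) ** 2 for a, b in zip(descriptor, word)) for word in vocabulary]
    return dists.index(min(dists))

def kmeans_vocabulary(descriptors, n_words, rng, n_iter=10):
    """Step 3 (vocabulary): cluster descriptors with plain k-means (an assumption)."""
    vocabulary = rng.sample(descriptors, n_words)
    for _ in range(n_iter):
        clusters = [[] for _ in range(n_words)]
        for d in descriptors:
            clusters[nearest_word(d, vocabulary)].append(d)
        for k, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centre
                vocabulary[k] = tuple(
                    sum(m[c] for m in members) / len(members) for c in range(3)
                )
    return vocabulary

def bow_histogram(descriptors, vocabulary):
    """Step 4 (BOW): normalized frequency histogram over the visual words."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        counts[nearest_word(d, vocabulary)] += 1
    return [c / len(descriptors) for c in counts]

def synthetic_image(h, w, rng):
    """Stand-in for a real photo: h x w pixels with random RGB values in [0, 1)."""
    return [[(rng.random(), rng.random(), rng.random()) for _ in range(w)]
            for _ in range(h)]

rng = random.Random(0)

# Vocabulary construction from a (synthetic) set of training images.
train_images = [synthetic_image(32, 32, rng) for _ in range(4)]
train_descriptors = [d for im in train_images for d in describe(im, 50, 8, rng)]
vocabulary = kmeans_vocabulary(train_descriptors, n_words=5, rng=rng)

# Representation of a new image as a visual word histogram.
new_image = synthetic_image(32, 32, rng)
histogram = bow_histogram(describe(new_image, 50, 8, rng), vocabulary)
print(histogram)
```

The histogram has one bin per visual word and sums to 1. It is the vector that would be passed on to step 5 (SVM / distance measures). Each function maps to one numbered step of the framework, so a group can swap in a better detector or descriptor without touching the rest.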