Indoor Scene Classification
Anuja Ranjan
Manav Garg
Prof: Dr. Amitabha Mukherjee
Problem
• Classifying indoor scenes is challenging because of the large variation across examples within each class and the similarity between different classes.
• Besides spatial properties, indoor scene classification requires recognising the objects a scene contains in order to achieve good accuracy.
Related Work
In recent work by Quattoni and Torralba, regions of interest are extracted from images and compared.
They do not use objects in their approach, but they do mention that some indoor scenes are better classified by the objects they contain.
Image source: Ariadna Quattoni and Antonio Torralba, "Recognizing Indoor Scenes," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
Towards classification via objects
• Training Phase: extracting features and building object classifiers (a minimal sketch follows this list).
Extract HOG and Gabor features for images of each of the 4 object classes.
Images were obtained from the Caltech dataset and Google Images.
Use the AdaBoost algorithm to combine these weak classifiers into a strong classifier.
• Testing Phase: detecting objects in the test image and predicting the scene probability.
Use a sliding-window method with 5 window shapes to detect objects in the scene.
The method is further improved by using 3D features of the image.
From the detections, their confidence scores and the prior probabilities of objects in each scene, we classify the image into a scene (a sketch of this step follows the results below).
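Below is a minimal sketch of the training phase, assuming grayscale object crops are already loaded into hypothetical `images` and `labels` arrays; it uses scikit-image for HOG features and scikit-learn's AdaBoost over decision stumps as the weak learners (the Gabor features are omitted here for brevity).

```python
import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

def hog_descriptor(gray, size=(64, 64)):
    """Resize a grayscale crop and extract its HOG feature vector."""
    gray = resize(gray, size, anti_aliasing=True)
    return hog(gray, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm='L2-Hys')

# `images` is a list of grayscale object crops, `labels` their class ids (0..3);
# both are assumed to be prepared beforehand.
X = np.array([hog_descriptor(img) for img in images])
y = np.array(labels)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# AdaBoost combines many weak classifiers (decision stumps by default)
# into one strong object classifier, as described above.
clf = AdaBoostClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```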
Results
Scene scores for three test images (from the slide figure):
Image 1: Office 68.26, Hall 14.31, Conference 17.43 (visual prob. 0.786, 3D prob. 0.915)
Image 2: Office 37.33, Hall 30.33, Conference 32.34
Image 3: Office 18.53, Hall 22.84, Conference 58.61
Ref: P. Espinace, T. Kollar, A. Soto, and N. Roy, "Indoor scene recognition through object detection," IEEE International Conference on Robotics and Automation (ICRA), 2010.
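As a rough sketch of the scene-prediction step mentioned above, the snippet below combines detector confidences with object-given-scene priors in a naive-Bayes-style product; the combination rule, the object names and all numbers are our own illustrative assumptions, not the project's actual values.

```python
import numpy as np

SCENES = ["office", "hall", "conference"]

# Illustrative priors P(object present | scene); not the project's real values.
OBJECT_GIVEN_SCENE = {
    "monitor": {"office": 0.8, "hall": 0.1, "conference": 0.4},
    "chair":   {"office": 0.7, "hall": 0.3, "conference": 0.9},
    "screen":  {"office": 0.2, "hall": 0.1, "conference": 0.7},
}

def scene_scores(detections):
    """detections: dict mapping object name -> detector confidence in [0, 1].
    Returns a normalised score per scene (naive-Bayes-style combination)."""
    scores = np.ones(len(SCENES))
    for obj, conf in detections.items():
        priors = OBJECT_GIVEN_SCENE[obj]
        for i, scene in enumerate(SCENES):
            # Mix the "object present" and "object absent" explanations,
            # weighted by the detector's confidence in the detection.
            scores[i] *= conf * priors[scene] + (1.0 - conf) * (1.0 - priors[scene])
    return dict(zip(SCENES, scores / scores.sum()))

print(scene_scores({"monitor": 0.79, "chair": 0.6}))
```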
Improving the work
• Use GIST features and an SVM to develop the object classifiers (a simplified sketch follows this slide).
Why GIST? What's new?
• GIST features give a low-dimensional representation of a scene that requires neither segmentation nor object recognition, relying on perceptual features instead.
• Results obtained with GIST classification show a considerable improvement in context-based recognition and classification.
• GIST features have not yet been used in this way for this classification problem.
Source: Oliva, A., & Torralba, A. (2001), "Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope," International Journal of Computer Vision, 42, 145-175.
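A simplified sketch of the proposed pipeline follows: a GIST-like descriptor built from a bank of Gabor filters whose responses are averaged over a coarse spatial grid, fed to an SVM. This is our own approximation of the Oliva & Torralba descriptor rather than their reference implementation, and `train_images`/`train_labels` are hypothetical names for the prepared training data.

```python
import numpy as np
from scipy.ndimage import convolve
from skimage.color import rgb2gray
from skimage.filters import gabor_kernel
from skimage.transform import resize
from sklearn.svm import SVC

# Bank of Gabor kernels at a few scales and orientations (GIST-style).
KERNELS = [gabor_kernel(frequency=f, theta=t)
           for f in (0.1, 0.2, 0.3)
           for t in np.arange(0, np.pi, np.pi / 4)]

def gist_like(image, grid=4, size=128):
    """Average each Gabor response magnitude over a grid x grid layout."""
    gray = rgb2gray(image) if image.ndim == 3 else image
    gray = resize(gray, (size, size), anti_aliasing=True)
    step = size // grid
    feats = []
    for k in KERNELS:
        resp = np.abs(convolve(gray, np.real(k), mode='wrap'))
        for i in range(grid):
            for j in range(grid):
                feats.append(resp[i*step:(i+1)*step, j*step:(j+1)*step].mean())
    return np.array(feats)

# `train_images` / `train_labels` are assumed to hold object crops and class ids.
X = np.array([gist_like(img) for img in train_images])
svm = SVC(kernel='rbf', probability=True).fit(X, train_labels)
```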
Comparison of Object Classifiers
• Average overall accuracy achieved by the HOG classifier ~ 62%
• Our classifier ~ 75%
Example detection confidences on a test image (from the slide figure):
HOG features: Monitor 0.786, Monitor 0.052, Screen 0.621
Gist features: Monitor 0.65, Monitor 0.15, Screen 0.028
Appendix
HOG features
• They record the occurrence of gradients in localized areas of the image (a short example follows this list).
• They exploit the fact that an object can be described by the distribution of its intensity gradients.
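For illustration only (not from the slides), a short scikit-image snippet that computes a HOG descriptor and its gradient-histogram visualisation on a sample image:

```python
import matplotlib.pyplot as plt
from skimage import color, data
from skimage.feature import hog

gray = color.rgb2gray(data.astronaut())            # any grayscale image works here
features, hog_image = hog(gray, orientations=9,
                          pixels_per_cell=(8, 8), cells_per_block=(2, 2),
                          visualize=True)           # also return the gradient map

print("descriptor length:", features.shape[0])
plt.imshow(hog_image, cmap='gray')                  # localized gradient histograms
plt.show()
```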
Gabor features
• Gabor filters are used for edge detection, texture representation, etc.
• Filters with different frequencies and orientations are used for feature extraction (a minimal example follows this list).
• A 2D Gabor filter is a Gaussian kernel function modulated by a sinusoidal plane wave.
• They are also used for sparse object representation.
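A minimal example of the filter-bank idea, using scikit-image's `gabor` filter; the frequencies, orientations and response statistics chosen here are our own illustrative values.

```python
import numpy as np
from skimage import data, img_as_float
from skimage.filters import gabor

image = img_as_float(data.camera())               # any grayscale image

features = []
for frequency in (0.1, 0.2, 0.4):                 # a few scales
    for theta in np.arange(0, np.pi, np.pi / 4):  # a few orientations
        real, imag = gabor(image, frequency=frequency, theta=theta)
        magnitude = np.hypot(real, imag)
        # Mean and variance of the response magnitude serve as texture features.
        features.extend([magnitude.mean(), magnitude.var()])

print(len(features), "Gabor features")
```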
SR Ranger Method
• The dataset consists of pairs of visual and 3D images. The depth in the 3D image is estimated per pixel by measuring the time difference between the signal sent and the signal received.
• 3D models are used to compute the probability that the geometric properties of a given object match the information present in a given window (size, height), which improves object classification.
• This takes the spatial properties into account.
• To declare an object found, the probabilities from both the visual and the 3D image must be above a threshold (a toy sketch follows this list).
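A toy sketch of that dual check, where the geometric likelihood from the depth data is modelled as a Gaussian around the object's expected size and height; the Gaussian form, the parameter names and all thresholds are our own assumptions.

```python
import numpy as np

def geometric_prob(size_m, height_m, expected_size, expected_height, sigma=0.3):
    """Gaussian-shaped score for how well the window's metric size and
    height (taken from the depth image) match the object's expected geometry."""
    return float(np.exp(-((size_m - expected_size) ** 2 +
                          (height_m - expected_height) ** 2) / (2 * sigma ** 2)))

def accept_detection(visual_prob, size_m, height_m,
                     expected_size=0.5, expected_height=1.0,
                     visual_thr=0.6, geo_thr=0.6):
    """Keep a detection only when both modalities clear their threshold."""
    geo = geometric_prob(size_m, height_m, expected_size, expected_height)
    return visual_prob > visual_thr and geo > geo_thr

print(accept_detection(visual_prob=0.79, size_m=0.45, height_m=1.1))  # -> True
```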