Data Driven Attributes for Action Detection
Week 2
Presented by Christina Peterson

Background

- Liu et al. [1] propose a unified framework for action recognition in which manually specified attributes are:
  - Selected discriminatively to account for intra-class variability
  - Integrated with data-driven attributes to make the attribute set more descriptive
- Yu et al. [2] propose a framework for attribute-based queries that uses a large pool of weak attributes composed of automatic classifier scores acquired with no human labor.
  - Query attributes are acquired through a human labeling process
  - Weak attributes are generated automatically by machine
  - Query attributes are mapped to weak attributes
- Malisiewicz et al. [3] propose an object detection method that combines a discriminative object detector with a nearest-neighbor approach.
  - A separate linear SVM classifier is trained for each exemplar in the dataset
  - Each exemplar is represented by a rigid HOG template
  - The result is a large collection of simple, individual Exemplar-SVM detectors rather than a single complex category detector
- Farhadi et al. [4] propose an attribute-based approach to object detection.
  - Semantic and discriminative attributes
  - A feature selection method for learning attributes that generalize across categories
  - Base feature definition
- Tian et al. [5] propose a spatiotemporal deformable part model (SDPM) that stays true to the structure of the original deformable part model (DPM).
  - SDPM has volumetric parts that displace in both time and space
  - A root filter captures the overall information of the action cycle and is obtained by applying an SVM to the HOG3D features of the action cycle

Low Level Features

- STIP (space-time interest point) descriptors
  - Histogram of Oriented Gradients (HOG): 72-element descriptor
  - Histogram of Optical Flow (HOF): 90-element descriptor
- Color
- Texture

Bag of Words

- Concatenate the low level features for each video
- Cluster each feature type separately with k-means: 128 centers for color, 256 for texture, and 1000 for STIP
- Divide the bounding box into 3 x 3 x 3 + 1 = 28 cells and collect the features falling in each cell
- Create a histogram of cluster-center assignments per feature type for each cell in the bounding box: (128 + 256 + 1000) x 28 dimensions
- Normalize by the size of the bounding box (a MATLAB sketch of this step follows the Goals section)

Exemplar SVM

- Train a separate linear SVM classifier for each exemplar in the dataset, using a single positive example and many negative examples
- The result is a large collection of simple, individual Exemplar-SVM detectors rather than a single complex category detector
- Example: the action Diving-side will have multiple linear SVM classifiers, each based on one positive example from this action class
- At test time, all Exemplar-SVM detectors for the respective action class must be run to compute label prediction accuracy (a training sketch also follows the Goals section)

Goals

- Implement the Exemplar-SVM classifiers in MATLAB
- Label propagation
  - Find the relationship between labels and prediction results
  - Conditional probability (see the sketch below)
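The bag-of-words construction above can be prototyped in MATLAB roughly as follows. This is a minimal sketch for a single feature type (the 1000-center STIP codebook); the variable names trainDesc, videoDesc, cellIdx, and bboxVolume are assumptions rather than names from the slides, and kmeans/knnsearch come from the Statistics and Machine Learning Toolbox.

```matlab
% Minimal bag-of-words sketch for one feature type (STIP HOG/HOF, 1000 centers).
% trainDesc (N x D), videoDesc (M x D), cellIdx (M x 1), and bboxVolume are
% assumed inputs, not names taken from the slides.
K = 1000;
[~, centers] = kmeans(trainDesc, K, 'MaxIter', 200, 'Replicates', 3);

% Quantize one video's descriptors against the learned codebook.
nearest = knnsearch(centers, videoDesc);    % index of the nearest center per descriptor

% One K-bin histogram per spatiotemporal cell (3 x 3 x 3 grid + 1 global = 28).
numCells = 28;
cellHist = zeros(numCells, K);
for c = 1:numCells
    inCell = (cellIdx == c);
    cellHist(c, :) = histcounts(nearest(inCell), 0.5:1:(K + 0.5));
end

% Concatenate all cells and normalize by the size of the bounding box.
bowSTIP = cellHist(:)' / bboxVolume;
```

The color (128 centers) and texture (256 centers) codebooks would be handled the same way, and the three per-cell histograms concatenated to obtain the (128 + 256 + 1000) x 28 dimensional video representation.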
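For the Exemplar-SVM classifiers, one way to prototype the training and testing loops in MATLAB is sketched below using fitcsvm with observation weights. Malisiewicz et al. [3] train their exemplar SVMs with separate regularization costs for the positive and the negatives plus hard-negative mining, so this is only an approximation; posFeats, negFeats, and testFeat are assumed variable names.

```matlab
% Sketch: one linear SVM per positive exemplar of an action class (e.g. Diving-side).
% posFeats (P x D) holds one bag-of-words vector per exemplar video,
% negFeats (N x D) holds vectors from other classes; both are assumed inputs.
numExemplars = size(posFeats, 1);
numNeg = size(negFeats, 1);
models = cell(numExemplars, 1);
for e = 1:numExemplars
    X = [posFeats(e, :); negFeats];              % a single positive, many negatives
    y = [1; -ones(numNeg, 1)];
    w = [numNeg; ones(numNeg, 1)];               % up-weight the lone positive
    models{e} = fitcsvm(X, y, 'KernelFunction', 'linear', 'Weights', w);
end

% At test time, run every exemplar detector and keep the best score for the class.
scores = zeros(numExemplars, 1);
for e = 1:numExemplars
    [~, s] = predict(models{e}, testFeat);       % testFeat: 1 x D
    scores(e) = s(2);                            % column 2 = score for class +1
end
classScore = max(scores);
```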
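For the label propagation goal, one simple starting point is to estimate the conditional probability of each true label given a predicted label from validation-set predictions. This is only one interpretation of that goal; predLabels, trueLabels, and numClasses are assumed variables.

```matlab
% Sketch: estimate P(true label | predicted label) from validation-set predictions.
% predLabels and trueLabels are assumed to be vectors of class indices 1..numClasses.
counts = accumarray([predLabels(:), trueLabels(:)], 1, [numClasses, numClasses]);
rowSums = max(sum(counts, 2), 1);          % avoid division by zero for unused rows
condProb = counts ./ rowSums;              % condProb(p, t) ~ P(true = t | predicted = p)
```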
References

[1] J. Liu, B. Kuipers, and S. Savarese. Recognizing Human Actions by Attributes. In CVPR, 2011.
[2] F. Yu, R. Ji, M.-H. Tsai, G. Ye, and S.-F. Chang. Weak Attributes for Large-Scale Image Retrieval. In CVPR, 2012.
[3] T. Malisiewicz, A. Gupta, and A. A. Efros. Ensemble of Exemplar-SVMs for Object Detection and Beyond. In ICCV, 2011.
[4] A. Farhadi, I. Endres, D. Hoiem, and D. Forsyth. Describing Objects by Their Attributes. In CVPR, 2009.
[5] Y. Tian, R. Sukthankar, and M. Shah. Spatiotemporal Deformable Part Models for Action Detection. In CVPR, 2013.
[6] Y. Wang and G. Mori. Hidden Part Models for Human Action Recognition: Probabilistic vs. Max-Margin. In PAMI, 2011.