Discovering Useful Parts for Pose Estimation in Sparsely Annotated Datasets

Mikhail Breslav (1), Tyson L. Hedrick (2), Stan Sclaroff (1), and Margrit Betke (1)
(1) Department of Computer Science, Boston University
(2) Department of Biology, University of North Carolina
Problem: 2D Pose Estimation (Landmark Localization)
[Figure: example Input image and Desired Output]
Proposed Approach: Discover useful parts from unannotated regions of training images and use
them to improve part appearance likelihood terms.
Experiments: Quantitative Evaluation of MPS (Baseline B), our Proposed
Approach (P), and work from Biology (O) [2] on 211 test images.
Motivation: Biologists ask: does moth flight change with varying wind conditions?
Data: 421 images of hawkmoths with key landmarks (H, AT, LWT, RWT) annotated.
High Resolution:
• 400 FPS
• 600 x 800 Pixels
Traditional Approach: Mixture of Pictorial Structures (MPS)
Intuition: Pictorial Structures (PS) [1] model the desired 2D spatial relationship between parts
with a tree model. We use a mixture of PS in which each tree models a different subspace of
poses.
[Figure: tree models for Component 1, Component 2, ..., Component K over the landmarks H, AT, LWT, and RWT, with root and leaf nodes marked]
1. Part Discovery: Find parts in unannotated regions of training images
• Use Spatial BOW over SIFT
• Greedy Clustering
• Compute Predictiveness to remove outliers
• Learn LDA classifier on WHOG
2. Compute Predictiveness: Determine how predictive a cluster of patches (part) is of a
particular landmark by measuring how well the patches in the cluster agree on the location of
the landmark relative to the patch center.
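A minimal sketch of the predictiveness idea in step 2, assuming agreement is measured as the spread of patch-to-landmark offsets around their median. The spread measure, the threshold `max_spread`, and all function names are illustrative assumptions, not the measure used in this work.

```python
import math
from statistics import median

def offsets(patch_centers, landmarks):
    """Landmark location relative to each patch center, one (dx, dy) per patch."""
    return [(lx - cx, ly - cy)
            for (cx, cy), (lx, ly) in zip(patch_centers, landmarks)]

def offset_spread(offs):
    """Mean Euclidean distance of the offsets from their coordinate-wise median.

    A small spread means the patches in the cluster agree on where the
    landmark lies relative to the patch center.
    """
    mx = median(dx for dx, _ in offs)
    my = median(dy for _, dy in offs)
    return sum(math.hypot(dx - mx, dy - my) for dx, dy in offs) / len(offs)

def is_predictive(patch_centers, landmarks, max_spread=5.0):
    """Hypothetical rule: keep a cluster whose offset spread is small (pixels)."""
    return offset_spread(offsets(patch_centers, landmarks)) <= max_spread

# Three patches of one cluster and the annotated landmark in each source image.
centers = [(10, 10), (50, 40), (30, 70)]
agreeing = [(20, 15), (60, 45), (40, 75)]    # same offset from every patch
scattered = [(20, 15), (10, 90), (80, 20)]   # offsets disagree
print(is_predictive(centers, agreeing), is_predictive(centers, scattered))
```

Under this assumed rule, a cluster whose patches place the landmark at a consistent offset is kept as a part, and a cluster with scattered offsets is discarded.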
3. Predict Landmark Locations: Detect
presence of discovered parts in a test image
and allow detected parts to vote for landmark
locations.
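The voting in step 3 could be sketched as follows, assuming each discovered part stores one offset from its patch center to the landmark and each detection votes with its detector score. The data layout and the names `vote_map` and `best_location` are hypothetical.

```python
def vote_map(detections, part_offsets, width, height):
    """Accumulate landmark votes on a width x height grid.

    detections:   list of (part_id, x, y, score) for parts found in a test image
    part_offsets: {part_id: (dx, dy)} offset from patch center to the landmark
    """
    votes = [[0.0] * width for _ in range(height)]
    for part_id, x, y, score in detections:
        dx, dy = part_offsets[part_id]
        vx, vy = round(x + dx), round(y + dy)
        # Votes that fall outside the image are ignored.
        if 0 <= vx < width and 0 <= vy < height:
            votes[vy][vx] += score
    return votes

def best_location(votes):
    """Return (x, y) of the grid cell with the largest accumulated vote."""
    best = max(
        ((value, x, y) for y, row in enumerate(votes) for x, value in enumerate(row)),
        key=lambda item: item[0],
    )
    return best[1], best[2]

part_offsets = {0: (3, 1), 1: (-3, 3)}
detections = [(0, 2, 3, 0.9), (1, 8, 1, 0.8), (0, 7, 7, 0.4)]
votes = vote_map(detections, part_offsets, width=10, height=10)
print(best_location(votes))
```

Here the first two detections vote for the same cell, and the third vote falls outside the grid and is dropped.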
4. Integrate with MPS model: Use votes for landmark locations as an appearance likelihood term
that can be combined with existing appearance likelihood terms in part-based models such as MPS.
MPS Posterior: Posterior ∝ Appearance Likelihood × Spatial Prior
Number of Components: Obtained by clustering 2D poses seen in training.
Training: Spatial Prior and Appearance Likelihood terms are learned from 210 annotated training
images.
Key Insight: Traditional part-based models rely heavily on annotations, and
can leave much of the available image evidence unused.
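A rough sketch of step 4 for a single tree component, assuming the vote-based term is combined with the existing appearance term by multiplication (addition in log space) and that the spatial prior is an isotropic Gaussian on parent-to-child offsets. The combination rule, `vote_weight`, `eps`, and the data layout are assumptions made for illustration.

```python
import math

def log_gaussian(offset, mean, sigma):
    """Unnormalized log of an isotropic 2D Gaussian over a parent-to-child offset."""
    dx, dy = offset[0] - mean[0], offset[1] - mean[1]
    return -(dx * dx + dy * dy) / (2.0 * sigma * sigma)

def log_posterior(pose, edges, appearance, votes, vote_weight=1.0, eps=1e-6):
    """Unnormalized log posterior of one pose under one tree component.

    pose:       {landmark: (x, y)}
    edges:      [(parent, child, mean_offset, sigma)] defining the spatial prior
    appearance: {landmark: {(x, y): likelihood}} from annotated-part detectors
    votes:      {landmark: {(x, y): likelihood}} from discovered-part votes
    """
    score = 0.0
    for name, loc in pose.items():
        # Existing appearance likelihood term.
        score += math.log(appearance[name].get(loc, 0.0) + eps)
        # Additional appearance likelihood term from discovered-part votes.
        score += vote_weight * math.log(votes[name].get(loc, 0.0) + eps)
    for parent, child, mean, sigma in edges:
        px, py = pose[parent]
        cx, cy = pose[child]
        score += log_gaussian((cx - px, cy - py), mean, sigma)
    return score

edges = [("H", "AT", (0, 10), 2.0)]
appearance = {"H": {(5, 5): 0.9}, "AT": {(5, 15): 0.5, (9, 9): 0.5}}
votes = {"H": {(5, 5): 0.8}, "AT": {(5, 15): 0.7, (9, 9): 0.1}}
pose_a = {"H": (5, 5), "AT": (5, 15)}
pose_b = {"H": (5, 5), "AT": (9, 9)}
print(log_posterior(pose_a, edges, appearance, votes) >
      log_posterior(pose_b, edges, appearance, votes))
```

In this toy case the existing appearance term cannot separate the two AT candidates; the vote term and the spatial prior both favor `pose_a`. For a mixture, the same score would be evaluated per component and the best-scoring component kept.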
Ongoing Work: Multi-View extension for 3D Pose Estimation
Contact: breslav@bu.edu www.breslav.org Data: www.cs.bu.edu/~betke/research/HRMF/
Acknowledgments: Air Force Office of Scientific Research, the National Science Foundation, and the Office of Naval
Research.
Key References:
[1] P. F. Felzenszwalb and D. P. Huttenlocher. Pictorial structures for object recognition. IJCV, 2005.
[2] V. M. Ortega-Jimenez, R. Mittal, and T. L. Hedrick. Hawkmoth flight performance in tornado-like whirlwind vortices. Bioinspir. Biomim., 2014.
[3] M. Juneja, A. Vedaldi, C. Jawahar, and A. Zisserman. Blocks that shout: Distinctive parts for scene classification. In CVPR, 2013.
[Figure, Ongoing Work: views from Cam 1 and Cam 2 and the corresponding 3D pose]