HUMAN POSE ESTIMATION
Narendran Anil
EENG 512 Computer Vision

Introduction
Human pose estimation is the recovery of the configuration of the human body from an image.
The process recovers the relative positions and orientations of the body parts by analyzing the image.

Introduction (cont.)
[Figure: Human pose estimation result (http://www.robots.ox.ac.uk/~tvg/projects/rf_pose_det/)]

Challenges
The pose of the human body is not easy to predict.
The pose changes continuously during activities such as walking or running.
Unlike a mechanical system, the human body does not move according to any known algorithm.

Approach
The most popular method for human pose estimation is a model-based approach.
The human body is represented as a tree.
The body parts are the vertices.
The joints are the edges.

Human model tree
H: Head
T: Torso
RUA: Right upper arm
RLA: Right lower arm
LUA: Left upper arm
LLA: Left lower arm
RUL: Right upper leg
RLL: Right lower leg
LUL: Left upper leg
LLL: Left lower leg

Model based methods
There are predominantly two ways to perform model-based pose estimation on an image:
1. Top-down method
2. Bottom-up method

Top-down approach
The torso (root) is detected first.
The rest of the body parts are found once the torso is detected.
Example: the pictorial structures algorithm.
This involves an energy minimization problem posed as maximum a posteriori (MAP) estimation.
It detects only the single best match of the human model in an image.

Pictorial Structures Algorithm
The pictorial structures algorithm treats the human body as a graph in which the vertices denote the body parts and the edges denote the connections between the parts.
It then locates all the parts of the human body by minimizing an energy function.
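The graph view of the body described above can be made concrete with a small sketch. The ten part names come from the Human model tree slide and the torso is the root as stated in the top-down approach, but the specific edges below (which part attaches to which) are an assumption for illustration, as are the helper names `BODY_TREE`, `joints`, `top_down_order` and `bottom_up_order`.

```python
from collections import deque

# Assumed kinematic tree (parent -> children). The slides name the parts and
# the torso as root; the exact attachment of parts is an illustrative guess.
BODY_TREE = {
    "T":   ["H", "RUA", "LUA", "RUL", "LUL"],
    "RUA": ["RLA"],
    "LUA": ["LLA"],
    "RUL": ["RLL"],
    "LUL": ["LLL"],
}

def joints(tree):
    """Return the edges (joints) of the tree as (parent, child) pairs."""
    return [(parent, child) for parent, children in tree.items() for child in children]

def top_down_order(tree, root="T"):
    """Breadth-first visiting order: the root first, then parts further away."""
    order, queue = [], deque([root])
    while queue:
        part = queue.popleft()
        order.append(part)
        queue.extend(tree.get(part, []))
    return order

def bottom_up_order(tree, root="T"):
    """Reverse of the top-down order: outermost parts first, the root last."""
    return list(reversed(top_down_order(tree, root)))
```

Under these assumptions the tree has ten vertices and nine joints. A top-down method would visit the parts in `top_down_order(BODY_TREE)`, and a bottom-up method would assemble them in `bottom_up_order(BODY_TREE)`.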
The energy minimization is written, in the notation of Felzenszwalb and Huttenlocher [8], as

\[
L^{*} = \arg\min_{L} \left( \sum_{i=1}^{n} m_i(l_i) + \sum_{(v_i, v_j) \in E} d_{ij}(l_i, l_j) \right)
\]

where
$L^{*}$ is the minimum-energy configuration, i.e. the locations of all $n$ parts,
$m_i(l_i)$ is the cost of locating part $i$ at location $l_i$,
$d_{ij}(l_i, l_j)$ is the cost of locating parts $i$ and $j$ at $l_i$ and $l_j$, based on their relative locations, and
$E$ is the set of edges (joints) of the model.
These cost functions can be specified as Bayesian probability functions, in which case the minimization becomes a MAP estimation problem.

Bottom-up approach
Potential parts are detected first, irrespective of hierarchy, and the pose is built up from the bottom of the tree, moving upwards.
This algorithm has two steps:
1. Part detection
2. Pose estimation of detected parts

Part detection
Part detection is accomplished with a contour detection algorithm.
Each part is characterized by a pair of line segments that form the boundary of the part in question.

Part Detection (cont.)
For contour detection, the Pb (probability of boundary) operator is used.
The Pb operator computes the probability of a boundary at each pixel in the image by analyzing the texture, brightness and color around that pixel.
Part detection is performed with a logistic classifier (NETLAB toolbox).
The pose is calculated by integer quadratic programming (Optimization toolbox).

Results
These are the results obtained by applying the bottom-up method to a set of test images.
Boundary detection (Pb operator): Berkeley Segmentation Engine.
Logistic classifier: Netlab toolbox for Matlab.
Integer quadratic programming: Matlab's optimization toolbox.

References
[1] Lucas, B., and Kanade, T. An iterative image registration technique with an application to stereo vision. In IJCAI81 (1981), pp. 674-679.
[2] Srinivasan, P., and Shi, J. Bottom-up recognition and parsing of the human body. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2007).
[3] Ren, X., Berg, A. C., and Malik, J. Recovering human body configurations using pairwise constraints between parts. In ICCV '05: Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV'05), Volume 1 (Washington, DC, USA, 2005), IEEE Computer Society, pp. 824-831.
[4] Ramanan, D. Learning to parse images of articulated bodies. In Advances in Neural Information Processing Systems 19, B. Schölkopf, J. Platt, and T. Hoffman, Eds. MIT Press, Cambridge, MA, 2007, pp. 1129-1136.
[5] Levin, A., and Weiss, Y. Learning to combine bottom-up and top-down segmentation. In ECCV (4) (2006), pp. 581-594.
[6] Bray, M., Kohli, P., and Torr, P. H. S. PoseCut: Simultaneous segmentation and 3D pose estimation of humans using dynamic graph-cuts. In ECCV (2) (2006), pp. 642-655.
[7] Antani, L., and Chandran, S. Skeleton-based pose estimation of human figures. Department of Computer Science and Engineering, Indian Institute of Technology, Bombay.
[8] Felzenszwalb, P. F., and Huttenlocher, D. P. Pictorial structures for object recognition. Int. J. Comput. Vision 61, 1 (2005), 55-79.
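
Supplementary sketch: energy minimization on a tree
The energy minimization on the Pictorial Structures Algorithm slide sums a per-part cost of placing part i at a location and a pairwise cost for each pair of connected parts. Because the body model is a tree, this minimum can be found exactly by dynamic programming: costs are passed from the leaves up to the root, then the best locations are read back from the root down. The sketch below is a minimal illustration with a discrete set of candidate locations, not the implementation behind the slides. The names `map_pose`, `unary` and `pairwise`, and the two-part toy costs, are hypothetical.

```python
def map_pose(tree, root, unary, pairwise):
    """Minimize the sum of unary and pairwise costs over a tree of parts.

    tree:     dict mapping a parent part to its list of child parts
    unary:    dict part -> list, unary[i][l] = cost of part i at location l
    pairwise: dict (parent, child) -> 2-D list indexed [l_parent][l_child]
    Returns (locations, energy) for the minimum-energy configuration.
    """
    best_child_loc = {}  # (parent, child) -> best child location per parent location

    def upward(part):
        # cost[l] = cheapest energy of the subtree rooted at `part` placed at l
        cost = list(unary[part])
        for child in tree.get(part, []):
            child_cost = upward(child)
            d = pairwise[(part, child)]
            choices = []
            for lp in range(len(cost)):
                totals = [d[lp][lc] + child_cost[lc] for lc in range(len(child_cost))]
                lc_best = min(range(len(totals)), key=totals.__getitem__)
                choices.append(lc_best)
                cost[lp] += totals[lc_best]
            best_child_loc[(part, child)] = choices
        return cost

    root_cost = upward(root)
    root_loc = min(range(len(root_cost)), key=root_cost.__getitem__)
    locations = {root: root_loc}

    def downward(part):
        # Read back each child's best location given its parent's location.
        for child in tree.get(part, []):
            locations[child] = best_child_loc[(part, child)][locations[part]]
            downward(child)

    downward(root)
    return locations, root_cost[root_loc]


# Toy example (made-up costs): two parts, three candidate locations on a line.
tree = {"T": ["H"]}
unary = {"T": [0, 1, 5], "H": [4, 0, 1]}
# Assumed deformation cost: the head is expected one step after the torso.
pairwise = {("T", "H"): [[(lc - lp - 1) ** 2 for lc in range(3)] for lp in range(3)]}

locations, energy = map_pose(tree, "T", unary, pairwise)
print(locations, energy)  # -> {'T': 0, 'H': 1} 0
```

With k candidate locations per part, this runs in time proportional to k squared per joint, instead of enumerating every joint configuration of all parts.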