PhD Proposal
Farhood NEGIN
INRIA Sophia Antipolis, STARS group
2004, route des Lucioles, BP93
06902 Sophia Antipolis Cedex – France
http://www-sop.inria.fr/members/Francois.Bremond/

1. Title

People detection for activity recognition using RGB-Depth sensors

2. Scientific context

The STARS group works on automatic video sequence interpretation. The "SUP" platform ("Scene Understanding Platform") developed in STARS detects mobile objects, tracks their trajectories and recognises related behaviours predefined by experts. The platform contains several techniques for detecting people and for recognising the postures and activities of one or several persons using conventional cameras. However, people detection still raises scientific challenges when dealing with real-world scenes involving apathetic patients: cluttered scenes, wrong or incomplete person segmentation, static and dynamic occlusions, low-contrast objects, moving contextual objects (e.g. chairs), and so on.

Moreover, new sensors that improve people detection have been released. For instance, thanks to Microsoft and its Kinect sensor, depth cameras have become popular and affordable. The basic idea of a depth camera is to combine an IR camera with IR structured light to determine the depth of each image pixel. This kind of sensor is well adapted to applications that monitor people (e.g. monitoring Alzheimer patients in hospital), because the people stay in a predefined area near the camera. Depth cameras have two main advantages: first, the output images contain depth information (stereoscopic), and second, the sensor is independent of lighting changes (IR sensor).

In our work, we propose to use the Kinect or Asus sensor to acquire 3D images, detect people and recognise activities of interest. The nestk library is used to manage the Kinect sensor. This library is based on a framework similar to OpenNI (an open source driver) to acquire the images.
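The 3D map such a sensor provides is obtained by back-projecting each depth pixel through the pinhole camera model. A minimal sketch of that step (the function name and the intrinsic values fx, fy, cx, cy are illustrative placeholders, not the Kinect's calibrated parameters):

```python
def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Convert a depth pixel (u, v) with depth in metres into a 3D point
    expressed in the sensor's reference frame (pinhole camera model)."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)

# Illustrative, roughly Kinect-like intrinsics; real values come from calibration.
fx = fy = 580.0
cx, cy = 320.0, 240.0

# A pixel at the image centre seen 2 m away maps to (0.0, 0.0, 2.0).
print(backproject(320, 240, 2.0, fx, fy, cx, cy))
```

Applying this to every pixel of a 640x480 depth frame yields the point cloud in the sensor's reference frame on which detection can then operate.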
Moreover, the library can perform some processing (e.g. people detection) and provide a true 3D map of the scene in the reference frame of the RGB-Depth sensor.

3. General objectives of the PhD

This work consists in designing novel algorithms for people detection using RGB-Depth sensors (e.g. Kinect) and for activity recognition, in order to help apathetic patients improve their quality of life. Many techniques have already been proposed for detecting people in specific environments (e.g. a low-density laboratory) using the cooperation of several sensors (e.g. a camera network, individuals equipped with markers, accelerometers). Despite these studies, people detection is still brittle with conventional cameras, where it often depends on the position of the individual relative to the cameras, and limited in range (about 4-5 metres) with RGB-Depth sensors. This work aims at relaxing these hypotheses in order to conceive a general algorithm enabling the detection of an individual living in an unconstrained environment and observed through a limited number of cameras, including RGB-D sensors. The goal is to review the literature, evaluate existing libraries, and propose and assess new algorithms.

To learn new people detectors, we could explore techniques based on people appearance and silhouette, using for instance local descriptors computed in the 3D depth map, such as SURF, Hu moments, skin-colour histograms, MSER, LBP, HOG, Haar features, covariance matrices or the Omega descriptor. To validate the PhD, we will assess the proposed approach on homecare 3D videos from Nice Hospital, evaluating algorithms that help older adults keep functioning at higher levels and living independently. This PhD will be conducted in the PAL framework (https://pal.inria.fr/).

4. Pre-requisites

Computer vision, a strong background in C++ programming, Linux, artificial intelligence, cognitive vision, 3D geometry and machine learning.

5. Schedule – 2014-2017

1st year: Study the limitations of existing algorithms.
Propose an original algorithm for people detection.
2nd year: Propose an original algorithm for activity recognition. Evaluate and optimise the proposed algorithms.
3rd year: Write papers and the PhD manuscript.

6. Bibliography

Y. Yang and D. Ramanan. "Articulated Human Detection with Flexible Mixtures of Parts." IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), to appear, 2013.
A. Avanzi, F. Bremond, C. Tornieri and M. Thonnat. "Design and Assessment of an Intelligent Activity Monitoring Platform." EURASIP Journal on Applied Signal Processing, special issue on "Advances in Intelligent Vision Systems: Methods and Applications", 2005.
J. Joumier, R. Romdhane, F. Bremond, M. Thonnat, E. Mulin, P.H. Robert, A. Derreumeaux, J. Piano and L. Lee. "Video Activity Recognition Framework for Assessing Motor Behavioural Disorders in Alzheimer Disease Patients." International Workshop on Behaviour Analysis (Behave 2011), Sophia Antipolis, France, 23 September 2011.
E. Corvee and F. Bremond. "Haar-like and LBP Based Features for Face, Head and People Detection in Video Sequences." International Workshop on Behaviour Analysis (Behave 2011), Sophia Antipolis, France, 23 September 2011.
B. Leibe and B. Schiele. "Interleaved Object Categorization and Segmentation." British Machine Vision Conference (BMVC'03), Norwich, UK, 2003.
P. Wang and Zhang. "Histogram Feature-Based Fisher Linear Discriminant for Face Detection." Neural Computing and Applications, 17(1):49-58, November 2007.
E. Corvee and F. Bremond. "Combining Face Detection and People Tracking in Video Surveillance." 3rd International Conference on Imaging for Crime Detection and Prevention (ICDP 09), Kingston University, London, UK, 3 December 2009.
D. G. Lowe. "Distinctive Image Features from Scale-Invariant Keypoints." International Journal of Computer Vision, 60(2):91-110, 2004.

7. Contact

Francois.Bremond@inria.fr