PhD thesis Proposal INRIA Sophia Antipolis, PULSAR group 2004, route des Lucioles, BP93 06902 Sophia Antipolis Cedex – France 1. Title People tracking and video understanding. 2. Scientific context PULSAR group works on automatic sequence video interpretation. The “SUP” (“Scene Understanding Platform”) developed in PULSAR, detects mobile objects, tracks their trajectories and recognises related behaviours defined by experts. This platform contains techniques for the detection and recognition of human postures and gestures of one person. However there are scientific challenges in people tracking when dealing with real word scenes: crowded and cluttered scenes, handling wrong and incomplete person segmentation, handling static and dynamic occlusions, and combination of several cameras to manage ambiguities due to the person orientation. 3. General objectives of the PhD thesis This work consists in building a framework to conceive of a robust, real-time people tracker in a real world video scene. Many techniques have already been proposed for people tracking in specific environment (e.g. Laboratory) using the cooperation of several sensors (e.g. camera network, individual equipped with markers). Despite these strong hypotheses, people tracking is still brittle and often depends on the detection of the individual depending on the noise contained in the data. This work aims at reducing these hypotheses in order to conceive a general algorithm enabling the tracking of individuals moving in an unconstrained environment and observed through a limited number of cameras. Thus, the designed tracker should be able to adapt itself to many situation such as 2D or 3D scenes and network of overlapping or non-overlapping cameras. The goal is to combine a set of short-term trackers to improve the global tracking of individuals while detecting wrong and incomplete segmentation inputs. To track individuals we will explore techniques based on people appearance using for instance local descriptors such as SURF, Hu Moments, skin colour histogram, HOG, Haar features, covariance matrix, Omega descriptor, and 3D dimension. We will investigate, integrate tracking algorithms from other teams. We will also envisage particles filter or MHT to track an individual on a short time interval. The robustness of the new tracker will be provided by an online self-evaluation system based on tracking and events feedback. To validate the PhD thesis, we will assess the propose approach on homecare videos to evaluate technologies to keep older adults functioning and living autonomously. Some specific applications like 3D tracking with a Kinect camera for instance or the realisation of serious games can be planned. 4. Pre-requisites: Computer Science, Strong background in C++ programming, Design Pattern, Linux, debugging tools, cognitive vision and Machine Learning. 5. Schedule 1st year: Study of existing solutions, integrate tracking algorithms from other teams. Proposing an original algorithm for people tracker. Proposing a segmentation evaluation algorithm. 2nd year: Proposing an algorithm for controlling the people tracker process given scene information. Evaluate and optimise proposed algorithm. 3rd year: Writing papers, PhD manuscript. 6. Bibliography: Alberto Avanzi, Francois Bremond, Christophe Tornieri and Monique Thonnat, Design and Assessment of an Intelligent Activity Monitoring Platform, in EURASIP Journal on Applied Signal Processing, special issue in "Advances in Intelligent Vision Systems: Methods and Applications", 2005. E. Corvee and F. Bremond, “Combining face detection and people tracking in video surveillance”, 3rd International Conference on Imaging for Crime Detection and Prevention, ICDP 09, Kingston University, London, UK, 3rd December 2009. Liebe, B. and Schiele, B. “Interleaved object categorization and segmentation” In British Machine Vision Conference (BMVC'03). Norwich, UK, 2003. M. Siala, N. Khlifa, F. Bremond, and K. Hamrouni. “People detection in complex scene using a cascade of Boosted classifiers based on Haar-like-features”. In the IEEE Intelligent Vehicle Symposium (IEEE IV 2009), Xi'an, Shaanxi, China, Jun 3-5, 2009. X. Rong Li and V. P.Jilkov, “A survey of manoeuvring target tracking-part v: Multiple-model methods,” IEEE Transactions on Aerospace and Electronic Systems, 2003. D.P. Chau, F. Bremond, E. Corvee and M. Thonnat. “Robust mobile object tracking based on multiple feature similarity and trajectory filtering” In the International Conference on Computer Vision Theory and Applications (VISAPP 09), Algarve, Portugal, 5-7 March, 2011. J.C. Avedillo, “Advances in semantic-guided and feedback-based approaches for video analysis”, PhD thesis, Universidad Autónoma de Madrid, September 2011 S. Bak, E. Corvee, F. Bremond, M. Thonnat, “Multiple-shot Human Re-Identification by Mean Riemannian Covariance GriD”. In Advanced Video and Signal-Based Surveillance (AVSS 2011), Klagenfurt, Austria, 30 August-2 September 2011