PRIMA: Perception, Recognition and Integration for Observing and Modeling Activity
James L. Crowley, Prof. I.N.P. Grenoble; Augustin Lux, Prof. I.N.P. Grenoble; Patrick Reignier, MdC. Univ. Joseph Fourier; Dominique Vaufreydaz, MdC. UPMF 1
The PRIMA Group Leaders: Doms, Jim, Patrick and Augustin 2
The PRIMA Group Members (photo board) 3
The PRIMA Group, May 2006
Permanent staff: James L. Crowley, Prof. I.N.P. Grenoble; Augustin Lux, Prof. I.N.P. Grenoble; Patrick Reignier, MdC. U.J.F.; Dominique Vaufreydaz, MdC. UPMF
Assistant: Caroline Ouari (INPG)
Contractual engineers: Alba Ferrer, IE INRIA; Mathieu Langet, IE INPG 4
The PRIMA Group, May 2006
Doctoral students: Stan Borkowski (Bourse EGIDE), Suphot Chunwiphat (Bourse Thailand), Thi-Thanh-Hai Tran (Bourse EGIDE), Matthieu Anne (Bourse CIFRE - France Telecom), Olivier Bertrand (Bourse ENS Cachan), Nicolas Gourier (Bourse INRIA), Julien Letessier (Bourse INRIA), Sonia Zaidenberg (Bourse CNRS - BDI), Oliver Brdiczka (Bourse INRIA), Remi Emonet (Bourse MENSR) 5
Plan for the Review
1) Presentation of the Scientific Project: Objectives; Research Problems and Results; Review of 2003-2006; Evolutions for 2007-2010 6
Objective of Project PRIMA
Develop the scientific and technological foundations for context-aware, interactive environments.
Interactive environment: an environment capable of perceiving, acting, communicating, and interacting with users. 7
Experimental Platform: FAME Augmented Meeting Environment
8 cameras (7 steerable, 1 fixed wide-angle); 8 microphones (acoustic sensors); 6 biprocessors (3 GHz); 3 video interaction devices (camera-projector pairs).
January 06: inauguration of the new Smart Environments Lab (J 104) 8
Augmented Meeting Environment [video]
9
Research Problems
• Context-aware interactive environments
• New forms of man-machine interaction (using perception)
• Real-time, view-invariant computer vision
• Autonomic architectures for multi-modal perception 10
Research Problems
• Context-aware interactive environments
• New forms of man-machine interaction (using perception)
• Real-time, view-invariant computer vision
• Autonomic architectures for multi-modal perception 11
Software Architecture for Observing Activity
Layers: User Services; Situation Modeling; Perceptual Components; Logical Sensors, Logical Actuators; Sensors, Actuators, Communications.
Sensors and actuators: interface to the physical world. Perception and action: perceive entities, assign entities to roles. Situation: filter events, describe the relevant actors and props for services. 12
(User) Services: implicit or explicit; event driven.
Situation Graph (diagram: Situation-1 through Situation-6)
Situation: a configuration of entities playing roles. Configuration: a set of relations (predicates) over entities. Entity: actors or objects. Roles: abstract descriptions of persons or objects.
A situation graph describes a state space of situations and the actions of the system for each situation. 13
Situation and Context
Basic concepts:
Property: any value observed by a process
Entity: a "correlated" set of properties
Composite entity: a composition of entities
Relation: a predicate defined over entities
Actor: an entity that can act
Role: an interpretation assigned to an entity or actor
Situation: a configuration of roles and relations 14
Situation and Context
Role: an interpretation assigned to an entity or actor
Relation: a predicate over entities and actors
Situation: a configuration of roles and relations
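The vocabulary above (entity, role, relation, situation, situation graph) can be sketched as a small data structure. This is an illustrative sketch only: the class names and the lecture/questions scenario are assumptions for the example, not the PRIMA implementation.

```python
from dataclasses import dataclass, field

# Entity: a "correlated" set of observed properties.
@dataclass
class Entity:
    name: str
    properties: dict = field(default_factory=dict)

# Situation: a configuration of roles and relations. A role assignment
# maps role names to entities; relations are predicates over it.
@dataclass
class Situation:
    name: str
    roles: list       # role names that must be filled
    relations: list   # predicates over the role assignment

    def matches(self, assignment):
        """True when every role is filled and every relation holds."""
        if not all(role in assignment for role in self.roles):
            return False
        return all(relation(assignment) for relation in self.relations)

# A situation graph: a state space of situations with arcs giving
# the transitions the system may take from each situation.
class SituationGraph:
    def __init__(self):
        self.arcs = {}  # situation name -> list of successor Situations

    def add_arc(self, src, dst):
        self.arcs.setdefault(src.name, []).append(dst)

    def step(self, current, assignment):
        """Move to the first successor whose configuration holds."""
        for nxt in self.arcs.get(current.name, []):
            if nxt.matches(assignment):
                return nxt
        return current  # no transition fires: stay in the situation

# Example scenario (hypothetical): a lecture with a speaker and a listener.
speaker = Entity("person-1", {"at_podium": True, "speaking": True})
audience = Entity("person-2", {"speaking": False})

lecture = Situation("lecture", ["speaker", "listener"],
                    [lambda a: a["speaker"].properties.get("speaking", False)])
questions = Situation("questions", ["speaker", "listener"],
                      [lambda a: a["listener"].properties.get("speaking", False)])

graph = SituationGraph()
graph.add_arc(lecture, questions)

assignment = {"speaker": speaker, "listener": audience}
print(graph.step(lecture, assignment).name)  # "lecture": listener silent, no transition
```

The transition to "questions" fires only once the listener entity is observed speaking, which mirrors the idea of a federation of perceptual processes updating role assignments that the situation graph then interprets.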
A situation graph describes the state space of situations and the actions of the system for each situation.
Approach: compile a federation of processes to observe the roles (actors and entities) and relations that define situations. 15
Acquiring Situation Models
Objective: automatic acquisition of situation models.
Approach: start with a simple stereotypical model for the scenario; develop it with supervised incremental learning.
Recognition: detect roles with linear classifiers; recognize situations using a probabilistic model. 16
Video Acquisition System V2.0
(diagram: Process Supervisor, Situation Modeling, Event Bus; Face Detection, Audience Camera, Streaming Video Camera, Audio-Visual Composition (MPEG), Speaker Tracker, Vocal Activity Detector with microphones, New Slide Detection, New Person Detection; Steerable Camera, Projector, Wide-Angle Camera) 17
Audio-Visual Acquisition System, Version 1.0 - January 2004 18
Research Problems
• Context-aware interactive environments
• New forms of man-machine interaction (using perception)
• Real-time, view-invariant computer vision
• Autonomic architectures for multi-modal perception 19
Steerable Camera-Projector Pair 20
[video] 21
Portable Display Surface 22
Rectification by Homography
$\begin{pmatrix} w x' \\ w y' \\ w \end{pmatrix} = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$, with $x' = \frac{w x'}{w}$ and $y' = \frac{w y'}{w}$.
For each rectified pixel (x, y), project to the original pixel and compute the interpolated intensity. 23
Real-Time Rectification for the PDS 24
Luminance-based button widget
S. Borkowski, J. Letessier, and J. L. Crowley, "Spatial Control of Interactive Surfaces in an Augmented Environment", in Proceedings of EHCI'04, Springer, 2004. 25
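The backward-mapping rectification on slide 23 can be sketched in a few lines of NumPy: project each rectified pixel through the homography H to source coordinates, then bilinearly interpolate the intensity. This is a generic sketch, not the PRIMA/PDS code; the function name and the example homographies are illustrative.

```python
import numpy as np

def rectify(image, H, out_shape):
    """Backward-map each rectified pixel (x, y) through H and
    bilinearly interpolate the intensity in the source image."""
    h_out, w_out = out_shape
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    pts = np.stack([xs, ys, np.ones_like(xs)]).reshape(3, -1).astype(float)

    # (wx', wy', w)^T = H (x, y, 1)^T ; divide by w to get (x', y')
    mapped = H @ pts
    xp = mapped[0] / mapped[2]
    yp = mapped[1] / mapped[2]

    # Bilinear interpolation, clamped at the image borders.
    h_in, w_in = image.shape
    x0 = np.clip(np.floor(xp).astype(int), 0, w_in - 2)
    y0 = np.clip(np.floor(yp).astype(int), 0, h_in - 2)
    fx = np.clip(xp - x0, 0.0, 1.0)
    fy = np.clip(yp - y0, 0.0, 1.0)

    out = ((1 - fy) * (1 - fx) * image[y0, x0]
           + (1 - fy) * fx * image[y0, x0 + 1]
           + fy * (1 - fx) * image[y0 + 1, x0]
           + fy * fx * image[y0 + 1, x0 + 1])
    return out.reshape(h_out, w_out)

# The identity homography leaves the image unchanged; a translation
# component in the last column of H shifts the sampled region.
img = np.arange(16, dtype=float).reshape(4, 4)
same = rectify(img, np.eye(3), (4, 4))
```

Backward mapping is used (rather than forward-projecting source pixels) so that every rectified pixel receives exactly one interpolated value, with no holes in the output.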
Striplet: the occlusion detector
$R(t) = \frac{\iint f_{gain}(x,y)\, L(x,y,t)\, dx\, dy}{\iint f_{gain}(x,y)\, dx\, dy}$
(diagram: striplet gain profile over the x, y region) 26
Striplet: the occlusion detector (diagram) 27
Striplet: the occlusion detector (diagram) 28
Striplet-based SPOD
SPOD: Simple-Pattern Occlusion Detector 29
Projected Calculator 30
Research Problems
• Context-aware interactive environments
• New forms of man-machine interaction (using perception)
• Real-time, view-invariant computer vision
• Autonomic architectures for multi-modal perception 31
Chromatic Gaussian Basis
Gaussian derivative receptive fields over the luminance (L) and chrominance (C1, C2) channels: $G^L_x$, $G^{C1}$, $G^{C2}$, $G^{C1}_x$, $G^{C2}_x$, $G^L_{xx}$, $G^L_{xy}$, $G^L_{yy}$
Normalized in scale and orientation to the local neighborhood 32
Real-Time, View-Invariant Computer Vision: Results
• Scale- and orientation-normalized receptive fields computed at video rate (BrandDetect system, IST CAVIAR)
• Real-time indexing and recognition (thesis of F. Pelisson)
• Robust visual features for face detection (thesis of N. Gourier)
• Direct computation of time to crash (Masters thesis of A. Negre)
• Natural interest "ridges" 33
Scale- and Orientation-Normalized Gaussian RFs
Intrinsic scale: the peak in the Laplacian as a function of scale, $\sigma_i(i,j) = \arg\max_{\sigma} \{ \nabla^2 G(\sigma) * A(i,j) \}$
The oriented response can be obtained as a weighted sum of cardinal derivatives: $\langle A(i,j), G_\theta(\sigma) \rangle = \langle A(i,j), G_x(\sigma) \rangle \cos(\theta) + \langle A(i,j), G_y(\sigma) \rangle \sin(\theta)$
Normalization of scale and orientation provides invariance to distance and camera rotation. 34
Natural Interest Points (scale-invariant "salient" image features)
Local extrema of $\nabla^2 G(i,j,\sigma) * A(i,j)$ over $i$, $j$, $\sigma$.
Problems with points: elongated shapes; lack of discriminative power; no orientation information.
Proposal: natural interest ridges, the maximal ridges in Laplacian scale space. 35
Natural Ridge Detection [Tran04]
Laplacian: $\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}$; Hessian: the matrix of second derivatives.
Compute derivatives at different scales.
For each point (x, y, scale):
compute the second derivatives $f_{xx}$, $f_{yy}$, $f_{xy}$;
compute the eigenvalues and eigenvectors of the Hessian matrix;
detect a local extremum in the direction corresponding to the largest eigenvalue.
Assemble the ridge points. 36
[video] 37
Real-Time, View-Invariant Computer Vision: Current Activity
• Robust visual features for face detection
• Direct computation of time to crash
• Natural interest "ridges" for perceptual organisation 38
Research Problems
• Context-aware interactive environments
• New forms of man-machine interaction (using perception)
• Real-time, view-invariant computer vision
• Autonomic architectures for multi-modal perception 39
Supervised Perceptual Process
(diagram: an autonomic supervisor receives events, configuration commands and requests for state, and returns the current state and responses to commands; it schedules detection and prediction over the video stream, passing the ROI and detection method to observation modules, whose observations feed the estimation and interpretation of entities and actors)
The supervisor provides: execution scheduler; parameter regulator; command interpreter; description of state and capabilities 40
Detection and Tracking of Entities
Entities: correlated sets of blobs.
Blob detectors: background difference, motion, color, receptive-field histograms.
Entity grouper: assigns roles to blobs as body, hands, face or eyes. 41
Autonomic Properties Provided by the Process Supervisor
Auto-regulatory: the process controller can adapt parameters to maintain a desired process state.
Auto-descriptive: the process controller provides descriptions of the capabilities and the current state of the process.
Auto-critical: the process estimates confidence for all properties and events.
Self-monitoring: the process maintains a description of its state and quality of service. 42
Self-Monitoring Perceptual Process
(diagram: video → process → model → error?)
(diagram continued: model learning; unknown errors; error classification; error recovery)
• The process monitors the likelihood of its output.
• When performance degrades, the process adapts its processing (modules, parameters, and data). 43
Autonomic Parameter Regulation
(diagram: the parameter regulator tunes system parameters for pixel-level detection on the video stream; tracked entities feed entity recognition, trained from an entity database, with operator input)
Parameter regulation provides robust adaptation to changes in operating conditions. 44
Research Contracts (2003-2006)
National and industrial:
ROBEA HR+: Human-Robot Interaction (with LAAS and ICP)
ROBEA ParkNav: perception and action in dynamic environments
RNTL ContAct: context-aware perception (with XRCE)
Contract HARP: context-aware services (France Telecom)
IST - FP VI:
Project IST IP - CHIL: multi-modal perception for meeting services
IST - FP V:
Project IST - CAVIAR: context-aware vision for surveillance
Project IST - FAME: multi-modal perception for services
Project IST - DETECT: publicity detection in broadcast video
Project FET - DC GLOSS: Global Smart Spaces
Thematic network FGNet ("Face and Gesture")
Thematic network ECVision: Cognitive Vision 45
Collaborations
INRIA projects:
EMOTION (INRIA RA): vision for autonomous robots; ParkNav, ROBEA (CNRS); theses of C. Braillon and A. Negre
ORION (Sophia): cognitive vision (ECVision), modeling human activity
Academic:
IIHM, Laboratoire CLIPS: human-computer interaction, smart spaces; Mapping Project, IST projects GLOSS and FAME; thesis of J. Letessier
Univ. of Karlsruhe (multimodal interaction): IST FAME and CHIL
Industry:
France Telecom (Lannion and Meylan): Project HARP, thesis of M. Anne.
Xerox Research Centre Europe: Project RNTL/Proact Cont'Act
IBM Research (Prague, New York): situation modeling, autonomic software architectures, Project CHIL 46
Knowledge Dissemination
[bar chart: publications per year, 2003-2006, by type: journal articles, theses, conference and workshop papers, patents, book chapters] 47
Conferences and Workshops Organised
General chairman (or co-chairman): conference SoC-EuSAI 2005; workshops Pointing 2004, PETS 2004, Harem 2005
Program co-chairman: International Conference on Vision Systems, ICVS 2003; European Symposium on Ambient Intelligence, EuSAI 2004; International Conference on Multimodal Interaction, ICMI 2005
Program committee/reviewer: UBICOMP 2003, ScaleSpace 2003, sOc 03, ICIP 03, ICCV 03, AMFG 04, ICMI 03, RFIA 2004, IAS 2004, ECCV 2004, FG 2004, ICPR 2004, CVPR 2004, ICMI 2004, EUSAI 2004, CVPR 2005, ICRA 2005, IROS 2005, Interact 2005, ICCV 05, ICVS 06, PETS 05, FG 06, ECCV 06, CVPR 06, ICPR 06, IROS 06… 48
APP Registered Software
1) CAR: robust real-time detection and tracking. APP IDDN.FR.001.350009.000.R.P.2002.0000.00000. Commercial license to BlueEyeVideo.
2) BrandDetect: detection, tracking and recognition of commercial trademarks in broadcast video. APP IDDN.FR.450046.000.S.P.2003.000.21000. Commercial license to HSArt.
3) ImaLab: vision software development tool. Shareware; APP registration under preparation. Distributed to 11 research laboratories in 7 EU countries.
4) Robust Tracker v3.3 (stand-alone)
5) Robust Tracker v3.4 (autonomic)
6) Apte: monitoring, regulation and repair of perceptual systems
7) O3MiCID: middleware for intelligent environments 49
Start-up: Blue Eye Video
PDG: Pierre de la Salle
Marketing: Jean Viscomte
Engineers: Stephane Richetto, Pierre-Jean Riviere, Fabien Pelisson, Dominique de Mont (HP), Sebastien Pesnel
Counselor: James L. Crowley
Incubation: INRIA Transfer, GRAIN, UJF Industrie, Region Rhône-Alpes; winner of the business-creation competition
Creation: 1 June 2003
Market: observation of human activity
Sectors: commercial services, security, and traffic monitoring
Status: 386 K Euros in sales in 2005; >100 systems installed 50
Blue Eye Video Activity Sensor (PETS 2002 data) 51
Blue Eye Video Activity Sensor (distributed sensor networks) 52
Evolutions for 2006-2010
Context-aware interactive environments
• Adaptation and development of activity models
New forms of man-machine interaction
• Affective interaction
Real-time, view-invariant computer vision
• Embedded view-invariant visual perception
Autonomic architectures for multi-modal perception
• Learning for monitoring and regulation
• Dynamic service composition 53
Automatic Adaptation and Development of Models for Human Activity
(diagram: situation network with supervisor feedback)
Learning (re)actions; learning situations; splitting situations; deleting obsolete situations; learning roles.
Adaptation: consistent behaviour across environments. Development: acquisition of new abilities. 54
Affective Interaction
Interactive objects that recognize interest and affect, and that learn to perceive and evoke emotions in humans. 55
Embedded View-Invariant Visual Perception
Embedded real-time, view-invariant vision in phones and PDAs (work with ST Microsystems). 56
Distributed Autonomic Systems for Multi-Modal Perception
Layers: User Services; Situation Modeling; Perceptual Components; Logical Sensors, Logical Actuators; Sensors, Actuators, Communications.
• Statistical learning for process regulation and monitoring
• Dynamic service composition 57
PRIMA: Perception, Recognition and Integration for Observing and Modeling Activity
James L. Crowley, Prof. I.N.P. Grenoble; Augustin Lux, Prof. I.N.P. Grenoble; Patrick Reignier, MdC. Univ. Joseph Fourier; Dominique Vaufreydaz, MdC. UPMF 58