
PRIMA
Perception Recognition and Integration
for Observing and Modeling Activity
James L. Crowley, Prof. I.N.P. Grenoble
Augustin Lux, Prof. I.N.P. Grenoble
Patrick Reignier, MdC. Univ. Joseph Fourier
Dominique Vaufreydaz, MdC UPMF
The PRIMA Group Leaders
Doms, Jim, Patrick and Augustin
The PRIMA Group Members
[Photo board of group members]
The PRIMA Group, May 2006
Permanent staff:
James L. Crowley, Prof. I.N.P. Grenoble
Augustin Lux, Prof. I.N.P. Grenoble
Patrick Reignier, MdC. U.J.F.
Dominique Vaufreydaz, MdC. UPMF.
Assistant:
Caroline Ouari (INPG)
Contract Engineers:
Alba Ferrer, IE INRIA
Mathieu Langet, IE INPG
The PRIMA Group, May 2006
Doctoral Students:
Stan Borkowski (EGIDE grant)
Suphot Chunwiphat (Thailand grant)
Thi-Thanh-Hai Tran (EGIDE grant)
Matthieu Anne (CIFRE grant, France Telecom)
Olivier Bertrand (ENS Cachan grant)
Nicolas Gourier (INRIA grant)
Julien Letessier (INRIA grant)
Sonia Zaidenberg (CNRS BDI grant)
Oliver Brdiczka (INRIA grant)
Remi Emonet (MENSR grant)
Plan for the Review
1) Presentation of Scientific Project
Objectives
Research Problems and Results
Assessment of 2003 - 2006
Evolutions for 2007-2010
Objective of Project PRIMA
Develop the scientific and technological foundations for context-aware, interactive environments.
Interactive Environment:
An environment capable of perceiving, acting,
communicating, and interacting with users.
Experimental Platform: FAME
Augmented Meeting Environment
8 cameras:
7 steerable
1 fixed, wide-angle
8 microphones (acoustic sensors)
6 dual-processor machines (3 GHz)
3 video interaction devices (camera-projector pairs)
January 06: Inauguration of new Smart Environments Lab (J 104)
Augmented Meeting Environment
[Embedded video]
Research Problems
• Context-aware interactive environments
• New forms of man-machine interaction (using
perception)
• Real Time, View Invariant, Computer Vision
• Autonomic Architectures for Multi-Modal Perception
Research Problems
• Context-aware interactive environments
• New forms of man-machine interaction (using
perception)
• Real Time, View Invariant, Computer Vision
• Autonomic Architectures for Multi-Modal Perception
Software Architecture for Observing Activity
User Services
Situation Modeling
Perceptual Components
Logical Sensors, Logical Actuators
Sensors, Actuators, Communications
Sensors and actuators: interface to the physical world.
Perception and action: perceive entities and assign entities to roles.
Situation modeling: filters events; describes the actors and props relevant to services.
User services: implicit or explicit; event driven.
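A minimal sketch of this layered, event-driven organization, in Python. The class names and event types below are illustrative assumptions, not the group's actual API:

```python
# Minimal sketch of the layered, event-driven organization described above.
# All class and event names here are illustrative, not PRIMA's actual API.

from collections import defaultdict

class EventBus:
    """Decouples perceptual components from situation modeling and services."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self.subscribers[event_type].append(handler)

    def publish(self, event_type, **data):
        for handler in self.subscribers[event_type]:
            handler(data)

bus = EventBus()

# A perceptual component publishes entity events from a logical sensor.
def face_detector(frame):
    bus.publish("entity.face", position=(120, 80), confidence=0.9)

# Situation modeling filters events and exposes role-level changes.
def on_face(data):
    if data["confidence"] > 0.5:
        bus.publish("role.speaker", entity=data["position"])

# A user service reacts only to situation-level events.
def recording_service(data):
    print("Steer camera toward speaker at", data["entity"])

bus.subscribe("entity.face", on_face)
bus.subscribe("role.speaker", recording_service)
face_detector(frame=None)  # drives one event through all three layers
```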
Situation Graph
[Diagram: a graph of situations, Situation-1 through Situation-6, linked by transitions]
Situation: a configuration of entities playing roles.
Configuration: a set of relations (predicates) over entities.
Entity: actors or objects.
Roles: abstract descriptions of persons or objects.
A situation graph describes a state space of situations and the actions of the system for each situation.
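The situation graph can be read as a plain data structure. The sketch below is one illustrative encoding; all names are assumptions, not the group's implementation:

```python
# Illustrative sketch of a situation graph as a data structure.

from dataclasses import dataclass, field

@dataclass(frozen=True)
class Situation:
    name: str
    roles: frozenset          # abstract descriptions entities may satisfy
    relations: frozenset      # predicates that must hold over the roles

@dataclass
class SituationGraph:
    situations: dict = field(default_factory=dict)   # name -> Situation
    arcs: dict = field(default_factory=dict)         # name -> successor names
    actions: dict = field(default_factory=dict)      # name -> system action

    def add(self, situation, action=None):
        self.situations[situation.name] = situation
        self.arcs.setdefault(situation.name, set())
        if action:
            self.actions[situation.name] = action

    def connect(self, src, dst):
        self.arcs[src].add(dst)

    def matches(self, situation_name, observed_roles, observed_relations):
        s = self.situations[situation_name]
        return s.roles <= observed_roles and s.relations <= observed_relations

g = SituationGraph()
g.add(Situation("lecture", frozenset({"speaker", "audience"}),
                frozenset({"facing(speaker, audience)"})),
      action="record speaker")
g.add(Situation("discussion", frozenset({"speaker"}), frozenset()))
g.connect("lecture", "discussion")
print(g.matches("lecture",
                {"speaker", "audience"},
                {"facing(speaker, audience)"}))   # True
```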
Situation and Context
Basic Concepts:
Property: any value observed by a process.
Entity: a "correlated" set of properties.
Composite entity: a composition of entities.
Relation: a predicate defined over entities.
Actor: an entity that can act.
Role: an interpretation assigned to an entity or actor.
Situation: a configuration of roles and relations.
Situation and Context
Role: an interpretation assigned to an entity or actor.
Relation: a predicate over entities and actors.
Situation: a configuration of roles and relations.
A situation graph describes the state space of situations and the actions of the system for each situation.
Approach:
Compile a federation of processes to observe the roles (actors and entities) and relations that define situations.
Acquiring Situation Models
Objective:
Automatic acquisition of situation models.
Approach:
Start with a simple stereotypical model for the scenario.
Develop it using supervised incremental learning.
Recognition:
Detect roles with linear classifiers.
Recognize situations using a probabilistic model.
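A hedged sketch of this recognition step: a linear classifier assigns roles, and a simple probabilistic (naive-Bayes-style) score selects the situation. All features, weights, and probabilities below are invented for illustration:

```python
# Sketch: linear role classifiers feeding a probabilistic situation model.

import numpy as np

# Linear role classifier: score = w . x + b, role assigned if score > 0.
w_speaker = np.array([0.8, -0.5, 1.2])   # e.g. [is_standing, near_door, is_talking]
b_speaker = -0.3

def is_speaker(features):
    return float(np.dot(w_speaker, features) + b_speaker) > 0.0

# P(situation | roles) ~ P(roles | situation) * P(situation)
priors = {"lecture": 0.6, "discussion": 0.4}
likelihood = {  # P(role present | situation), assumed values
    "lecture":    {"speaker": 0.9, "audience": 0.8},
    "discussion": {"speaker": 0.5, "audience": 0.3},
}

def recognize(observed_roles):
    scores = {}
    for s, prior in priors.items():
        p = prior
        for role, pr in likelihood[s].items():
            p *= pr if role in observed_roles else (1.0 - pr)
        scores[s] = p
    return max(scores, key=scores.get)

roles = {"speaker"} if is_speaker(np.array([1.0, 0.0, 1.0])) else set()
roles |= {"audience"}
print(recognize(roles))   # -> "lecture" under these assumed numbers
```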
Video Acquisition System V2.0
[System diagram: a Process Supervisor and a Situation Modeling layer coordinate perceptual components over an Event Bus. Components: Face Detection (audience camera), a streaming video camera, Audio-Visual Composition (MPEG output), a Speaker Tracker, a Vocal Activity Detector (two microphones), New Slide Detection, New Person Detection, a steerable camera, a projector, and a wide-angle camera.]
Audio-Visual Acquisition System
[Embedded video: Version 1.0 - January 2004]
Research Problems
• Context-aware interactive environments
• New forms of man-machine interaction (using
perception)
• Real Time, View Invariant, Computer Vision
• Autonomic Architectures for Multi-Modal Perception
Steerable Camera Projector Pair
[Embedded video]
Portable Display Surface
[Embedded video]
Rectification by Homography
The homography $H$ maps each rectified pixel $(x, y)$ to a source pixel $(x', y')$ in the original image:

$$\begin{pmatrix} w x' \\ w y' \\ w \end{pmatrix} = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}, \qquad x' = \frac{w x'}{w}, \quad y' = \frac{w y'}{w}$$

For each rectified pixel (x, y), project to the original pixel and compute the interpolated intensity.
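A hedged NumPy sketch of this rectification loop, assuming bilinear interpolation (the slide does not specify the interpolation scheme):

```python
# Sketch: for each rectified pixel, apply the homography and bilinearly
# interpolate the source intensity. Not claimed to mirror PRIMA's code.

import numpy as np

def rectify(image, H, out_shape):
    h_out, w_out = out_shape
    out = np.zeros(out_shape, dtype=np.float32)
    for y in range(h_out):
        for x in range(w_out):
            wx, wy, w = H @ np.array([x, y, 1.0])
            xs, ys = wx / w, wy / w                      # source coordinates
            x0, y0 = int(np.floor(xs)), int(np.floor(ys))
            if 0 <= x0 < image.shape[1] - 1 and 0 <= y0 < image.shape[0] - 1:
                ax, ay = xs - x0, ys - y0                # bilinear weights
                out[y, x] = ((1 - ax) * (1 - ay) * image[y0, x0]
                             + ax * (1 - ay) * image[y0, x0 + 1]
                             + (1 - ax) * ay * image[y0 + 1, x0]
                             + ax * ay * image[y0 + 1, x0 + 1])
    return out
```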
Real Time Rectification for the PDS
Luminance-based button widget
S. Borkowski, J. Letessier, and J. L. Crowley. Spatial Control of Interactive Surfaces in an Augmented Environment. In Proceedings of EHCI'04. Springer, 2004.
Striplet – the occlusion detector
[Figure: a striplet region in the image plane (x, y), with its gain profile plotted along x]

$$R(t) = \iint f_{gain}(x, y)\, L(x, y, t)\, dx\, dy \qquad \text{with} \qquad \iint f_{gain}(x, y)\, dx\, dy = 0$$
Striplet – the occlusion detector
[Figure: with no occlusion the weighted luminance cancels, so R ≈ 0]
Striplet – the occlusion detector
[Figure: an occluding object unbalances the integral, so R ≠ 0]
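A small numeric sketch of the striplet response defined above, showing both the unoccluded (R ≈ 0) and occluded (R ≠ 0) cases. The gain profile and luminance values are invented for illustration:

```python
# Sketch: striplet response R(t), a zero-mean gain function integrated
# against luminance, in discrete form.

import numpy as np

width = 32
# Zero-mean gain profile along x: +1 on the left half, -1 on the right half.
f_gain = np.concatenate([np.ones(width // 2), -np.ones(width // 2)])
assert abs(f_gain.sum()) < 1e-9            # the defining constraint

def response(luminance_row):
    """R = sum of f_gain * L over the striplet (discrete integral)."""
    return float(np.dot(f_gain, luminance_row))

uniform = np.full(width, 100.0)            # unoccluded surface, even lighting
occluded = uniform.copy()
occluded[:10] = 20.0                       # a dark occluder (e.g. a finger)

print(response(uniform))    # ~0: uniform illumination cancels out
print(response(occluded))   # clearly nonzero: occlusion detected
```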
Striplet-based SPOD
SPOD – Simple-Pattern Occlusion Detector
Projected Calculator
[Embedded video]
Research Problems
• Context-aware interactive environments
• New forms of man-machine interaction (using
perception)
• Real Time, View Invariant Computer Vision
• Autonomic Architectures for Multi-Modal Perception
Chromatic Gaussian Basis
A vector of Gaussian derivative receptive fields over the luminance (L) and chrominance (C1, C2) channels:

$$\left( G^L_x,\; G^{C_1},\; G^{C_1}_x,\; G^{C_2},\; G^{C_2}_x,\; G^L_{xx},\; G^L_{xy},\; G^L_{yy} \right)$$

Normalized in scale and orientation to the local neighborhood.
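One way to realize such a basis with Gaussian derivative filters, sketched with scipy.ndimage. The opponent definitions of the chrominance channels C1 and C2 are assumptions:

```python
# Sketch: a chromatic Gaussian receptive-field vector per pixel.

import numpy as np
from scipy.ndimage import gaussian_filter

def chromatic_basis(rgb, sigma=2.0):
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    L  = (r + g + b) / 3.0           # luminance
    c1 = r - g                       # opponent chrominance channels (assumed)
    c2 = b - (r + g) / 2.0
    d = lambda img, ox, oy: gaussian_filter(img, sigma, order=(oy, ox))
    return np.stack([
        d(L, 1, 0),                  # G_x^L
        gaussian_filter(c1, sigma),  # G^C1
        d(c1, 1, 0),                 # G_x^C1
        gaussian_filter(c2, sigma),  # G^C2
        d(c2, 1, 0),                 # G_x^C2
        d(L, 2, 0), d(L, 1, 1), d(L, 0, 2),   # G_xx^L, G_xy^L, G_yy^L
    ], axis=-1)

features = chromatic_basis(np.random.rand(64, 64, 3))
print(features.shape)   # (64, 64, 8): one feature vector per pixel
```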
Real Time, View Invariant Computer Vision
Results
• Scale and orientation normalised Receptive Fields
computed at video rate. (BrandDetect system, IST
CAVIAR)
• Real time indexing and recognition (Thesis F. Pelisson)
• Robust Visual Features for Face Detection
(Thesis N. Gourier)
• Direct Computation of Time to Crash
(Masters A. Negre)
• Natural Interest "Ridges"
Scale and Orientation Normalised Gaussian RFs
Intrinsic scale: the peak of the Laplacian response as a function of scale:

$$\sigma_i(i, j) = \arg\max_{\sigma} \left\{ \left\| \nabla^2 G(\sigma) * A(i, j) \right\| \right\}$$

An oriented response can be obtained as a weighted sum of cardinal derivatives:

$$\langle A(i, j),\, G_{\theta}(\sigma) \rangle = \langle A(i, j),\, G_x(\sigma) \rangle \cos(\theta) + \langle A(i, j),\, G_y(\sigma) \rangle \sin(\theta)$$

Normalisation of scale and orientation provides invariance to distance and camera rotation.
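A hedged sketch of intrinsic scale selection: for each pixel, pick the sigma that maximizes the magnitude of the scale-normalized Laplacian response. The sigma-squared normalization is the standard choice; the slide does not spell it out:

```python
# Sketch: per-pixel intrinsic scale as the peak of the normalized Laplacian.

import numpy as np
from scipy.ndimage import gaussian_laplace

def intrinsic_scale(image, sigmas=(1, 2, 4, 8, 16)):
    # Scale-normalized Laplacian: sigma^2 * |Laplacian of Gaussian|.
    stack = np.stack([(s ** 2) * np.abs(gaussian_laplace(image, s))
                      for s in sigmas])
    best = np.argmax(stack, axis=0)          # index of the winning scale
    return np.take(np.array(sigmas, dtype=float), best)

img = np.random.rand(64, 64)
print(intrinsic_scale(img).shape)   # (64, 64): one intrinsic scale per pixel
```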
Natural Interest Points
(Scale-invariant "salient" image features)
Local extrema of $\langle \nabla^2 G(i, j, \sigma) * A(i, j) \rangle$ over $i$, $j$, $\sigma$.
Problems with points:
• Elongated shapes
• Lack of discrimination power
• No orientation information
Proposal: natural interest ridges, i.e. maximal ridges in Laplacian scale space.
Natural Ridge Detection [Tran04]
$$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \quad \text{(Laplacian)} \qquad H = \begin{pmatrix} f_{xx} & f_{xy} \\ f_{xy} & f_{yy} \end{pmatrix} \quad \text{(Hessian)}$$

Compute derivatives at different scales. For each point (x, y, scale):
1. Compute the second derivatives f_xx, f_yy, f_xy.
2. Compute the eigenvalues and eigenvectors of the Hessian matrix.
3. Detect a local extremum in the direction corresponding to the largest eigenvalue.
4. Assemble ridge points.
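A hedged single-scale sketch of this procedure; the method iterates it over scales, and the extremum test here is deliberately simplified:

```python
# Sketch: ridge candidates via Hessian eigen-analysis at one scale.

import numpy as np
from scipy.ndimage import gaussian_filter

def ridge_mask(image, sigma=2.0):
    fxx = gaussian_filter(image, sigma, order=(0, 2))
    fyy = gaussian_filter(image, sigma, order=(2, 0))
    fxy = gaussian_filter(image, sigma, order=(1, 1))
    lap = fxx + fyy
    h, w = image.shape
    mask = np.zeros((h, w), dtype=bool)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            H = np.array([[fxx[y, x], fxy[y, x]], [fxy[y, x], fyy[y, x]]])
            vals, vecs = np.linalg.eigh(H)
            v = vecs[:, np.argmax(np.abs(vals))]    # dominant curvature direction
            dx, dy = int(round(v[0])), int(round(v[1]))
            # Ridge point: |Laplacian| is a local maximum along that direction.
            here = abs(lap[y, x])
            mask[y, x] = here > abs(lap[y + dy, x + dx]) and \
                         here > abs(lap[y - dy, x - dx])
    return mask

print(ridge_mask(np.random.rand(32, 32)).sum())  # number of candidate ridge pixels
```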
[Embedded video]
Real Time, View Invariant Computer Vision
Current activity
• Robust Visual Features for Face Detection
• Direct Computation of Time to Crash
• Natural Interest "Ridges" for perceptual organisation.
Research Problems
• Context-aware interactive environments
• New forms of man-machine interaction (using
perception)
• Real Time, View Invariant, Computer Vision
• Autonomic Architectures for Multi-Modal Perception
Supervised Perceptual Process
[Diagram: an Autonomic Supervisor coordinates observation modules (detection, prediction, estimation, interpretation) operating on the video stream. It exchanges events, configuration, requests for state, current state, and responses to commands, and passes each module a region of interest (ROI), scale, and detection method; the modules output entities and actors.]

The supervisor provides:
• An execution scheduler
• A parameter regulator
• A command interpreter
• A description of state and capabilities
Detection and Tracking of Entities
[Embedded video]
Entities: correlated sets of blobs.
Blob detectors: background difference, motion, color, and receptive-field histograms.
Entity grouper: assigns roles to blobs as body, hands, face, or eyes.
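A hedged sketch of one such blob detector (background difference) and a deliberately trivial entity grouper; the threshold and the grouping rule are illustrative assumptions:

```python
# Sketch: background-difference blob detection plus toy role assignment.

import numpy as np
from scipy.ndimage import label, find_objects

def detect_blobs(frame, background, threshold=25.0):
    """Return bounding-box slices of connected regions differing from background."""
    diff = np.abs(frame.astype(float) - background.astype(float)) > threshold
    labeled, n = label(diff)
    return find_objects(labeled)

def group_entities(blobs):
    """Toy grouping: the largest blob gets the 'body' role, smaller ones 'hands'."""
    def area(sl):
        return (sl[0].stop - sl[0].start) * (sl[1].stop - sl[1].start)
    ordered = sorted(blobs, key=area, reverse=True)
    roles = {}
    if ordered:
        roles["body"] = ordered[0]
        roles["hands"] = ordered[1:3]
    return roles

bg = np.zeros((48, 48)); frame = bg.copy()
frame[10:30, 10:20] = 255      # a synthetic person-sized blob
print(group_entities(detect_blobs(frame, bg)))
```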
Autonomic Properties
provided by process supervisor
Auto-regulatory: The process controller can adapt parameters to
maintain a desired process state.
Auto-descriptive: The process controller provides descriptions of
the capabilities and the current state of the process.
Auto-critical: Process estimates confidence for all properties and
events.
Self-monitoring: The process maintains a description of process state and quality of service.
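A minimal sketch of a supervisor exposing these autonomic properties. The interface and the regulation rule are assumptions, not PRIMA's actual supervisor:

```python
# Sketch: a process supervisor with auto-regulatory, auto-descriptive,
# and auto-critical behaviors.

class ProcessSupervisor:
    def __init__(self, process, target_rate=25.0):
        self.process = process
        self.target_rate = target_rate

    # Auto-regulatory: adapt parameters to hold a desired process state.
    def regulate(self):
        if self.process.frame_rate < self.target_rate:
            self.process.params["roi_scale"] *= 0.9   # shrink work per frame

    # Auto-descriptive: report capabilities and current state on request.
    def describe(self):
        return {"capabilities": self.process.capabilities,
                "state": {"frame_rate": self.process.frame_rate,
                          "params": dict(self.process.params)}}

    # Auto-critical: attach a confidence to every emitted event.
    def emit(self, event, confidence):
        return {"event": event, "confidence": confidence}

class DummyProcess:
    capabilities = ["detect_face"]
    frame_rate = 18.0
    params = {"roi_scale": 1.0}

sup = ProcessSupervisor(DummyProcess())
sup.regulate()
print(sup.describe())
print(sup.emit("face_detected", confidence=0.85))
```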
Self-monitoring Perceptual Process
[Diagram: video feeds the perceptual process; outputs are checked against a process model; classified errors trigger error recovery, while unknown errors feed model learning.]

• The process monitors the likelihood of its output.
• When performance degrades, the process adapts its processing (modules, parameters, and data).
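A hedged sketch of that monitoring loop: low output likelihood triggers error classification; known classes dispatch to recovery actions, unknown ones are queued for model learning. Error names and actions are invented for illustration:

```python
# Sketch: error classification and recovery for a self-monitoring process.

RECOVERY = {
    "illumination_change": lambda p: p.update(background="relearn"),
    "target_lost":         lambda p: p.update(search_region="full_frame"),
}
unknown_errors = []   # queued for offline model learning

def monitor(output_likelihood, error_class, params):
    if output_likelihood > 0.5:
        return "ok"                     # output consistent with process model
    action = RECOVERY.get(error_class)
    if action:
        action(params)                  # known error: recover immediately
        return "recovered"
    unknown_errors.append(error_class)  # unknown error: learn a new class later
    return "deferred_to_learning"

params = {"background": "current", "search_region": "roi"}
print(monitor(0.2, "target_lost", params), params)
print(monitor(0.2, "lens_smudge", params), unknown_errors)
```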
Autonomic Parameter Regulation
[Diagram: a video stream feeds pixel-level detection, producing tracked entities for entity recognition, training, and an entity database; a parameter regulator closes the loop by adjusting system parameters, with operator input available.]

Parameter regulation provides robust adaptation to changes in operating conditions.
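A hedged sketch of closed-loop parameter regulation: a proportional rule nudges a detection threshold so the detection count tracks a setpoint. The gain and setpoint are illustrative assumptions:

```python
# Sketch: proportional regulation of a pixel-level detection threshold.

def regulate_threshold(threshold, detections, target=3, gain=0.5):
    """Too many detections -> raise the threshold; too few -> lower it."""
    error = detections - target
    return max(0.0, threshold + gain * error)

threshold = 10.0
for detections in [8, 6, 4, 3, 3]:      # simulated per-frame detection counts
    threshold = regulate_threshold(threshold, detections)
    print(f"detections={detections}  new threshold={threshold:.1f}")
```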
Research Contracts (2003-2006)
National and Industrial:
ROBEA HR+: Human-Robot Interaction (with LAAS and ICP)
ROBEA ParkNav: Perception and action in dynamic environments
RNTL ContAct: Context Aware Perception (with XRCE)
Contract HARP: Context-aware services (France Telecom)
IST - FP VI:
Project IST IP CHIL: Multi-modal perception for meeting services
IST - FP V:
Project IST CAVIAR: Context Aware Vision for Surveillance
Project IST FAME: Multi-modal perception for services
Project IST DETECT: Publicity detection in broadcast video
Project FET DC GLOSS: Global Smart Spaces
Thematic Network: FGNet ("Face and Gesture")
Thematic Network: ECVision (Cognitive Vision)
Collaborations
INRIA Projects:
EMOTION (INRIA RA): Vision for autonomous robots; ParkNav, ROBEA (CNRS), theses of C. Braillon and A. Negre
ORION (Sophia): Cognitive vision (ECVision), modeling human activity
Academic:
IIHM, Laboratoire CLIPS: Human-computer interaction, smart spaces; Mapping project, IST projects GLOSS and FAME, thesis of J. Letessier
Univ. of Karlsruhe (multimodal interaction): IST FAME and CHIL
Industry:
France Telecom (Lannion and Meylan): Project HARP, thesis of M. Anne
Xerox Research Centre Europe: Project RNTL/ProAct Cont'Act
IBM Research (Prague, New York): Situation modeling, autonomic software architectures, Project CHIL
Knowledge Dissemination
[Bar chart: publications per year from 2003 to 2006, by category: journal articles, theses, conference and workshop papers, patents, and book chapters.]
Conferences and Workshops Organised
General Chairman (or co-chairman)
Conference: SoC-EuSAI 2005
Workshops: Pointing 2004, PETS 2004, Harem 2005
Program Co-Chairman
International Conference on Vision Systems, ICVS 2003,
European Symposium on Ambient Intelligence, EuSAI 2004,
International Conference on Multimodal Interaction, ICMI 2005.
Program Committee/Reviewer: UBICOMP 2003, ScaleSpace 2003, sOc 03, ICIP 03, ICCV 03, AMFG 04, ICMI 03, RFIA 2004, IAS 2004, ECCV 2004, FG 2004, ICPR 2004, CVPR 2004, ICMI 2004, EUSAI 2004, CVPR 2005, ICRA 2005, IROS 2005, Interact 2005, ICCV05, ICVS 06, PETS 05, FG06, ECCV06, CVPR06, ICPR06…
APP Registered Software
1) CAR : Robust Real-Time Detection and Tracking
APP IDDN.FR.001.350009.000.R.P.2002.0000.00000
Commercial License to BlueEyeVideo
2) BrandDetect: Detection, tracking and recognition of commercial
trademarks in broadcast video
APP IDDN.FR.450046.000.S.P.2003.000.21000
Commercial License to HSArt
3) ImaLab: Vision Software Development Tool.
Shareware, APP under preparation.
Distributed to 11 Research Laboratories in 7 EU Countries
4) Robust Tracker v3.3 (stand-alone)
5) Robust Tracker v3.4 (Autonomic)
6) Apte: Monitoring, regulation and repair of perceptual systems.
7) O3MiCID: Middleware for Intelligent Environments
Start-up: Blue Eye Video
CEO (PDG): Pierre de la Salle
Marketing: Jean Viscomte
Engineers: Stephane Richetto, Pierre-Jean Riviere, Fabien Pelisson, Sebastien Pesnel
Counselors: Dominique de Mont (HP), James L. Crowley
Incubation: INRIA Transfer, GRAIN, UJF Industrie, Region Rhône-Alpes
Creation: 1 June 2003 (winner of a national business-creation competition)
Market: Observation of human activity
Sectors: Commercial services, security, and traffic monitoring
Status: 386 K Euros in sales in 2005, >100 systems installed
Blue Eye Video Activity Sensor
(PETS 2002 Data)
[Embedded video]
Blue Eye Video Activity Sensor
(Distributed Sensor Networks)
[Embedded video]
Evolutions for 2006-2010
Context-aware interactive environments
• Adaptation and Development of Activity Models
New forms of man-machine interaction
• Affective Interaction
Real Time, View Invariant, Computer Vision
• Embedded View-invariant Visual Perception
Autonomic Architectures for Multi-Modal Perception
• Learning for Monitoring and Regulation
• Dynamic Service Composition
Automatic Adaptation and Development
of Models for Human Activity
[Diagram: a supervisor adapts the situation network in response to user feedback.]

Learning (re)actions
Learning situations
Splitting situations
Deleting obsolete situations
Learning roles

Adaptation: consistent behaviour across environments.
Development: acquisition of new abilities.
Affective interaction
Interactive objects that recognize interest and
affect and that learn to perceive and evoke
emotions in humans.
Embedded View-invariant Visual Perception
Embedded real-time, view-invariant vision on phones and PDAs (work with ST MicroSystems)
Distributed Autonomic Systems
for Multi-modal Perception
User Services
Situation Modeling
Perceptual Components
Logical Sensors, Logical Actuators
Sensors, Actuators, Communications
• Statistical Learning for Process Regulation and Monitoring
• Dynamic Service Composition
PRIMA
Perception Recognition and Integration
for Observing and Modeling Activity
James L. Crowley, Prof. I.N.P. Grenoble
Augustin Lux, Prof. I.N.P. Grenoble
Patrick Reignier, MdC. Univ. Joseph Fourier
Dominique Vaufreydaz, MdC UPMF