IDMS−PROMS 2002
Scalable Independent Multi-level Distribution in
Multimedia Content Analysis
Viktor S. Wold Eide, Frank Eliassen,
Ole-Christoffer Granmo, and Olav Lysne
Department of Informatics, Oslo, Norway
{viktore,olegr}@ifi.uio.no
Simula Research Laboratory, Lysaker, Norway
{viktore,frank,olegr,olavly}@simula.no
http://www.ifi.uio.no/~dmj/
Joint International Workshop on
Interactive Distributed Multimedia Systems /
Protocols for Multimedia Systems
IDMS-PROMS 2002
November 26-29, 2002, Coimbra, Portugal
Authors are listed alphabetically
Outline
Introduction
Content analysis in general, and an object tracking application
Scalability challenges
Component interaction and communication
Feature extraction
Classification
Empirical results: scalability test of the object tracking application
Conclusion and further work
Introduction
Our application domain is automatic real-time content analysis
The purpose of the content analysis is to index and annotate media streams
Some examples of applications in this domain are:
— traffic surveillance,
— indexing of TV broadcast news, and
— object tracking, the application case in this paper
Applications in this domain share many issues, which may be handled
generally
Our overall project goal is to: “Address and devise solutions for an extensible
framework for real-time content analysis of media streams transported over a
network”
Content Analysis Hierarchy
In general, content analysis applications consist of several levels
[Figure: the levels of the hierarchy, from top to bottom: classification, feature extraction, filtering, streaming]
Content Analysis Hierarchy: Object Tracking
The functional decomposition of the object tracking application
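[Figure: the object tracking hierarchy, from top to bottom: OT (classification), ME (feature extraction), CF (filtering), VS (streaming). The output is the tracked position, e.g. (3,3), on a grid whose rows and columns are numbered 1 to 4.]
OT: Object Tracking
ME: Motion Estimation
CF: Color Filtering
VS: Video Streaming
Arrows denote event notifications, the filtered media stream, and the media stream.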
Challenges
The processing resource requirements for multimedia content analysis are
very challenging, and will most likely remain so in the near future
A scalable solution requires parallel and distributed processing on multiple
CPUs
In multimedia content analysis applications, parallelization and distribution
are difficult tasks
The relative computational complexity of streaming, filtering/transformation,
feature extraction, and classification may vary
A processing bottleneck at any level may render the application useless,
unless the processing bottleneck can be resolved
Content Analysis Hierarchy: Object Tracking
A configuration where only the classification level is parallelized
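[Figure: the object tracking hierarchy with only the classification level parallelized: a CO component above two PF components (classification), fed by a single ME (feature extraction), a single CF (filtering), and VS (streaming). The output is the tracked position, e.g. (3,3).]
CO: Coordination
PF: Particle Filtering
ME: Motion Estimation
CF: Color Filtering
VS: Video Streaming
Arrows denote event notifications, the filtered media stream, and the video stream.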
Content Analysis Hierarchy: Object Tracking
A configuration where several levels are parallelized
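[Figure: the object tracking hierarchy with several levels parallelized: a CO component above two PF components (classification), two ME components (feature extraction) that each cover part of the block grid (labeled 1 2 and 3 4), two CF components (filtering), and VS (streaming). The output is the tracked position, e.g. (3,3).]
CO: Coordination
PF: Particle Filtering
ME: Motion Estimation
CF: Color Filtering
VS: Video Streaming
Arrows denote event notifications, the filtered media stream, and the video stream.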
Event-based Interaction: Object Tracking
PF components subscribe to events from the event notification service, ENS:
src=vs1 func=me
ME components publish motion vectors for blocks as event notifications:
src=vs1 func=me time=[t, t] block=[1,1] vector=[ 0,0] ...
src=vs1 func=me time=[t, t] block=[3,2] vector=[-1,0] ...
src=vs1 func=me time=[t, t] block=[4,4] vector=[ 0,0]
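[Figure: VS, two CF, two ME, two PF, and one CO component, with the ENS between them; the interactions are numbered 1 to 5.]
CO: Coordination
PF: Particle Filter
ME: Motion Estimation
CF: Color Filtering
VS: Video Streaming
Arrows denote event notifications, the filtered media stream, and the video stream.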
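The subscription and notifications on this slide can be read as content-based publish/subscribe: a notification is a set of attribute/value pairs, and a subscription is a filter over those attributes. The following is a minimal sketch under that reading, not the ENS used in the framework. The `Broker` class, its `subscribe` and `publish` methods, and the use of plain dictionaries are assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Tuple

Notification = Dict[str, object]
Filter = Dict[str, object]


def matches(subscription: Filter, notification: Notification) -> bool:
    """A notification matches if it carries every attribute value the filter names."""
    return all(notification.get(key) == value for key, value in subscription.items())


class Broker:
    """Hypothetical in-process stand-in for an event notification service."""

    def __init__(self) -> None:
        self._subscriptions: List[Tuple[Filter, Callable[[Notification], None]]] = []

    def subscribe(self, subscription: Filter,
                  callback: Callable[[Notification], None]) -> None:
        self._subscriptions.append((dict(subscription), callback))

    def publish(self, notification: Notification) -> None:
        # Deliver the notification to every subscriber whose filter matches.
        for subscription, callback in self._subscriptions:
            if matches(subscription, notification):
                callback(notification)


received: List[Notification] = []
broker = Broker()

# A PF component subscribes to motion estimation results for stream vs1.
broker.subscribe({"src": "vs1", "func": "me"}, received.append)

# An ME component publishes the motion vector of one block.
broker.publish({"src": "vs1", "func": "me", "block": (3, 2), "vector": (-1, 0)})
# A notification from a different function is not delivered to the PF.
broker.publish({"src": "vs1", "func": "cf", "block": (3, 2)})
```

Because publishers and subscribers only share attribute names, neither side needs to know how many components exist on the other side or where they run.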
Feature Extraction
Even simple feature extraction algorithms are costly when applied to a
real-time high quality video stream. Additionally, feature extraction
algorithms may be arbitrarily complex
A scalable solution requires parallelization, which in turn requires
partitioning of the media data
Our framework allows spatial partitioning, by using a block based approach.
Each feature extraction component processes only some blocks
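As an illustration of block based spatial partitioning, the sketch below assigns the blocks of a frame to feature extraction components in contiguous column strips. The strip layout, the `(column, row)` block coordinates, and the function name `partition_by_columns` are assumptions of this sketch; the slides do not say how the framework assigns blocks to components.

```python
from typing import Dict, List, Tuple

Block = Tuple[int, int]  # (column, row), numbered from 1 (an assumed convention)


def partition_by_columns(columns: int, rows: int,
                         components: int) -> Dict[int, List[Block]]:
    """Assign each block column to one component, in contiguous strips."""
    if not 1 <= components <= columns:
        raise ValueError("need between 1 and `columns` components")
    assignment: Dict[int, List[Block]] = {c: [] for c in range(components)}
    for column in range(1, columns + 1):
        # Integer arithmetic spreads the columns as evenly as possible.
        owner = (column - 1) * components // columns
        for row in range(1, rows + 1):
            assignment[owner].append((column, row))
    return assignment


# Two hypothetical ME components sharing a 4 x 4 block grid:
# component 0 gets columns 1 and 2, component 1 gets columns 3 and 4.
assignment = partition_by_columns(columns=4, rows=4, components=2)
```

Each component then runs its feature extraction on its own blocks only, so adding components reduces the work per component.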
Classification
The classification level may become a processing bottleneck due to:
— the complexity of the content analysis task
— the required classification rate
— the required classification accuracy
Accordingly, a scalable solution requires parallel and distributed classification
[Figure: a classifier taking texture, motion vector, and color features, shown with images n-2, n-1, and n]
Classification: The Particle Filter
Our PF maintains a set of histories of high-level concepts (e.g. object positions), each paired with a likelihood:
— a history is an assignment of high-level concepts to past video frames
— its likelihood is the likelihood of that history, given the extracted features
Alternative histories are maintained to handle noise and uncertainty
[Figure: images n-2, n-1, and n]
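To make the idea of weighted alternative histories concrete, here is a generic bootstrap particle filter for a one-dimensional position. It is a sketch under stated assumptions, not the authors' PF: the random-walk motion model, the Gaussian observation model, the `noise` parameter, and the function name `step` are all chosen for illustration.

```python
import math
import random
from typing import List

History = List[int]  # one hypothesised object position per past frame


def step(histories: List[History], observation: float,
         rng: random.Random, noise: float = 1.0) -> List[History]:
    """Extend every history by one frame, weight it, and resample."""
    # Predict: each history proposes a next position (random walk, assumed).
    extended = [h + [h[-1] + rng.choice((-1, 0, 1))] for h in histories]
    # Weight: likelihood of the newest position given the observed feature
    # (Gaussian observation model, an assumption of this sketch).
    weights = [math.exp(-((h[-1] - observation) ** 2) / (2 * noise ** 2))
               for h in extended]
    # Resample: likely histories are duplicated, unlikely ones die out.
    resampled = rng.choices(extended, weights=weights, k=len(extended))
    return [list(h) for h in resampled]


rng = random.Random(0)
histories = [[0] for _ in range(200)]
for observation in [1, 2, 3, 4, 5]:  # feature values for an object moving right
    histories = step(histories, observation, rng)

# Position estimate for the latest frame: the mean over surviving histories.
estimate = sum(h[-1] for h in histories) / len(histories)
```

Keeping many histories alive lets a single noisy observation lower the weight of the correct history without eliminating it.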
Classification: A Parallel Particle Filter
We propose a parallel PF for resolving classification processing bottlenecks
Our parallel PF consists of multiple parallel PF components and a single
light-weight coordinator component
Each PF component maintains local histories of high-level concepts
The PF components cooperate by exchanging event notifications to
synchronize histories
The coordinator makes globally consistent classifications based on the local
histories of the PF components
[Figure: images n-2, n-1, and n]
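One way a light-weight coordinator could combine local histories is sketched below: each PF component reports a small summary of its weighted histories, and the coordinator merges the summaries into a single estimate. The slides do not specify the combination rule, so the weighted mean used here, the `(likelihood, positions)` representation, and the names `local_summary` and `coordinate` are assumptions.

```python
from typing import List, Tuple

WeightedHistory = Tuple[float, List[int]]  # (likelihood, position per frame)


def local_summary(histories: List[WeightedHistory]) -> Tuple[float, float]:
    """Summary one PF component could publish: total weight, weighted position sum."""
    total = sum(weight for weight, _ in histories)
    weighted = sum(weight * positions[-1] for weight, positions in histories)
    return total, weighted


def coordinate(summaries: List[Tuple[float, float]]) -> float:
    """Merge the local summaries into one global position estimate."""
    total = sum(weight for weight, _ in summaries)
    if total == 0:
        raise ValueError("no history has non-zero likelihood")
    return sum(weighted for _, weighted in summaries) / total


# Two hypothetical PF components with their local weighted histories.
pf_a = [(0.6, [1, 2, 3]), (0.2, [1, 2, 2])]
pf_b = [(0.2, [1, 3, 4])]
estimate = coordinate([local_summary(pf_a), local_summary(pf_b)])
```

The coordinator only sees two numbers per component, which keeps it cheap compared with the PF components themselves.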
Empirical Results: Scalability Test, Object Tracking
Used standard PCs connected by 100 Mbps switched Ethernet LAN
The protocol stack for media streaming was MJPEG/RTP/UDP/IP multicast
Video size of 352 x 288 pixels, block size of 16 x 16 pixels, 320 blocks
The number of frames / second processed by different configurations of the
object tracking application, compared to the ideal frame rate:
                                  1 CPU  2 CPUs  4 CPUs  8 CPUs  10 CPUs
Ideal Frame Rate                    2.5       5      10      20       25
Streaming                           2.5       5      10      20       25
Filtering and Feature Extraction    2.5       5     8.5    13.5       16
Classification                      2.5       5      10      20       25
Observation: When streaming at 25 f/s, depacketization and JPEG to RGB
transformation consume roughly 30% of the processing power of a single CPU.
The entire video frame is processed, not only the necessary blocks
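The table can be summarized as speedup and parallel efficiency. The short calculation below uses only the frame rates from the table; the definitions (speedup as the rate relative to 1 CPU, efficiency as speedup divided by the number of CPUs) are the standard ones, and the function names are illustrative.

```python
from typing import List

cpus = [1, 2, 4, 8, 10]
ideal = [2.5, 5, 10, 20, 25]        # frames / second
measured = [2.5, 5, 8.5, 13.5, 16]  # filtering and feature extraction


def speedup(rates: List[float]) -> List[float]:
    """Frame rate relative to the single-CPU configuration."""
    return [rate / rates[0] for rate in rates]


def efficiency(rates: List[float], cpu_counts: List[int]) -> List[float]:
    """Speedup divided by the number of CPUs used."""
    return [s / n for s, n in zip(speedup(rates), cpu_counts)]


ideal_efficiency = efficiency(ideal, cpus)
measured_efficiency = efficiency(measured, cpus)
```

For the filtering and feature extraction row this gives efficiencies of 1.0, 1.0, 0.85, 0.675, and 0.64 for 1, 2, 4, 8, and 10 CPUs, while the streaming and classification rows stay at 1.0 throughout.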
Conclusion
Event-based interaction simplifies parallelization
— Provides a level of indirection, location transparency, etc.
Each level of the content analysis task may be independently parallelized
— Allows for focusing the processing resources on the processing bottlenecks
The parallel particle filter is well suited for real-time classification
— Allows distributed processing at the classification level
— Is robust, i.e. able to suppress noise in extracted features
The scalability of a real-time motion vector based object tracking application,
implemented in the framework, has been demonstrated experimentally
Further Work
Assign identity to objects during classification, based on color and texture
Add parallel block based color and texture feature extractors
Add a number of video streams and relate classified content, e.g. track objects
across media streams and time, based on assigned identity
Add demand driven feature extraction - the features are ranked on-line
according to their ability to contribute to the current stage of the content
analysis task
— E.g. the edge blocks are processed for object detection, and the blocks
surrounding objects are processed for tracking purposes
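The example of edge blocks and blocks surrounding objects can be sketched as a block selection step. This covers only the choice of blocks, not the on-line ranking of features, which is described as future work. The one-block neighbourhood, the `(column, row)` coordinates, and all function names are assumptions of this sketch.

```python
from typing import Iterable, Set, Tuple

Block = Tuple[int, int]  # (column, row), numbered from 1 (an assumed convention)


def edge_blocks(columns: int, rows: int) -> Set[Block]:
    """Blocks on the frame border, where new objects would first appear."""
    return {(c, r)
            for c in range(1, columns + 1)
            for r in range(1, rows + 1)
            if c in (1, columns) or r in (1, rows)}


def neighbourhood(block: Block, columns: int, rows: int) -> Set[Block]:
    """The block itself and its direct neighbours, clipped to the frame."""
    c0, r0 = block
    return {(c, r)
            for c in range(max(1, c0 - 1), min(columns, c0 + 1) + 1)
            for r in range(max(1, r0 - 1), min(rows, r0 + 1) + 1)}


def blocks_to_process(columns: int, rows: int,
                      tracked: Iterable[Block]) -> Set[Block]:
    """Edge blocks for detection plus blocks around each tracked object."""
    selected = edge_blocks(columns, rows)
    for block in tracked:
        selected |= neighbourhood(block, columns, rows)
    return selected


# A 6 x 6 block grid with one object currently tracked at block (3, 3).
selected = blocks_to_process(columns=6, rows=6, tracked=[(3, 3)])
```

In this example 29 of the 36 blocks are selected, so interior blocks far from any tracked object are skipped.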
Further Work
We are currently working on an event notification service for high data rates
Object tracking case: Each motion estimation component may subscribe to
only some blocks of each video frame
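[Figure: VS, two CF, two ME, two PF, and one CO component connected through the ENS by event notifications; the interactions are numbered 1 to 5.]
CO: Coordination
PF: Particle Filter
ME: Motion Estimation
CF: Color Filtering
VS: Video Streaming
Arrows denote event notifications.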