Real-time Processing of Media Streams: A Case for Event-based Interaction

DEBS’02
Viktor S. Wold Eide, Frank Eliassen,
Olav Lysne, and Ole-Christoffer Granmo
Department of Informatics, Oslo, Norway
{viktore,olegr}@ifi.uio.no
Simula Research Laboratory, Lysaker, Norway
{viktore,frank,olegr,olavly}@simula.no
http://www.ifi.uio.no/~dmj/
International Workshop on
Distributed Event-based Systems (DEBS’02)
July 2nd and 3rd, 2002, Vienna, Austria
1 of 20
Overall Project Goal
Address and devise solutions for an extensible framework for real-time content
analysis of media streams transported over a network.
The purpose of the content analysis is to index and annotate media streams.
The purpose of the framework is to simplify development for this application
domain, by handling the issues that are common to the domain.
2 of 20
Typical Environment for the Application Domain
[Figure: sensors, a network, and processing nodes, with streaming and control
flows between them]
3 of 20
Case: Real-time Tracking of Object in Video
[Figure: sensors, a network, and processing nodes, with streaming and control
flows between them]
4 of 20
Challenges
Massive amounts of data, even for a single compressed media stream, e.g.
video at 1 Mbps - 10 Mbps
Concurrent analysis of a number of different media streams increases resource
requirements even further
Some application subdomains have an inherently distributed nature, e.g.
office/traffic surveillance
Today's “best effort” environments (OS, network) cannot give any guarantees
with respect to the availability of resources
Real-time requirements limit the time available for processing each media
sample, e.g. 40 ms/frame for 25 frames/second video
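The numbers on this slide can be related with a little arithmetic. The sketch below computes the per-frame processing budget and the average compressed size of one frame for the quoted bit rates. The function names are illustrative and not part of the framework.

```python
def frame_budget_ms(frames_per_second):
    """Time available for processing one frame, in milliseconds."""
    return 1000.0 / frames_per_second


def bytes_per_frame(bits_per_second, frames_per_second):
    """Average compressed size of one frame, in bytes."""
    return bits_per_second / frames_per_second / 8


if __name__ == "__main__":
    fps = 25
    print(frame_budget_ms(fps), "ms per frame")
    for mbps in (1, 10):
        size = bytes_per_frame(mbps * 1_000_000, fps)
        print(mbps, "Mbps:", size, "bytes per frame on average")
```

At 25 frames/second every analysis step for a frame, including decoding, has to fit within the 40 ms budget if the application is to keep up with the stream.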
5 of 20
Challenges: Feature Extraction
Calculation of quantitative features, such as motion vectors, color histograms,
and texture coarseness, may in general be arbitrarily complex and
computationally demanding.
Object tracking case: block-based motion vector estimation.
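To make the cost of block-based motion vector estimation concrete, here is a minimal sketch of exhaustive block matching with a sum-of-absolute-differences (SAD) criterion. It is a generic textbook formulation, not the estimator used in the prototype; the function names, the block size of 4, and the search range of 2 are assumptions chosen to keep the example small.

```python
def sad(prev, curr, px, py, cx, cy, block):
    """Sum of absolute differences between a block in prev and one in curr."""
    return sum(
        abs(curr[cy + dy][cx + dx] - prev[py + dy][px + dx])
        for dy in range(block)
        for dx in range(block)
    )


def estimate_motion(prev, curr, block=4, search=2):
    """Return {(block_col, block_row): (vx, vy)} for every block in curr.

    Frames are lists of rows of grey-level integers. (vx, vy) is the
    displacement from the best matching position in prev to the block in curr.
    """
    height, width = len(curr), len(curr[0])
    vectors = {}
    for by in range(0, height - block + 1, block):
        for bx in range(0, width - block + 1, block):
            # Start from the zero vector so static blocks report no motion.
            best_cost = sad(prev, curr, bx, by, bx, by, block)
            best_vec = (0, 0)
            for vy in range(-search, search + 1):
                for vx in range(-search, search + 1):
                    px, py = bx - vx, by - vy  # candidate origin in prev
                    if not (0 <= px <= width - block and 0 <= py <= height - block):
                        continue
                    cost = sad(prev, curr, px, py, bx, by, block)
                    if cost < best_cost:
                        best_cost, best_vec = cost, (vx, vy)
            vectors[(bx // block, by // block)] = best_vec
    return vectors
```

The work grows with the number of blocks, the block area, and the square of the search range, which is why this step is a natural candidate for parallel processing.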
6 of 20
Challenges: Classification
Interpretation of extracted features is, in general, a very hard problem.
Feasibility and reliability necessitate restricting interpretation to an
application-specific context.
Object tracking case: Determine when an object enters the camera view, and then
track the center position of the moving object.
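As an illustration of the object tracking case, the sketch below decides whether an object is present and, if so, reports the centre of the moving blocks. It is a deliberately simple centroid rule standing in for a real classifier; the function name, the `min_blocks` threshold, and the input format (block position mapped to motion vector) are assumptions.

```python
def track_center(vectors, min_blocks=2):
    """Return the centre (col, row) of the moving blocks, or None.

    vectors maps (block_col, block_row) to a motion vector (vx, vy).
    An object is considered present when at least min_blocks blocks move.
    """
    moving = [pos for pos, vec in vectors.items() if vec != (0, 0)]
    if len(moving) < min_blocks:
        return None  # no object in the camera view
    center_col = sum(col for col, _ in moving) / len(moving)
    center_row = sum(row for _, row in moving) / len(moving)
    return (center_col, center_row)


if __name__ == "__main__":
    vectors = {(2, 3): (1, 0), (3, 3): (1, 0), (4, 3): (1, 0), (0, 0): (0, 0)}
    print(track_center(vectors))
```

Restricting the interpretation to "some blocks are moving, so an object is present" is an example of the application-specific context the slide refers to.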
7 of 20
Framework Goals
Support distributed processing
Support parallel processing of compute-intensive algorithms
Support concurrent processing of a number of media streams
Support adaptation, reconfiguration, and migration
Simplify integration of new sub-technologies, e.g. a new video analysis
algorithm should be pluggable into the framework when available
Develop domain-specific trainable classifiers to bridge the gap between
low-level features (e.g. motion vectors) and high-level
interpretation/annotations (e.g. moving object)
8 of 20
Content Analysis Hierarchy: Example
[Figure: four-level hierarchy of components. Levels, from top to bottom:
Classification, feature Extraction, Filtering, Streaming.]
C : Classification
E : feature Extraction
F : Filtering
S : Streaming
Arrow types: Extracted Features, Filtered media stream, Media stream
9 of 20
Content Analysis Hierarchy: Tracking of Object
[Figure: hierarchy for object tracking. Levels, from top to bottom:
Classification (OT), feature Extraction (ME), Filtering (CF), Streaming (VS).
Output: Tracked, Position=(3,3). Numbers 1 to 4 label elements of the figure.]
OT : Object Tracking
ME : Motion Estimation
CF : Color Filtering
VS : Video Streaming
Arrow types: Event Notification, Filtered media stream, Media stream
10 of 20
Content Analysis Hierarchy: Tracking of Object
[Figure: hierarchy for object tracking with parallel processing at different
levels. Levels, from top to bottom: Classification (CO and two PF),
feature Extraction (two ME), Filtering (two CF), Streaming (VS).
Output: Tracked, Position=(3,3). Numbers 1 to 4 label elements of the figure.]
CO : Coordination
PF : Particle Filtering
ME : Motion Estimation
CF : Color Filtering
VS : Video Streaming
Arrow types: Event Notification, Filtered media stream, Video Stream
11 of 20
Interaction Model Requirements
The interaction model should:
allow one-to-one, one-to-many, many-to-one, and many-to-many communication
provide a level of indirection to simplify parallelization
provide a level of indirection to simplify adaptation, reconfiguration, and
migration
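The requirements above can be illustrated with a minimal in-process publish/subscribe sketch. Publishers and subscribers only share a topic name, which is the level of indirection: either side can be replicated, replaced, or moved without the other knowing. The `EventBus` class and the topic names are hypothetical and are not the framework's API.

```python
from collections import defaultdict


class EventBus:
    """Minimal topic-based publish/subscribe service (in-process only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        """Deliver event to every subscriber of topic; return delivery count."""
        callbacks = list(self._subscribers.get(topic, ()))
        for callback in callbacks:
            callback(event)
        return len(callbacks)


if __name__ == "__main__":
    bus = EventBus()
    received = []
    # One subscriber, many publishers (many-to-one).
    bus.subscribe("motion", received.append)
    bus.publish("motion", {"source": "ME-1", "vector": (1, 0)})
    bus.publish("motion", {"source": "ME-2", "vector": (0, 0)})
    # Adding a second subscriber turns the same topic into many-to-many.
    log = []
    bus.subscribe("motion", log.append)
    bus.publish("motion", {"source": "ME-1", "vector": (2, 0)})
    print(len(received), len(log))
```

Adding a second motion estimation component in this sketch requires no change to the subscriber, which is the sense in which indirection simplifies parallelization.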
12 of 20
A Case for Event-based Interaction
These requirements fit the publish/subscribe interaction paradigm very well,
leading to an event-based model.
The event notification service should:
balance requirements for real-time communication and low event propagation
delay against ordering and reliability guarantees associated with event delivery
be realized as a distributed service to improve scalability and reliability
support different degrees of distribution - intra-process, intra-host, and
inter-host (both LAN and WAN)
take advantage of network-level multicast to improve scalability
We handle ordering and synchronization at the framework level, by assuming
a global time base across hosts (Network Time Protocol, IETF).
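One way to handle ordering above an unordered event service, given synchronized clocks, is to timestamp events at the source and release them in timestamp order after a bounded wait. The sketch below is an assumption about how this could be done, not a description of the framework; `ReorderBuffer`, `max_delay`, and the millisecond timestamps are hypothetical.

```python
import heapq
import itertools


class ReorderBuffer:
    """Release timestamped events in timestamp order after a bounded wait."""

    def __init__(self, max_delay):
        self.max_delay = max_delay          # same unit as the timestamps
        self._heap = []
        self._counter = itertools.count()   # tie-breaker for equal timestamps

    def push(self, timestamp, event):
        heapq.heappush(self._heap, (timestamp, next(self._counter), event))

    def pop_ready(self, now):
        """Return events at least max_delay old, oldest first."""
        ready = []
        while self._heap and self._heap[0][0] <= now - self.max_delay:
            timestamp, _, event = heapq.heappop(self._heap)
            ready.append((timestamp, event))
        return ready


if __name__ == "__main__":
    buffer = ReorderBuffer(max_delay=50)    # milliseconds
    buffer.push(80, "frame 2 vectors")      # arrives first
    buffer.push(40, "frame 1 vectors")      # arrives late
    print(buffer.pop_ready(now=130))
```

The wait trades propagation delay for ordering, which is the balance the slide asks the event notification service to leave to the layer above.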
13 of 20
Event-based Interaction: Object Tracking
[Figure: VS, two CF, two ME, two PF, and CO components interacting through the
event notification service (ENS). Interactions are numbered 1 to 5.]
CO : Coordination
PF : Particle Filter
ME : Motion Estimation
CF : Color Filtering
VS : Video Streaming
Arrow types: Event Notification, Filtered media stream, Video Stream
14 of 20
Deployment Example: Object Tracking
[Figure: the object tracking components (VS, two CF, two ME, two PF, and CO)
deployed across sensors, network, and processing nodes, with streaming and
control flows between them]
15 of 20
Prototype, Event Notification Service
Based on Mbus, work in progress in the IETF. Mbus has the following
characteristics:
A thin layer on top of UDP and IP multicast
Filtering in the Mbus layer of all components - “multicast and filter”
Filtering based on address, a sequence of (name, value) tuples
Provides a layer of indirection
Inherits many characteristics from IP multicast (scalability, delay, reliability,
and ordering)
Implemented as a distributed service, in network and end nodes
Easy to integrate, few lines of code, text-based protocol
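The “multicast and filter” idea can be sketched as follows: every component receives every message, and its Mbus layer keeps only those whose target address matches its own. The sketch assumes a subset rule (a message is accepted when every element of its target address also appears in the receiver's address) and a simple `name:value` text syntax. The element names (`media`, `module`, `id`) and function names are hypothetical.

```python
def parse_address(text):
    """Parse 'name:value name:value' into a set of (name, value) tuples."""
    return {tuple(element.split(":", 1)) for element in text.split()}


def accepts(own_address, target_address):
    """Accept when the target address is a subset of the receiver's address.

    An empty target address therefore matches every receiver (broadcast).
    """
    return target_address <= own_address


if __name__ == "__main__":
    me1 = parse_address("media:video module:me id:1")
    me2 = parse_address("media:video module:me id:2")
    to_all_me = parse_address("module:me")
    to_me2 = parse_address("module:me id:2")
    print(accepts(me1, to_all_me), accepts(me2, to_all_me))
    print(accepts(me1, to_me2), accepts(me2, to_me2))
```

A sender that addresses `module:me` reaches every motion estimation component without knowing how many there are, which is how address-based filtering provides the layer of indirection.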
16 of 20
Empirical Results: Scalability Test, Object Tracking
Standard PCs connected by a 100 Mbps switched Ethernet LAN were used
The protocol stack for media streaming was MJPEG/RTP/UDP/IP
Video size of 352 x 288 pixels, block size of 16 x 16 pixels, 320 blocks

                  1 CPU   2 CPUs   4 CPUs   8 CPUs   10 CPUs
Streaming         2.5     5        10       20       25
Object Tracking   2.5     5        8.5      13.5     16
Efficiency        100%    100%     85%      67.5%    64%

The number of frames/second processed by different configurations of the object
tracking application, compared to the streamed (ideal) frame rate.
Observation: When streaming at 25 f/s, depacketization and JPEG to RGB
transformation consume roughly 30% of the processing power of a single CPU.
The entire video frame is processed, not only the necessary blocks.
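The efficiency row follows directly from the other two rows: it is the processed frame rate as a percentage of the streamed frame rate. The short sketch below recomputes it from the reported figures; the variable and function names are illustrative.

```python
CPUS = [1, 2, 4, 8, 10]
STREAMED_FPS = [2.5, 5, 10, 20, 25]    # ideal frame rate per configuration
TRACKED_FPS = [2.5, 5, 8.5, 13.5, 16]  # frame rate achieved by object tracking


def efficiency_percent(tracked, streamed):
    """Processed frame rate as a percentage of the streamed frame rate."""
    return 100 * tracked / streamed


if __name__ == "__main__":
    for cpus, streamed, tracked in zip(CPUS, STREAMED_FPS, TRACKED_FPS):
        print(cpus, "CPU(s):", efficiency_percent(tracked, streamed), "%")
```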
17 of 20
Conclusion
A distributed approach is required to handle the real-time requirements, the
massive amounts of data, and the computational complexity
A distributed solution is more appropriate for problem domains having an
inherently distributed nature
An event notification service is the glue that binds components together and
provides a level of indirection
The event notification service simplifies parallelization, adaptation,
reconfiguration, and migration
The scalability of a real-time motion vector based object tracking application,
implemented in the framework, has been demonstrated experimentally.
Mbus performs well in our experiments, both as an intra-host and as an
inter-host event notification service on a LAN.
18 of 20
Further Work
Will an event notification service capable of video streaming improve scalability?
Object tracking: Each motion estimation component may then subscribe to only
a region, some blocks, of each video frame.
[Figure: VS, two CF, two ME, two PF, and CO components, all interacting through
the Event Notification Service. Interactions are numbered 1 to 5.]
CO : Coordination
PF : Particle Filter
ME : Motion Estimation
CF : Color Filtering
VS : Video Streaming
Arrow type: Event Notification
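The question on this slide can be made concrete with a sketch in which video blocks are published as events and each motion estimation component subscribes to a rectangular region of blocks. This only illustrates the idea; `BlockService`, the region format, and the callback signature are hypothetical, and nothing here reflects an existing implementation.

```python
class BlockService:
    """Deliver video blocks only to subscribers whose region contains them."""

    def __init__(self):
        self._subscriptions = []  # list of (region, callback)

    def subscribe(self, region, callback):
        """region = (first_col, first_row, last_col, last_row), inclusive."""
        self._subscriptions.append((region, callback))

    def publish_frame(self, frame_no, blocks):
        """blocks maps (col, row) to block data."""
        for (col, row), data in blocks.items():
            for (c0, r0, c1, r1), callback in self._subscriptions:
                if c0 <= col <= c1 and r0 <= row <= r1:
                    callback(frame_no, (col, row), data)


if __name__ == "__main__":
    service = BlockService()
    left, right = [], []
    service.subscribe((0, 0, 1, 1), lambda n, pos, data: left.append(pos))
    service.subscribe((2, 0, 3, 1), lambda n, pos, data: right.append(pos))
    frame = {(col, row): b"" for col in range(4) for row in range(2)}
    service.publish_frame(1, frame)
    print(len(left), len(right))
```

With two subscribers splitting the frame, each receives half of the blocks instead of the whole frame, which is the potential saving the slide asks about.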
19 of 20
Further Work
Add parallel block-based color and texture feature extractors
Assign identities to objects during classification, based on color and texture
Add a number of video streams and relate classified content, e.g. track objects
across media streams and time, based on assigned identity
Demand-driven execution, e.g. process edge blocks of video for object
detection, and then all blocks for tracking
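Demand-driven execution could be sketched as a block selection step: while no object is being tracked, only the blocks along the frame border are analysed, since an entering object must cross them; once an object is detected, all blocks are analysed. The function names and the small grid used in the example are assumptions.

```python
def all_blocks(cols, rows):
    """Every block position (col, row) in a cols x rows grid."""
    return [(col, row) for row in range(rows) for col in range(cols)]


def edge_blocks(cols, rows):
    """Only the block positions on the border of the grid."""
    return [
        (col, row)
        for col, row in all_blocks(cols, rows)
        if col in (0, cols - 1) or row in (0, rows - 1)
    ]


def blocks_to_process(cols, rows, tracking):
    """Edge blocks while waiting for an object, all blocks while tracking."""
    return all_blocks(cols, rows) if tracking else edge_blocks(cols, rows)


if __name__ == "__main__":
    print(len(blocks_to_process(4, 3, tracking=False)), "blocks for detection")
    print(len(blocks_to_process(4, 3, tracking=True)), "blocks for tracking")
```

The saving grows with frame size, because the number of border blocks grows with the perimeter of the grid while the total grows with its area.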
20 of 20