Supporting Timeliness and Accuracy in Distributed Real-time Content-based Video Analysis

ACM Multimedia 2003
Viktor S. Wold Eide (1,2), Frank Eliassen (2), Ole-Christoffer Granmo (1,2,3), and Olav Lysne (2) †
(1) University of Oslo, Norway, {viktore,olegr}@ifi.uio.no
(2) Simula Research Laboratory, Lysaker, Norway, {viktore,frank,olegr,olavly}@simula.no
(3) Agder University College, Norway, {ole.granmo}@hia.no
http://www.ifi.uio.no/~dmj/
ACM Multimedia 2003
November 2-8, 2003, Berkeley, CA, USA
† Authors are listed alphabetically
Introduction
[Figure labels: Sensors, Streaming, Network, Processing, Control]
• Due to the increasing availability of inexpensive cameras and deployment of
high-speed computer networks, it has become economically and technically
feasible to build complex distributed real-time video content analysis
applications
Introduction
• The purpose of real-time content analysis applications is to index and annotate
captured media streams on-line, as events happen
— E.g., to detect and track a running person on-line
• Some examples of applications in this domain are:
— traffic surveillance
— indexing of TV broadcast news
• Applications in this domain share many issues, which can be handled in a
general way
Challenges
[Figure labels: Application, QoS Requirements, Physical Resources]
• The video data must be analyzed:
— at least as fast as the data is made available to the application
— with an acceptable error rate
• Such Quality of Service requirements are typically mutually dependent
• The tasks of the application must be mapped to the physical resources so that
the QoS requirements are satisfied during execution
Contribution
• An architecture for distributed real-time content-based video analysis that
supports
— an explicit QoS model for this class of applications
— balancing of QoS properties against the available processing resources
— scalability at multiple logical levels of distribution
Content Analysis QoS Model
• Accuracy — maximum ratio of misclassifications to number of classifications
• Temporal Resolution — minimum temporal length of detectable events
• Latency — maximum application response time
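The three QoS dimensions above can be written down as a small data structure. The following is only a sketch: the class and field names (`QoSRequirements`, `QoSEstimate`, `satisfies`) are hypothetical, and it assumes that a configuration meets a requirement when each estimated value is at or below the corresponding bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QoSRequirements:
    """Bounds requested by the application (names are illustrative)."""
    max_error_rate: float       # accuracy: max misclassifications / classifications
    min_event_length_ms: float  # temporal resolution: shortest event that must be detectable
    max_latency_ms: float       # latency: maximum application response time


@dataclass(frozen=True)
class QoSEstimate:
    """Values estimated for one candidate configuration."""
    error_rate: float
    resolution_ms: float
    latency_ms: float


def satisfies(estimate: QoSEstimate, required: QoSRequirements) -> bool:
    """Assumed rule: every estimated value must stay within its bound."""
    return (
        estimate.error_rate <= required.max_error_rate
        and estimate.resolution_ms <= required.min_event_length_ms
        and estimate.latency_ms <= required.max_latency_ms
    )


# Hypothetical requirement, checked against two made-up estimates.
required = QoSRequirements(max_error_rate=0.05, min_event_length_ms=40, max_latency_ms=70)
slow_but_accurate = QoSEstimate(error_rate=0.03, resolution_ms=43, latency_ms=74)
fast_but_coarser = QoSEstimate(error_rate=0.05, resolution_ms=33, latency_ms=64)
```

With these numbers, `satisfies(slow_but_accurate, required)` is false because both timeliness bounds are exceeded, while `satisfies(fast_but_coarser, required)` is true.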
Content Analysis Application Model
[Figure: task graph with four logical levels]
C : Classification
E : feature Extraction
F : Filtering
S : Streaming
Edge types: Extracted Features, Filtered media stream, Media stream
• A typical content analysis application can be seen as a graph, where nodes
represent tasks and edges represent directed flows of data
• Different classes of functionality can be found at the four logical levels of the
task graph: streaming, filtering, feature extraction, and classification
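A task graph of this kind can be sketched as a plain mapping from tasks to logical levels plus a list of directed data flows. The task names (`S`, `F1`, `E1`, ...) and the helper functions below are hypothetical, chosen only for illustration; the ordering uses Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter

# The four logical levels named on the slide.
LEVELS = ("streaming", "filtering", "feature extraction", "classification")

# Hypothetical tasks and their logical levels.
tasks = {
    "S": "streaming",
    "F1": "filtering",
    "F2": "filtering",
    "E1": "feature extraction",
    "E2": "feature extraction",
    "C": "classification",
}

# Directed data flows (source task, destination task).
flows = [("S", "F1"), ("S", "F2"), ("F1", "E1"), ("F2", "E2"), ("E1", "C"), ("E2", "C")]


def execution_order(tasks, flows):
    """Return one valid order in which every task runs after its inputs."""
    predecessors = {name: set() for name in tasks}
    for src, dst in flows:
        predecessors[dst].add(src)
    return list(TopologicalSorter(predecessors).static_order())


def tasks_at(tasks, level):
    """List the tasks that belong to one logical level."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level}")
    return sorted(name for name, lvl in tasks.items() if lvl == level)
```

Grouping tasks by level, as `tasks_at` does, makes it easy to see how many parallel instances exist at each level of the graph.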
Architecture Requirements
• In order to provide some level of control of the QoS provided, the video
content analysis application must be scalable and resource aware
• A scalable architecture can generally only be obtained by adopting
distribution as its basic principle
— e.g., by parallelizing and distributing application algorithms
• In video content analysis, the relative complexity of streaming, filtering,
feature extraction, and classification depends on the application
⇓
• The architecture should support parallelization and focusing of processing
resources on any given logical level, independently of the other logical levels
• Such parallelization and distribution require a scalable interaction mechanism
• Algorithms that can be used to decide whether a QoS requirement can be
satisfied in a given processing environment are needed
Overall Architecture
[Figure labels: Application (task graph of S, F, E, and C tasks); QoS Requirements
(accuracy, temporal resolution, latency); Resource model (CPUs, links) of the
Physical Resources; ARCAMIDE; Candidate Configurations (Config 1, Config 2)]
Example Application Task Graph
[Figure: task graph with parallel processing at different levels (Streaming,
Filtering, feature Extraction, Classification); output: Tracked Position=(3,3)]
CO : Coordination
PF : Particle Filtering
ME : Motion Estimation
CF : Color Filtering
VS : Video Streaming
Edge types: Event Notification, Filtered media stream, Video Stream
Generating Candidate Configurations
[Figure: two candidate task graphs annotated with times: VS 30 ms; CF, ME, and
PF 10 ms each; CO 1 ms; further 1 ms annotations]
First candidate: Error rate: 0.03 Latency: 74 ms Resolution: 43 ms
Second candidate: Error rate: 0.05 Latency: 64 ms Resolution: 33 ms
• ARCAMIDE iteratively prunes tasks from a “brute force” task graph until either
— accuracy falls below the required level → failure
— latency/resolution requirements are met → success
• Pruning is guided by task efficiency: accuracy loss / processing cost
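The pruning loop described above can be illustrated with a greedy sketch. This is not the ARCAMIDE implementation: it assumes that each task has a fixed, additive accuracy loss and processing cost, that a single total processing budget stands in for the latency and resolution requirements, and that the task with the smallest accuracy loss per unit of processing cost is pruned first. The names `Task` and `prune` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    cost_ms: float        # processing time saved if the task is pruned
    accuracy_loss: float  # assumed increase in error rate if the task is pruned


def prune(tasks, base_error, max_error, max_cost_ms):
    """Greedy sketch: return (kept tasks, error rate, cost) or None on failure."""
    kept = list(tasks)
    error = base_error
    cost = sum(t.cost_ms for t in kept)
    while cost > max_cost_ms:  # timeliness requirement not yet met
        if not kept:
            return None  # nothing left to prune
        # Least efficient task: smallest accuracy loss per ms of processing cost.
        victim = min(kept, key=lambda t: t.accuracy_loss / t.cost_ms)
        if error + victim.accuracy_loss > max_error:
            return None  # accuracy would fall below the required level: failure
        kept.remove(victim)
        error += victim.accuracy_loss
        cost -= victim.cost_ms
    return kept, error, cost  # timeliness requirement met: success


# Made-up tasks for illustration only.
example_tasks = [
    Task("a", cost_ms=10, accuracy_loss=0.005),
    Task("b", cost_ms=10, accuracy_loss=0.05),
    Task("c", cost_ms=20, accuracy_loss=0.02),
]
```

Each pruning step yields one intermediate configuration, so running the loop against progressively tighter budgets produces a sequence of error rate and processing time tradeoffs.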
Deployment and Execution
[Figure: the tasks CO, PF, ME, CF, and VS deployed on three hosts, each with its
own ENS, OS, and HW layers, connected through the physical resources]
CO : Coordination
PF : Particle Filter
ME : Motion Estimation
CF : Color Filtering
VS : Video Streaming
ENS : Event Notification Service
OS : Operating System
HW : Hardware
• The deployed components communicate through a high-performance
distributed event notification service
— Simplifies configuration and reconfiguration
— Supports independent parallelization at different logical levels
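The decoupling that an event notification service provides can be shown with a minimal in-process publish/subscribe sketch. It is only an analogy: the service described on the slide is distributed and high-performance, whereas the class below is a hypothetical, single-process, topic-based stand-in with invented names.

```python
from collections import defaultdict


class EventNotificationService:
    """Toy topic-based broker: publishers never reference subscribers directly."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callable to receive every event published on a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        """Deliver an event to all current subscribers of the topic."""
        for callback in list(self._subscribers[topic]):
            callback(event)


# Two hypothetical motion estimation instances consume the same filtered stream.
ens = EventNotificationService()
received_by_me1, received_by_me2 = [], []
ens.subscribe("filtered-frame", received_by_me1.append)
ens.subscribe("filtered-frame", received_by_me2.append)

# The color filtering task publishes without knowing how many consumers exist.
ens.publish("filtered-frame", {"frame": 1})
```

Adding the second subscriber required no change to the publisher, which is the property that allows one logical level to be parallelized independently of the others.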
Empirical Results: Balancing Accuracy Against Timeliness
[Figure: Processing Time (axis 0 to 60) and Error Rate (axis 0 to 0.6) plotted
over horizontal axis values 1 to 37]
• The efficiency-based ARCAMIDE pruning strategy produces a fine-grained
range of error rate/processing time tradeoffs
Empirical Results: Scalability
The number of frames / second processed by different deployments of a fixed
object tracking task graph, compared to the ideal frame rate:
                                   1 CPU   2 CPUs   4 CPUs   8 CPUs   10 CPUs
Ideal Frame Rate                   2.5     5        10       20       25
Streaming                          2.5     5        10       20       25
Filtering and Feature Extraction   2.5     5        8.5      13.5     16
Classification                     2.5     5        10       20       25
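One way to read the table above is to divide each measured frame rate by the ideal frame rate for the same number of CPUs. The sketch below does that arithmetic on the reported values; the term "scaling efficiency" and the function name are introduced here for illustration and do not come from the slides.

```python
cpus = [1, 2, 4, 8, 10]
ideal = [2.5, 5, 10, 20, 25]

# Frames per second reported for each logical level.
measured = {
    "Streaming": [2.5, 5, 10, 20, 25],
    "Filtering and Feature Extraction": [2.5, 5, 8.5, 13.5, 16],
    "Classification": [2.5, 5, 10, 20, 25],
}


def scaling_efficiency(achieved, ideal_rates):
    """Fraction of the ideal frame rate achieved at each CPU count."""
    return [a / i for a, i in zip(achieved, ideal_rates)]


efficiency = {level: scaling_efficiency(rates, ideal) for level, rates in measured.items()}

for level, values in efficiency.items():
    row = ", ".join(f"{n} CPU(s): {v:.1%}" for n, v in zip(cpus, values))
    print(f"{level}: {row}")
```

Streaming and classification track the ideal rate at every CPU count, while filtering and feature extraction reach 85% of the ideal at 4 CPUs, 67.5% at 8 CPUs, and 64% at 10 CPUs.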
Conclusion
We have presented a general architecture for distributed real-time video content
analysis applications, which given:
• the application graph (components and data flow),
• the application QoS requirements (accuracy and timeliness), and
• the available physical resources (expressed in the resource model)
supports QoS-aware mapping of the application onto the physical resources
Salient features of the architecture include:
• independent scalability at multiple logical levels of distribution
— handle harder QoS requirements by utilizing additional resources
— decouple application development from QoS mapping and deployment
— realized by using an event notification service
Further Work
• Development of a more complete QoS management architecture for real-time
video content analysis applications
• The work presented here represents steps towards that goal