High-Performance Distributed Multimedia Computing Frank Seinstra, Jan-Mark Geusebroek

Intelligent Systems Lab Amsterdam
Informatics Institute
University of Amsterdam
MultimediaN (BSIK Project)
MultimediaN and DAS-3
MultimediaN and high-performance computing
Van Essen et al., Science 255, 1999.
A Real Problem, part 1…

News Broadcast - September 21, 2005 (see video1.wmv)
Automatic analysis?

Police investigated over 80,000 (!) CCTV recordings
First match found no earlier than 2.5 months after the July 7 attacks
Image/Video Content Analysis

Lots of research + benchmark evaluations:
– PASCAL-VOC (10,000+ images), TRECVID (200+ hours of video)

A problem of scale:
– At least 30-50 hours of processing time per hour of video!

Beeld&Geluid => 20,000 hours of TV broadcasts per year
NASA => over 850 GB of hyper-spectral image data per day
London Underground => over 120,000 years of processing … !!!
High Performance Computing

Solution:
– Very, very large scale parallel and distributed computing

New Problem:
– Very, very complicated software
Solution (since 1998): “Parallel-Horus”
– A tool to make parallel & distributed computing transparent to the user
– User: familiar programming, easy execution
– Targets: Beowulf-type clusters and wide-area Grid systems
Parallel-Horus: Features (1)

Sequential programming:
– Sequential API maps onto Parallel-Horus parallelizable patterns
– +/- 18 patterns, implemented on top of MPI
Seinstra et al., Parallel Computing, 28(7-8):967-993, August 2002
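To illustrate the idea of a "parallelizable pattern", here is a hedged sketch of the simplest one: a per-pixel image operation executed as scatter, local operation, gather. The real Parallel-Horus patterns run on top of MPI; in this sketch the "nodes" are plain Python lists so the data flow is easy to follow, and all names (`scatter`, `gather`, `parallel_image_op`) are illustrative, not the actual API.

```python
# Illustrative sketch of one data-parallel pattern: scatter -> op -> gather.
# "Nodes" are simulated as list slices; no real communication happens here.

def scatter(image, n_nodes):
    """Split an image (a list of rows) into roughly equal row blocks."""
    step = (len(image) + n_nodes - 1) // n_nodes
    return [image[i:i + step] for i in range(0, len(image), step)]

def gather(blocks):
    """Reassemble row blocks into a full image."""
    return [row for block in blocks for row in block]

def parallel_image_op(image, pixel_op, n_nodes=4):
    """The pattern: each node applies pixel_op to its local block only."""
    blocks = scatter(image, n_nodes)
    processed = [[[pixel_op(p) for p in row] for row in block]
                 for block in blocks]
    return gather(processed)

image = [[1, 2], [3, 4], [5, 6], [7, 8]]
print(parallel_image_op(image, lambda p: p * 2))
# -> [[2, 4], [6, 8], [10, 12], [14, 16]]
```

Because a per-pixel operation needs no neighbor data, the pattern needs no communication beyond the initial scatter and final gather; other patterns (e.g. for filters or global reductions) add border exchanges or reduce steps.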
Parallel-Horus: Features (2)

Lazy Parallelization:
Don’t do this: Scatter → ImageOp → Gather → Scatter → ImageOp → Gather → Scatter → ImageOp → Gather
Do this: Scatter → ImageOp → ImageOp → ImageOp → Gather
– Avoid communication
Seinstra et al., IEEE Trans. Par. Dist. Syst., 15(10):865-877, October 2004
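The lazy-parallelization idea above can be sketched as a tiny state machine: track whether the image currently lives on the compute nodes ("distributed") or on the host ("central"), and emit a scatter or gather only when the required state actually changes. This is a minimal illustration of the principle, not the Parallel-Horus implementation; all names are hypothetical.

```python
# Hedged sketch of lazy parallelization: redundant scatter/gather pairs
# between consecutive parallel image operations are never emitted.

class LazyImage:
    def __init__(self):
        self.state = "central"   # where the data currently resides
        self.trace = []          # operations actually executed

    def _ensure(self, wanted):
        """Emit communication only on an actual state change."""
        if self.state != wanted:
            self.trace.append("scatter" if wanted == "distributed" else "gather")
            self.state = wanted

    def image_op(self, name):
        self._ensure("distributed")   # parallel ops need distributed data
        self.trace.append(name)

    def result(self):
        self._ensure("central")       # final result is gathered once
        return self.trace

img = LazyImage()
for op in ["blur", "threshold", "label"]:
    img.image_op(op)
print(img.result())
# -> ['scatter', 'blur', 'threshold', 'label', 'gather']
```

Three operations cost one scatter and one gather instead of three of each, which is exactly the "avoid communication" transformation pictured on the slide.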
Extensions for Distributed Computing

Wide-Area Multimedia Services:
– A Parallel-Horus client coordinates multiple Parallel-Horus servers

Open questions:
– User transparency?
– Abstractions & techniques?
– Grid connectivity problems?
Color-Based Object Recognition (1)
Our Solution:
– Place ‘retina’ over input image
– Each of 37 ‘retinal areas’ serves as a ‘receptive field’
– For each receptive field:
   – Obtain set of local histograms, invariant to shading / lighting
   – Estimate Weibull parameters β and γ for each histogram
– Hence: scene description by a set of 37 x 4 x 3 = 444 parameters
Geusebroek, British Machine Vision Conference, 2006.
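The Weibull-fitting step above can be sketched with a generic maximum-likelihood estimator: solve the shape (γ) equation by bisection, then the scale (β) follows in closed form. This is a textbook MLE sketch under the assumption of raw positive samples, not the exact estimator used in the cited paper; the function name and defaults are illustrative.

```python
import math
import random

def fit_weibull(samples, iters=60):
    """Estimate Weibull shape and scale by maximum likelihood.
    The shape equation g(k) = 0 is monotone in k, so bisection suffices."""
    logs = [math.log(x) for x in samples]
    mean_log = sum(logs) / len(logs)

    def g(k):  # MLE condition for the shape parameter k
        xk = [x ** k for x in samples]
        return sum(x * l for x, l in zip(xk, logs)) / sum(xk) - 1.0 / k - mean_log

    lo, hi = 0.01, 50.0
    for _ in range(iters):          # bisection: g is increasing in k
        mid = (lo + hi) / 2.0
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    shape = (lo + hi) / 2.0
    # given the shape, the MLE scale has a closed form
    scale = (sum(x ** shape for x in samples) / len(samples)) ** (1.0 / shape)
    return shape, scale

random.seed(0)
# synthetic Weibull(shape=2, scale=1) samples via inverse-transform sampling
data = [(-math.log(1.0 - random.random())) ** 0.5 for _ in range(5000)]
shape, scale = fit_weibull(data)
print(shape, scale)   # both should be close to 2.0 and 1.0
```

In the recognition pipeline such a fit would run once per local histogram, compressing each histogram to the two parameters that enter the 444-dimensional scene description.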
Color-Based Object Recognition (2)

Learning phase:
– Set of 444 parameters is stored in a database
– So: learning from 1 example, under a single visual setting (“a hedgehog”)

Recognition phase:
– Validation by showing objects under at least 50 different conditions:
   – Lighting direction
   – Lighting color
   – Viewing position
Amsterdam Library of Object Images (ALOI)

In laboratory setting:
– 300 objects correctly recognized under all (!) visual conditions
– 700 remaining objects ‘missed’ under extreme conditions only
Geusebroek et al., Int. J. Comput. Vis., 61(1):103-112, January 2005
Example: Object Recognition
See also: http://www.science.uva.nl/~fjseins/aibo.html
Example: Object Recognition
(see video2.wmv)
Demonstrated live (a.o.) at ECCV 2006, June 8-11, 2006, Graz, Austria
Performance / Speedup on DAS-2
[Figure: speedup curves vs. number of CPUs, with linear and client-side speedup plotted.
Left: single cluster, client side speedup (up to 64 CPUs).
Right: four clusters, client side speedup (up to 96 CPUs).]
Recognition on single machine: +/- 30 seconds
Using multiple clusters: up to 10 frames per second
Insightful: even ‘distant’ clusters can be used effectively for close to ‘real-time’ recognition
Current & Future Work

Very Large-Scale Distributed Multimedia Computing:
– Overcome practical annoyances:
   – Software portability, firewall circumvention, authentication, …
– Optimization and efficiency:
   – Tolerance to dynamic Grid circumstances, …
   – Systematic integration of MM-domain-specific knowledge, …
– Deal with non-trivial communication patterns:
   – Heavy intra- & inter-cluster communication, …
– Reach the end users:
   – Programming models, execution scenarios, …

Collaboration with VU (Prof. Henri Bal) & GridLab:
– Ibis: www.cs.vu.nl/ibis/
– Grid Application Toolkit: www.gridlab.org
Conclusions




Effective integration of results from two largely distinct research fields
Ease of programming => quick solutions
With DAS-3 / StarPlane we can start to take on much more complicated problems
But most of all:
– DAS-3 is very significant for future MM research
The End
(see video3.avi)