Real-World Distributed Computing with Ibis Henri Bal

bal@cs.vu.nl
Vrije Universiteit Amsterdam
Outline
● Distributed systems
● Distributed applications
● The Ibis distributed programming system
● Ibis in the mobile networked world
Distributed Systems: 1980s
● Networks of Workstations (NOWs)
● Collections of Workstations (COWs)
● Processor pools (Amoeba)
● Condor pools
● (Beowulf) clusters
Distributed Systems: 1990s
● Metacomputing (Smarr & Catlett, CACM)
● Flocking Condor (Epema)
● DAS (Distributed ASCI Supercomputer)
● Grid Blueprint (Foster & Kesselman)
● Desktop grids, SETI@home
DAS-3
Distributed Systems: 2000s
● Cloud computing
  ● Infrastructure as a service
  ● Virtualization
● Mobile computing
  ● Sensor networks
  ● Smart phones
The Networked World
Problem
● How to write high-performance applications for real-world distributed systems?
● How to integrate many different resources?
Our approach
● Study fundamental underlying problems
● … hand-in-hand with realistic applications
● … integrate solutions in one system: Ibis
User
Distributed Systems
Fundamental problems
● Performance – efficiency on wide-area systems
● Heterogeneity – different systems & APIs
● Malleability – resources come and go
● Fault tolerance – crashes
● Connectivity – firewalls, NAT, etc.
Applications
● Scientific applications
● Imaging (VU Medical Center, AMOLF)
● Bioinformatics (sequence analysis, cell modeling)
● Astronomy (data analysis challenge)
● Multimedia content analysis
● Games and model checking
● Semantic web (distributed reasoning)
Awards
● Astronomy: DACH 2008 – BS, DACH 2008 – FT (Cluster/Grid’08)
● Multimedia Computing: SCALE 2008 (CCGrid’08), AAAI-VC 2007
● Semantic Web (van Harmelen et al.): ISWC 2008
Multimedia content analysis
● Automatically extract information from images & video
● Extract feature vectors from images
  ● Describe properties (color, shape)
  ● Data-parallel task on a cluster
● Compute on consecutive images
  ● Task-parallelism on a grid
MMCA: ‘Most Visionary Research’ award at AAAI 2007 (Frank Seinstra et al.)
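The feature-extraction step can be illustrated with a minimal sketch. It is written in Python for brevity (Ibis itself is Java) and makes several assumptions that are not from the slides: the feature vector is a plain intensity histogram, the `color_histogram` helper and the tiny `frames` data are hypothetical, and a local thread pool stands in for the grid nodes that would each process one image.

```python
from concurrent.futures import ThreadPoolExecutor


def color_histogram(image, bins=4):
    """Feature vector: fraction of pixels falling in each intensity bin."""
    counts = [0] * bins
    for row in image:
        for value in row:                    # value is an intensity in 0..255
            counts[value * bins // 256] += 1
    total = sum(counts)
    return [c / total for c in counts]


# Two tiny hypothetical grayscale "frames" (2 x 2 pixels each).
frames = [
    [[0, 0], [255, 255]],
    [[10, 70], [130, 200]],
]

# Task parallelism: each consecutive frame is an independent task.
with ThreadPoolExecutor() as pool:
    features = list(pool.map(color_histogram, frames))
```

Each frame yields one vector, so consecutive frames can be handed to different workers without any communication between them.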
Games and Model Checking
● Can solve entire Awari game on wide-area DAS-3
  ● Needs 10G private optical network (StarPlane)
● Distributed model checking has very similar communication pattern
  ● Search huge state spaces, random work distribution, bulk asynchronous transfers
● Can efficiently run DiVinE model checker on wide-area DAS-3, use up to 1 TB memory
  ● See IPDPS’09 (May 2009)
Distributed reasoning
● MaRVIN (Frank van Harmelen et al., VU):
  ● a distributed platform for massive RDF inferencing (deductive closure)
  ● “a brain the size of a planet”
  ● Uses Ibis to run on heterogeneous systems (clusters, desktop grids)
  ● 3rd prize at Billion Triple track of Semantic Web Challenge 2008
European users
● D-Grid: Workflow engine for astronomy
● U. Erlangen: grid file system
● INRIA: ProActive on Ibis RMI
● U. Patras: Jylab scientific computing
● UPC Barcelona: Grid Superscalar
● HITACHI: Peta-scale data management
Grid’5000
Outline
● Distributed systems
● Distributed applications
● The Ibis distributed programming system
● Ibis in the mobile networked world
Ibis Philosophy
● Real-world distributed applications should be developed and compiled on a local workstation, and simply be launched from there
Ibis Approach
● Virtual Machines (Java) deal with heterogeneity
● Provide range of programming abstractions
● Designed for dynamic/faulty environments
● Easy deployment through middleware-independent programming interfaces
● Modular and flexible: can replace Ibis components by external ones
Ibis Design
● Functionality from programming languages
  ● High-Performance Application Programming System
● Functionality from operating systems
  ● Distributed Application Deployment System
Ibis System
Programming system
● Programming models:
  ● Message passing (RMI, MPJ)
  ● Divide-and-conquer (Satin)
  ● Jorus (multimedia applications)
  ● Dataflow framework (Maestro); see HPDC’09, Sat 14.30
● IPL (Ibis Portability Layer)
  ● Java-centric “run-anywhere” library
  ● Point-to-point, multicast, streaming, …
  ● Simple model (Join-Elect-Leave) for tracking resources, supports malleability & fault-tolerance
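To make the Join-Elect-Leave idea concrete, here is a minimal in-memory sketch, written in Python for brevity (Ibis itself is Java). It is not the IPL API: the `Registry` class, its method names, and the node names are all hypothetical, and a real registry would be a distributed service rather than a local object.

```python
class Registry:
    """Toy in-memory membership registry (hypothetical, not the IPL API)."""

    def __init__(self):
        self.members = []      # join order is preserved
        self.elected = {}      # election name -> winning member

    def join(self, member):
        if member not in self.members:
            self.members.append(member)

    def leave(self, member):
        # Covers both a graceful leave and a detected crash.
        if member in self.members:
            self.members.remove(member)
        # Forget elections the departed member had won, so they can be re-run.
        self.elected = {k: v for k, v in self.elected.items() if v != member}

    def elect(self, election, candidate):
        # First candidate wins; later callers learn who the winner is.
        if candidate not in self.members:
            raise ValueError(f"{candidate} has not joined")
        return self.elected.setdefault(election, candidate)


registry = Registry()
for node in ("node-a", "node-b", "node-c"):
    registry.join(node)

master = registry.elect("master", "node-b")      # node-b asks first and wins
same = registry.elect("master", "node-c")        # node-c is told node-b won

registry.leave("node-b")                         # the master disappears
new_master = registry.elect("master", "node-c")  # the election can be repeated
```

Under these assumptions, malleability is handled by `join` and `leave` at any time, and fault tolerance by re-running an election once its winner is gone.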
SmartSockets library
● Detects connectivity problems
● Tries to solve them automatically
  ● With as little help from the user as possible
● Integrates existing and several new solutions
  ● Reverse connection setup, STUN, TCP splicing, SSH tunneling, smart addressing, etc.
● Uses network of hubs as a side channel
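Reverse connection setup, one of the techniques listed above, can be sketched on a single machine. The sketch is in Python for brevity (SmartSockets itself is a Java library) and does not use the SmartSockets API: a `queue.Queue` stands in for the hub side channel, and the function names are hypothetical. The point is only that the machine that cannot accept inbound connections dials out instead.

```python
import queue
import socket
import threading

hub = queue.Queue()   # stands in for the hub network used as a side channel


def firewalled_server():
    """Cannot accept inbound connections, so it dials out when asked."""
    host, port = hub.get(timeout=5)             # request relayed by the hub
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(b"hello from behind the firewall")


def client():
    """Wants to reach the server; listens and asks for a reverse connect."""
    with socket.socket() as listener:
        listener.settimeout(5)
        listener.bind(("127.0.0.1", 0))         # any free local port
        listener.listen(1)
        hub.put(listener.getsockname())         # "please connect back to me"
        conn, _ = listener.accept()
        with conn:
            data = b""
            while chunk := conn.recv(1024):     # read until the peer closes
                data += chunk
            return data


server_thread = threading.Thread(target=firewalled_server)
server_thread.start()
message = client()
server_thread.join()
```

Once the connection exists, data flows over it in both directions as usual; only the direction of the initial setup is reversed.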
Example
Example
Deployment system
● IbisDeploy GUI
● JavaGAT:
  ● Java Grid Application Toolkit
  ● Make applications independent of underlying middleware
● Zorilla P2P system
  ● Job management, gossiping, clustering, flood scheduling
JavaGAT
Grid Application
File.copy(...): cp, ftp, gridftp, scp, http
submitJob(...): fork, pbs, condor, unicore, globus
● For simple grid operations there are many ways of implementing them
  ● Different middleware available on different sites
JavaGAT
Grid Application
File.copy(...): cp, ftp, gridftp, scp, or http?
submitJob(...): fork, pbs, condor, unicore, or globus?
● Which should you use?
  ● Some may not be available on all sites
  ● Many combinations
JavaGAT
Grid Application: File.copy(...), submitJob(...)
GAT: Remote Files, Monitoring, Info service, Resource Management
GAT Engine: GridLab, Globus, gridftp, Unicore, SSH, P2P, Local, globus
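One way an engine could hide the choice of middleware from the application is to try the registered backends in turn until one succeeds. The sketch below is in Python for brevity (JavaGAT itself is Java) and is not the JavaGAT API: the `Engine` class, the `AdaptorUnavailable` exception, the backend functions, and the file names are all hypothetical, and whether the real GAT Engine dispatches this way is an assumption.

```python
class AdaptorUnavailable(Exception):
    """Raised by a backend that cannot handle the request on this site."""


class Engine:
    """Toy dispatcher (hypothetical; not the JavaGAT API)."""

    def __init__(self):
        self.adaptors = {}   # operation name -> list of (backend, function)

    def register(self, operation, backend, function):
        self.adaptors.setdefault(operation, []).append((backend, function))

    def invoke(self, operation, *args):
        errors = {}
        for backend, function in self.adaptors.get(operation, []):
            try:
                return backend, function(*args)
            except AdaptorUnavailable as err:
                errors[backend] = str(err)       # remember why it failed
        raise RuntimeError(f"{operation} failed on all backends: {errors}")


def gridftp_copy(src, dst):
    # Pretend this site has no gridftp installation.
    raise AdaptorUnavailable("gridftp is not installed on this site")


def scp_copy(src, dst):
    # Pretend the copy succeeded.
    return f"copied {src} to {dst}"


engine = Engine()
engine.register("copy", "gridftp", gridftp_copy)
engine.register("copy", "scp", scp_copy)

backend, result = engine.invoke("copy", "input.dat", "remote:/tmp/input.dat")
```

The application only names the operation; which backend ends up serving it depends on what each site offers.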
Multimedia Content Analysis
Client, Broker, Servers, Ibis (Java)
● Runs simultaneously on clusters (DAS-3, Japan, Australia), Desktop Grid, Amazon EC2 Cloud
● Connectivity problems solved automatically by Ibis SmartSockets
Connection management
Standard sockets: only local VU machines can be reached due to firewall problems
With SmartSockets: run everywhere
Ibis movie (part 1)
Performance on 1 DAS-3 cluster
● Relative speedups of Java/Ibis and C++/MPI
  ● Using TCP or Myricom’s MX protocol
● Sequential performance Java: 80% of C++
Speedup (wide-area)
● Homogeneous wide-area systems (DAS-3):
  ● Frame rate increases linearly with #clusters
● World-wide experiment:
  ● 24 frames per second
  ● Speed limited by camera, not computing infrastructure
Outline
● Distributed systems
● Distributed applications
● The Ibis distributed programming system
● Ibis in the mobile networked world
Smart Phones
● GSM + PC + GPS + camera + networks + …
● Location-aware
● What if everyone always carries a smart phone (like a GSM now)?
● Next wave in computing?
Ibis on Smart Phones
● Our focus: distributed smart phone applications
  ● Applications running on multiple phones
  ● Integration with distributed computing backbone
● Use Android for development
  ● Google’s open-source platform
  ● Java-based
Distributed applications
● Disaster management (Katrina)
  ● Use ad-hoc Wifi network when GSM network fails
  ● Finding nearby people with certain skills
    ● Bus drivers, CPR
  ● Distributed decision support
    ● Moving people to shelters (logistics)
● Social networks
  ● Similar issues
  ● Find nearby friends, decide on restaurant
Wild example
● Track position => automatic diary of your life
● Cross-comparisons between diaries
“Haven’t we met before?”
“Yes, on 23 Oct 2010, 3.48 pm at N 52°22.688´ E 004°53.990´”
eyeDentify
● Object recognition on a G1 smartphone
● Smartphone is a limited device:
  ● Can run only 64 x 48 pixels (memory bound)
  ● 1024 x 768 pixels would take 5 minutes
● Distributed Ibis version:
  ● 1024 x 768 pixels = 2.0 seconds
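The eyeDentify figures can be put side by side with a little arithmetic. The short Python calculation below uses only the numbers on the slide (64 x 48 and 1024 x 768 pixels, 5 minutes, 2.0 seconds); the variable names are ours, and the 5-minute figure is the slide's estimate for the phone alone rather than a measured run.

```python
full_pixels = 1024 * 768         # frame size handled by the distributed version
phone_pixels = 64 * 48           # largest frame the phone handles on its own
pixel_ratio = full_pixels / phone_pixels

phone_seconds = 5 * 60           # "would take 5 minutes" on the phone alone
distributed_seconds = 2.0        # distributed Ibis version
speedup = phone_seconds / distributed_seconds

print(f"{pixel_ratio:.0f}x more pixels, {speedup:.0f}x faster")
# prints "256x more pixels, 150x faster"
```

So the distributed version handles frames with 256 times as many pixels as the phone can process alone, and does so about 150 times faster than the estimated on-phone time for the full-size frame.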
Ibis movie (part 2)
Interdroid
Novel Mobile Distributed Applications
Data Management
Distributed Communication
Context Sensitive Programming Models
Current work
● Raven: API for Viable Episodic Networking
● Decentralized synchronization API
● Fine-grained control over data sharing
● Bluetooth support for ad-hoc communication
● Discovery of devices using multiple networks
● Context Aware Programming Models
● Supporting distributed decision making
● Representing and using context (location etc.)
● Exploiting social relationships (Hyves, Facebook)
Summary
● Ibis provides integrated solutions for many hard problems
  ● performance, heterogeneity, malleability, fault tolerance, connectivity
● It combines functionality from programming languages and operating systems
● Used for many applications on real-world distributed systems
● Download from http://www.cs.vu.nl/ibis/
Acknowledgements
Niels Drost
Ceriel Jacobs
Roelof Kemp
Timo van Kessel
Thilo Kielmann
Jason Maassen
Rob van Nieuwpoort
Nick Palmer
Kees van Reeuwijk
Frank J. Seinstra
Kees Verstoep
Gosia Wrzesinska
Questions?
DAS-3