Bullet: High Bandwidth Data Dissemination Using an Overlay Mesh

Introduction


Given a sender and a large set of
receivers spread across the Internet,
how can we maximize the bandwidth
delivered to the receivers?
Problem domain:


Software or video distribution
Real-time multimedia streaming
Existing Solutions


IP multicast does not consider
bandwidth when constructing its
distribution tree
A promising alternative: overlay multicast


Attempts to mimic the multicast routing
trees
Instead of high-speed routers, use
programmable end hosts as interior nodes
in the overlay tree
Existing Solutions

A tree structure is problematic



Bandwidth can only decrease moving down
a tree
Any loss high up the tree will reduce the
bandwidth lower down the tree
Bandwidth of a node limited by its single
parent
A New Approach


Transmit disjoint data sets to various
points in the network
A node downloads from multiple sources
rather than a single parent

Higher reliability
Conceptual Model
[Figure: Root and nodes A and B, with three links labeled 1 Mbps]
Conventional Model
[Figure: the same topology, shown as animation frames of tree-based delivery]
Bullet
[Figure: the same topology, shown as animation frames of Bullet's mesh-based delivery]
Bullet Properties

TCP friendly
Must respond to congestion signals
Low control overhead
Probing resource
Locating multiple downloading sources
Decentralized and scalable
No global knowledge
Robust to failures, even high up in a
tree
Bullet Overview

Use meshes as opposed to trees


Bandwidth independent of the underlying
overlay tree
Evaluated on a 1,000-node overlay running
over 20,000-node network topologies


Up to 2x bandwidth improvement over a
bandwidth-optimized tree
Overhead of 30 Kbps
System Components





Split the data into packet-sized objects
Disseminate disjoint objects to clients at
a rate determined by bandwidth to each
client
Nodes need to locate and retrieve
disjoint data from their peers
Periodically exchange summary tickets
Minimize overlapping objects from each
peer
Illustration
[Figure: source S holds objects 1 2 3 4 5 6 7; nodes A, B, C, D, and E each hold a partial set (1 2 3 5; 1 2 5; 1 3 4 6; 2 4 5 6; 1 3 4)]
Data Encoding


Multimedia: multiple description coding (MDC)
Large files: erasure encoding



Tornado code
Only need to locate 5% extra packets to
reconstruct the original message
Faster encoding and decoding time
RanSub



Distributes random subsets of
participating nodes
During the collect phase, each node
sends a random subset of its
descendant nodes up the tree
During the distribute phase, each node
sends a random subset of collected
nodes down the tree
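The collect and distribute phases above can be sketched roughly as follows. This is a simplified illustration, not RanSub's actual protocol: the function names, the toy tree, the subset size `k`, and the population-weighted merge in `compact` are all assumptions made for the example, and the distribute phase here simply forwards a random subset of whatever the parent received.

```python
import random

def compact(a, a_pop, b, b_pop, k, rng):
    """Merge two samples into at most k nodes, drawing from each side
    roughly in proportion to the population it represents (approximate)."""
    a, b = list(a), list(b)
    merged = []
    while len(merged) < k and (a or b):
        if not b:
            source = a
        elif not a:
            source = b
        else:
            source = a if rng.random() < a_pop / (a_pop + b_pop) else b
        merged.append(source.pop(rng.randrange(len(source))))
    return merged

def collect(tree, node, k, rng):
    """Collect phase: return (sample, population) for node's subtree.
    The sample is what the node would send up to its parent."""
    sample, population = [node], 1
    for child in tree.get(node, []):
        child_sample, child_pop = collect(tree, child, k, rng)
        sample = compact(sample, population, child_sample, child_pop, k, rng)
        population += child_pop
    return sample, population

def distribute(tree, node, known, k, rng, views):
    """Distribute phase (simplified): record what node learned, then send
    each child a random subset of it."""
    views[node] = list(known)
    for child in tree.get(node, []):
        subset = rng.sample(known, min(k, len(known)))
        distribute(tree, child, subset, k, rng, views)

# Hypothetical six-node overlay tree: parent -> children.
tree = {"root": ["a", "b"], "a": ["c", "d"], "b": ["e"]}
rng = random.Random(0)
root_sample, total_nodes = collect(tree, "root", 3, rng)
views = {}
distribute(tree, "root", root_sample, 2, rng, views)
```

After one round, `views` maps every node to a small random subset of participants, which is the input a node needs when it looks for peers.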
Informed Content Delivery
Techniques


Use summary tickets
A summary ticket is an array


array[i] = min over x in the working set of hash_i(x)
Check ticket elements against a Bloom
filter


It is possible to have false positives
It is possible that B will not send a packet
to A even though A is missing it
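A minimal sketch of a summary ticket, assuming each entry is the minimum of a salted hash over the working set (a min-wise sketch). The salted SHA-256 hash family, the 30 entries of 4 bytes each, and the names `summary_ticket` and `similarity` are assumptions chosen for illustration; the slides only state that a ticket is an array of per-hash-function values.

```python
import hashlib

def hash_i(i, seq):
    """i-th hash function (assumed): salted SHA-256, truncated to 32 bits."""
    digest = hashlib.sha256(f"{i}:{seq}".encode()).digest()
    return int.from_bytes(digest[:4], "big")

def summary_ticket(working_set, entries=30):
    """Entry i is the smallest hash_i value over all items in the working set."""
    return [min(hash_i(i, seq) for seq in working_set) for i in range(entries)]

def similarity(ticket_a, ticket_b):
    """Fraction of matching entries; estimates how much two working sets overlap."""
    matches = sum(a == b for a, b in zip(ticket_a, ticket_b))
    return matches / len(ticket_a)

# Two nodes holding overlapping sets of packet sequence numbers.
node_a = summary_ticket(range(0, 100))
node_b = summary_ticket(range(50, 150))
estimated_overlap = similarity(node_a, node_b)
```

Comparing two fixed-size tickets is much cheaper than exchanging full working sets, which is why a node can afford to evaluate many candidate peers.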
TCP Friendly Rate Control (TFRC)

TCP halves the sending rate as soon as
one packet loss is detected


Too severe
TFRC is based on loss events, or
multiple dropped packets within one
round-trip time
TCP Friendly Rate Control (TFRC)

Bullet eliminated retransmission from
TFRC


Easier to recover from other sources than
from the initial sender
TFRC does not aggressively seek newly
available bandwidth like TCP
Bullet

Layers a mesh on top of an overlay tree
to increase overall bandwidth
Finding Overlay Peers




RanSub periodically delivers subsets of
nodes selected uniformly at random
Via summary tickets (120 bytes per
node)
The working set from each node is
associated with a Bloom filter
Peer with nodes with the lowest
similarity ratio
Recovering Data from Peers



A receiver assigns a portion of the
sequence space to each of its senders,
to avoid duplication among senders
A receiver periodically updates each
sender with its current Bloom filter and
the range of sequences covered in its
Bloom filter
Less than 10% of all received packets
are duplicates
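The recovery step above might look like the sketch below. The Bloom filter parameters (1,024 bits, 3 hash functions), the modulo-based split of the sequence space, and the names `BloomFilter` and `packets_to_send` are assumptions for illustration only.

```python
import hashlib

class BloomFilter:
    """Small Bloom filter over packet sequence numbers (sizes are illustrative)."""

    def __init__(self, bits=1024, hashes=3):
        self.bits = bits
        self.hashes = hashes
        self.array = 0  # bit array stored in a Python int

    def _positions(self, item):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, item):
        for pos in self._positions(item):
            self.array |= 1 << pos

    def __contains__(self, item):
        # May return True for an item never added (false positive).
        return all((self.array >> pos) & 1 for pos in self._positions(item))

def packets_to_send(sender_has, receiver_filter, low, high,
                    sender_index, num_senders):
    """Packets this sender should push: inside the advertised range, inside
    this sender's share of the sequence space, and not reported as held."""
    return [seq for seq in sorted(sender_has)
            if low <= seq <= high
            and seq % num_senders == sender_index
            and seq not in receiver_filter]

# The receiver already holds packets 0..3 and has two senders.
held = BloomFilter()
for seq in range(4):
    held.add(seq)
from_sender_0 = packets_to_send(set(range(10)), held, 0, 9, 0, 2)
from_sender_1 = packets_to_send(set(range(10)), held, 0, 9, 1, 2)
```

Because each sender covers a different share of the sequence space, the two lists never overlap. A false positive in the filter only causes a missing packet to be withheld, never a duplicate to be sent.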
Making Data Disjoint


Given a randomly chosen subset of peer
nodes, each node should be about equally
likely to hold any particular data
packet
A parent decides the portion of its data
being sent to each child

A function of limiting and sending factors
Making Data Disjoint

The portion of data a child should own
is proportional to



The number of its descendants
Bandwidth
If not enough bandwidth

Each child receives a completely disjoint
data stream
Making Data Disjoint

If ample bandwidth

Each child will receive the entire parent
stream
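A rough sketch of how a parent might split its stream, assuming ownership of each packet is assigned in proportion to each child's descendant count and a per-child packet budget stands in for available bandwidth. The names `assign_owners` and `plan_transmissions`, the budget model, and the deterministic assignment rule are illustrative assumptions, not Bullet's exact sending and limiting factor computation.

```python
def assign_owners(packets, descendants):
    """Assign each packet to one child so that ownership counts track the
    children's descendant counts."""
    total = sum(descendants.values())
    owned = {child: 0 for child in descendants}
    owners = {}
    for n, seq in enumerate(packets, start=1):
        # Pick the child furthest below its proportional share so far.
        child = max(descendants,
                    key=lambda c: n * descendants[c] / total - owned[c])
        owners[seq] = child
        owned[child] += 1
    return owners

def plan_transmissions(packets, descendants, budget):
    """budget[child] is how many packets the link to that child can carry.
    Owned packets go first; leftover capacity is filled with other packets."""
    packets = list(packets)
    owners = assign_owners(packets, descendants)
    sent = {child: [] for child in descendants}
    for seq in packets:
        owner = owners[seq]
        if len(sent[owner]) < budget[owner]:
            sent[owner].append(seq)
    for child in descendants:
        for seq in packets:
            if len(sent[child]) >= budget[child]:
                break
            if seq not in sent[child]:
                sent[child].append(seq)
    return sent

descendants = {"x": 3, "y": 1}  # hypothetical children and subtree sizes
tight = plan_transmissions(range(8), descendants, {"x": 6, "y": 2})
ample = plan_transmissions(range(8), descendants, {"x": 8, "y": 8})
```

With the tight budgets the two children receive disjoint streams, while with ample budgets both receive all eight packets, matching the two cases on the slides.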
Improving the Bullet Mesh

What can go wrong




Not enough peers
Constantly changing network
Use trial senders and receivers
Bullet periodically evaluates the
performance of its peers

Replaces the worst-performing
sender/receiver
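One way to picture the periodic evaluation is sketched below. The metric (useful bytes received from each sender in the last period), the `max_peers` limit, and the name `evaluate_senders` are assumptions; the slides do not specify how performance is measured.

```python
def evaluate_senders(useful_bytes, candidates, max_peers):
    """useful_bytes maps each current sender to the non-duplicate bytes it
    delivered last period. Returns (new sender set, dropped sender or None)."""
    senders = dict(useful_bytes)
    dropped = None
    if len(senders) >= max_peers and candidates:
        # Peer set is full: make room by dropping the worst performer.
        dropped = min(senders, key=senders.get)
        del senders[dropped]
    for candidate in candidates:
        if len(senders) >= max_peers:
            break
        if candidate not in senders and candidate != dropped:
            senders[candidate] = 0  # trial sender starts with no history
    return senders, dropped

current = {"n1": 9000, "n2": 150, "n3": 4200}  # hypothetical measurements
updated, dropped = evaluate_senders(current, ["n7"], max_peers=3)
```

Here the slowest sender is dropped and a trial sender takes its slot, so the peer set can keep adapting as network conditions change.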
Evaluation



Used Internet environments and
ModelNet IP emulation
Deployed on PlanetLab
Built on MACEDON


Specifies the overlay algorithms
Core logic under 1,000 lines of code
Evaluation

ModelNet experiments




Fifty 2 GHz Pentium 4 machines running Linux 2.4.20
100 Mbps and 1 Gbps Ethernet switches
1,000 emulated instances
20,000-node INET-generated topologies
Offline Bottleneck Bandwidth Tree

Given global knowledge, what is the
overlay tree that will deliver the highest
bandwidth to a set of overlay nodes?


Finding a tree with a maximum bottleneck
is NP-hard in general
Offline Bottleneck Bandwidth Tree

Assumptions




The path between two overlay nodes is
fixed
The overlay tree uses TCP-friendly unicast
connections to transfer data point-to-point
In the absence of other flows, we can
estimate the throughput of a TCP-friendly
flow using a steady-state formula
In the case of sharing, each flow can
achieve at most 1/n of the total
throughput
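The slides do not say which steady-state formula is used. As an illustration, the sketch below uses the TCP throughput equation from the TFRC specification (RFC 5348), with the commonly recommended simplifications `b = 1` and `t_RTO = 4 * RTT`. The function names, and the reading of "1/n of the total throughput" as an equal split of a link's capacity, are assumptions.

```python
import math

def tcp_friendly_rate(packet_size, rtt, loss_rate, t_rto=None, b=1):
    """Steady-state TCP-friendly rate in bytes/second (RFC 5348 equation).
    packet_size in bytes, rtt in seconds, loss_rate in (0, 1]."""
    if t_rto is None:
        t_rto = 4 * rtt  # simplification recommended by the specification
    p = loss_rate
    denominator = (rtt * math.sqrt(2 * b * p / 3)
                   + t_rto * (3 * math.sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2))
    return packet_size / denominator

def shared_rate(packet_size, rtt, loss_rate, link_capacity, n_flows):
    """Assumed sharing model: a flow gets no more than an equal share of the
    link capacity (bytes/second) when n_flows cross the same link."""
    return min(tcp_friendly_rate(packet_size, rtt, loss_rate),
               link_capacity / n_flows)

alone = tcp_friendly_rate(1500, 0.1, 0.01)
shared = shared_rate(1500, 0.1, 0.01, link_capacity=1000.0, n_flows=4)
```

Evaluating such a formula for every overlay link is what makes the offline tree construction possible, since it gives an estimated rate for each candidate edge without running real traffic.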
Bullet vs. Streaming


The maximum bottleneck tree achieves
5x bandwidth compared to random
trees
Bullet outperforms the bottleneck tree
by up to 100%
Creating Disjoint Data

Without disjoint transmission of data

Bullet degrades by 25%
Epidemic Approaches

Bullet is 60% better than anti-entropy
(gossiping) approaches

Epidemic algorithms have an excessive
number of duplicate packets
Bullet on a Lossy Network

Bullet achieves 2x bandwidth compared
to maximum bottleneck trees
Performance Under Failure

With failure detection/recovery disabled


30% performance degradation with a
missing child under the root
With failure detection/recovery enabled

Negligible disruption of performance
PlanetLab


47 nodes for deployment
Similar results
Related Work

Kazaa
Parallel downloads
Does not use erasure code
Bandwidth consuming
BitTorrent
Centralized tracker
FastReplica

Not bandwidth aware
Related Work

Scalable Reliable Multicast
Difficult to configure
Epidemic approaches
Do not avoid duplicates
Narada
Use overlay meshes
Bandwidth still limited by parent
Related Work

Overcast
More heavyweight when nodes leave a tree
SplitStream
Not bandwidth aware
CoopNet
Centralized