
A Constraint Satisfaction Approach to
Testbed Embedding Services
John Byers
Dept. of Computer Science, Boston University
www.cs.bu.edu/~byers
Joint work with Jeffrey Considine (BU)
and Ketan Mayer-Patel (UNC)
Experimental Methodologies
• Simulation
 “Blank slate” for crafting experiments
 Fine-grained control, specifying all details
 No external surprises, not especially realistic
• Emulation
 All the benefits of simulation, plus:
running real protocols on real systems
• Internet experimentation
 None of the benefits of simulation, minus:
unpredictability, unrepeatability, etc.
 But realistic!
Our Position
• All three approaches have their place.
• Improving aspects of all three is essential.
 Focus of recent workshops like MOME Tools
 Internet experimentation is the most primitive by far.
• Our question:
Can we bridge over some of the attractive features of
simulation and emulation into wide-area testbed
experimentation?
• Towards an answer:
 Which services would be useful?
 Outline design of a set of interesting services.
Useful Services
• Canonical target testbed: PlanetLab
• What services would we like to bridge over?
 Abstract: repeatability, representativeness
 Concrete:
• specify parameters of an experiment just like in ns
• locate one or more sub-topologies matching specification
• run experiment
• monitor it while running (“measurement blackboard”)
• put it all in a cron job
Embedding Services
1. Topology specification
2. Testbed characterization
• Relevant parameters unknown, but measurable
3. Embedding discovery
• Automatically find one or more embeddings of the specified topology
Synergistic relationships between the above services:
 Existing measurements guide discovery.
 Discovery feeds back into the measurement process.
Emulab/Netbed
• In the emulation world, Emulab and Netbed
researchers have worked extensively on
related problems [OSDI ’02, HotNets-I, CCR ’03]
• Rich experimental specification language.
• Optimization-based solver to map the desired
topology onto Netbed to:
 balance load across Netbed processors
 minimize inter-switch bandwidth
 minimize interference between experiments
 incorporate wide-area constraints
Wide-area challenges
• Conditions change continuously on wide-area
testbeds - “Measure twice, embed once”.
• The space of possible embeddings is very
large; finding feasible ones is the challenge.
• We argue for a constraint satisfaction
approach rather than an optimization-based one.
 Pros and cons upcoming.
Specifying Topologies
• N nodes in testbed, k nodes in specification
• k × k constraint matrix C = {ci,j}
• Entry ci,j constrains the end-to-end path
between embedding of virtual nodes i and j.
• For example, place bounds on RTTs:
ci,j = [li,j, hi,j] represents lower and upper bounds on target RTT.
• Constraints can be multi-dimensional.
• Constraints can also be placed on nodes.
• More complex specifications possible...
Feasible Embeddings
• Def’n: A feasible embedding is a mapping f
such that for all i, j where f(i) = x and f(j) = y:
li,j ≤ d(x, y) ≤ hi,j
• Do not need to know d(x, y) exactly, only that
li,j ≤ l’(x, y) ≤ d(x, y) ≤ h’(x, y) ≤ hi,j
• Key point: Testbed need not be exhaustively
characterized, only sufficiently well to embed.
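The feasibility test can be sketched in a few lines (all names here are illustrative, not an actual implementation): a mapping f passes when every constrained virtual pair lands on a testbed pair whose measured bounds fit inside the target interval.

```python
# Illustrative sketch of the feasibility test.  A mapping f is feasible
# when every constrained virtual pair (i, j) lands on a testbed pair
# (x, y) whose measured bounds l'(x,y), h'(x,y) fall inside [l_ij, h_ij].

def is_feasible(f, constraints, lo, hi):
    """f: dict virtual node -> testbed node.
    constraints: dict (i, j) -> (l_ij, h_ij), target RTT bounds.
    lo, hi: measured bounds, keyed by unordered testbed host pairs."""
    for (i, j), (l_ij, h_ij) in constraints.items():
        pair = frozenset((f[i], f[j]))
        # Sufficient condition: l_ij <= l'(x,y) <= d(x,y) <= h'(x,y) <= h_ij
        if lo[pair] < l_ij or hi[pair] > h_ij:
            return False
    return True

# Toy example: virtual nodes 0 and 1 must see an RTT in [10, 20] ms.
constraints = {(0, 1): (10, 20)}
lo = {frozenset(("A", "B")): 12, frozenset(("A", "C")): 40}
hi = {frozenset(("A", "B")): 18, frozenset(("A", "C")): 60}
print(is_feasible({0: "A", 1: "B"}, constraints, lo, hi))  # True
print(is_feasible({0: "A", 1: "C"}, constraints, lo, hi))  # False
```

Note that the test never consults the true delay d(x, y), only the measured bounds — which is exactly why exhaustive characterization is unnecessary.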
Why Constraint-Based?
• Simplicity: binary yes-no answer.
• Allows sampling from feasible embeddings.
• Admits a parsimonious set of measurements
to locate a feasible embedding.
• For infeasible set of constraints, hints for
relaxing constraints can be provided.
• Optimization approaches depend crucially on
user’s setting of weights.
Hardness
• Finding an embedding is as hard as subgraph
isomorphism (NP-complete)
• Counting or sampling from set of feasible
embeddings is #P-Complete.
• Approximation algorithms are not much
better.
• Uh-oh...
Our Approach
• Brute force search.
• We’re not kidding.
• Situation is not as dire as it sounds.
 Several methods for pruning the search tree.
 Adaptive measurements.
 Many problem instances not near boundary of
solubility and insolubility.
• Off-line searches up to thousands of nodes.
• On-line searches up to hundreds of nodes.
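The brute-force search with pruning can be sketched as a backtracking loop (hypothetical names, not the authors' code): extend a partial mapping one virtual node at a time, abandoning a branch as soon as any pairwise constraint is already violated.

```python
# Backtracking sketch of the brute-force embedding search with pruning.
# Names and data layout are illustrative, not an actual implementation.

def find_embeddings(k, hosts, constraints, lo, hi):
    """constraints: {(j, i): (l, h)} with j < i, on virtual node pairs.
    lo, hi: measured bounds keyed by unordered testbed host pairs."""
    results = []

    def extend(partial):
        i = len(partial)
        if i == k:
            results.append(tuple(partial))
            return
        for x in hosts:
            if x in partial:
                continue
            # Prune: check constraints between node i and placed nodes only.
            if all(constraints[(j, i)][0] <= lo[frozenset((partial[j], x))]
                   and hi[frozenset((partial[j], x))] <= constraints[(j, i)][1]
                   for j in range(i) if (j, i) in constraints):
                extend(partial + [x])

    extend([])
    return results

# Toy testbed with exact delays (lo == hi); embed a pair within 10 ms.
d = {frozenset(("A", "B")): 5, frozenset(("A", "C")): 50,
     frozenset(("B", "C")): 50}
print(find_embeddings(2, "ABC", {(0, 1): (0, 10)}, d, d))
# [('A', 'B'), ('B', 'A')]
```

Because a violated constraint kills an entire subtree of the search, tightly constrained instances prune quickly — the "not as dire as it sounds" observation above.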
Adaptive Measurements
• Must we make O(N²) measurements? No.
 Delay: coordinate-based [Cox et al (today)]
 Loss: tomography-based [Chen et al ’03]
 Underlays may make them for you [Nakao et al ’03]
• In our setting:
 We don’t always need exact values.
 Pair-wise measurements are expensive.
• How do we avoid measurements?
 Interactions with search.
 Inferences of unmeasured paths.
Triangle Inequality Inferences
Suppose constraints are on delays, and the triangle inequality holds.
(Figure: triangle on nodes i, j, k. Measured: d(i,j) ∈ [10, 15], d(j,k) ∈ [90, 100]. Inferred: d(i,k) ∈ [75, 115].)
Using APSP algorithms, can compute all upper & lower bounds.
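The APSP computation can be sketched as a Floyd-Warshall-style pass (illustrative, assuming the triangle inequality): upper bounds relax via h(i,j) ≤ h(i,k) + h(k,j), lower bounds via l(i,j) ≥ l(i,k) − h(k,j).

```python
# Floyd-Warshall-style bound propagation (illustrative sketch): tighten
# every pair's delay bounds from partial measurements, assuming the
# triangle inequality holds.  Unmeasured pairs start at [0, INF].

INF = float("inf")

def propagate_bounds(lo, hi):
    """lo, hi: symmetric n x n bound matrices (lists of lists)."""
    n = len(lo)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Upper bound: the path through k caps the direct delay.
                hi[i][j] = min(hi[i][j], hi[i][k] + hi[k][j])
                # Lower bound: d(i,j) >= d(i,k) - d(k,j), and symmetrically.
                lo[i][j] = max(lo[i][j],
                               lo[i][k] - hi[k][j],
                               lo[k][j] - hi[i][k])
    return lo, hi

# The triangle from the figure: d(i,j) in [10, 15], d(j,k) in [90, 100],
# d(i,k) unmeasured.  Nodes: 0 = i, 1 = j, 2 = k.
lo = [[0, 10, 0], [10, 0, 90], [0, 90, 0]]
hi = [[0, 15, INF], [15, 0, 100], [INF, 100, 0]]
propagate_bounds(lo, hi)
print(lo[0][2], hi[0][2])  # 75 115
```

The recovered interval [75, 115] matches the figure: 90 − 15 = 75 and 100 + 15 = 115.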
Experimental Setup
• Starting from PlanetLab production node list,
we removed any hosts…
 not responding to pings
 with full file systems
 with CPU load over 2.0 (measured with uptime)
• 118 hosts remaining
• Used snapshot of pings between them
Finding Cliques
• Biggest clique of nodes within 10 ms
 Unique 11-node clique covering 6 institutions
• If a 1 ms lower bound is added:
 Twenty 6-node cliques
 5 institutions always present, only 2 others appear
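The clique search can be sketched over a threshold graph (toy data here, not the PlanetLab snapshot): connect two hosts when their RTT satisfies the bounds, then enumerate maximal cliques with the standard Bron-Kerbosch recursion.

```python
# Sketch of clique finding on a delay-threshold graph.  The RTT values
# below are assumed toy data, not measurements.

def maximal_cliques(adj):
    """Yield all maximal cliques of a graph given as {node: set(neighbors)},
    via the classic Bron-Kerbosch recursion (no pivoting)."""
    def bk(r, p, x):
        if not p and not x:
            yield r
        for v in list(p):
            yield from bk(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    yield from bk(set(), set(adj), set())

# Assumed RTTs (ms) among four hosts; edge iff RTT <= 10 ms.
rtt = {("A", "B"): 4, ("A", "C"): 6, ("B", "C"): 8,
       ("A", "D"): 40, ("B", "D"): 50, ("C", "D"): 30}
adj = {h: set() for h in "ABCD"}
for (u, v), delay in rtt.items():
    if delay <= 10:
        adj[u].add(v)
        adj[v].add(u)
print(sorted(max(maximal_cliques(adj), key=len)))  # ['A', 'B', 'C']
```

A lower bound (as in the 1-10 ms case) simply changes the edge predicate to `1 <= delay <= 10`.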
Size   0-10ms Cliques   1-10ms Cliques
2      403              325
3      936              501
4      1475             387
5      1645             142
6      1327             20
7      771              0
8      315              0
9      86               0
10     14               0
11     1                0
Finding Groups of Cliques
# of Cliques \ Clique Size:   2       3      4      5     6    7
1                             325     501    387    142   20   0
2                             6898    6238   1004   0     0    0
3                             12950   0      0      0     0    0
4                             0       0      0      0     0    0
Constraints: 1-10ms within same clique, 20-50ms otherwise
Triangle Inequality in PlanetLab
• In our PlanetLab snapshot, 4.4% of all
triples i,j,k violate the triangle inequality
• Consider a looser version of TI, e.g.
di,j ≤ α ( di,k + dk,j ) + β
• There are fewer than 1% violations if
 α = 1.15, β = 1 ms
 α = 1.09, β = 5 ms
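The relaxed triangle-inequality check above can be sketched directly (toy delays, not the PlanetLab data): count the ordered triples (i, j, k) with d(i,j) > α(d(i,k) + d(k,j)) + β.

```python
from itertools import permutations

# Illustrative violation counter for the relaxed triangle inequality.
# The delay values below are assumed toy data.

def violation_fraction(d, nodes, alpha=1.0, beta=0.0):
    """d: delays keyed by unordered pairs; beta in the same units as d.
    Returns the fraction of ordered triples (i, j, k) violating
    d(i,j) <= alpha * (d(i,k) + d(k,j)) + beta."""
    triples = list(permutations(nodes, 3))
    bad = sum(1 for i, j, k in triples
              if d[frozenset((i, j))] >
                 alpha * (d[frozenset((i, k))] + d[frozenset((k, j))]) + beta)
    return bad / len(triples)

# One long edge that a short two-hop detour undercuts badly.
d = {frozenset(("A", "B")): 100, frozenset(("A", "C")): 10,
     frozenset(("B", "C")): 10}
print(violation_fraction(d, "ABC"))             # strict TI violated
print(violation_fraction(d, "ABC", 1.15, 5.0))  # still violated here
```

On real data, sweeping (α, β) over a grid with this function reproduces the kind of thresholds quoted above.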
Inference Experiments
• Metric: mean range of inference matrix
M = Σi Σj>i (hi,j − li,j) / C(n, 2)
• Compare measurement orderings
 Random
 Smallest Lower Bound First (greedy)
 Largest Range First (greedy)
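The metric and the greedy ordering can be sketched as follows (hypothetical helper names): M averages the bound range h − l over all C(n, 2) pairs, and "largest range first" next measures the unmeasured pair whose current inferred range is widest.

```python
# Illustrative sketch of the mean-range metric and the greedy
# largest-range-first measurement ordering.

def mean_range(lo, hi):
    """Mean range M over the C(n, 2) unordered pairs of n x n bound
    matrices lo, hi."""
    n = len(lo)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(hi[i][j] - lo[i][j] for i, j in pairs) / len(pairs)

def largest_range_first(lo, hi, measured):
    """Pick the unmeasured pair with the widest current bound range."""
    n = len(lo)
    candidates = [(i, j) for i in range(n) for j in range(i + 1, n)
                  if (i, j) not in measured]
    return max(candidates, key=lambda p: hi[p[0]][p[1]] - lo[p[0]][p[1]])

# Toy bound matrices: pair (0,1) already measured tightly.
lo = [[0, 5, 0], [5, 0, 0], [0, 0, 0]]
hi = [[0, 9, 30], [9, 0, 50], [30, 50, 0]]
print(mean_range(lo, hi))                     # (4 + 30 + 50) / 3 = 28.0
print(largest_range_first(lo, hi, {(0, 1)}))  # (1, 2)
```

In the experiments described here, each measurement would be followed by re-propagating bounds before the next pair is chosen.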
Inference Results
• Random order performs poorly
• Largest range first performs best
Future Work
• Build a full system
 More interesting constraints
 Better search and pruning
 Synergistic search and measurement
 Integration with simulation/emulation tools
• Other questions
 What do search findings say about
PlanetLab?
 Can we plan deployment?