Mobile Network Estimation
Minkyong Kim, Brian Noble
Mobile Software Systems
University of Michigan
MOBILITY
Adaptive distributed systems
Many systems adapt to changes in network capacity
media-rich applications: web browsers, video players, …
performance enhancement: caching, prefetching, …
distributed systems: query planning, agent migration, …
All of these systems follow the same general form
observe network traffic at one or both endpoints
estimate the latency, bandwidth, loss rate, …
react if anything changes in an “interesting” way
All of this depends on estimating network capacity well
turns out to be a difficult problem
Networks have variable performance
Sources of variation in mobile, wireless networks
nodes move, leading to unpredictable topology changes
often more than one connection alternative
physical layer subject to fading, shadowing, multi-path
Sources of variation in wide-area networks
bursty congestion over all time scales
routing changes between autonomous systems (BGP)
Typically, adaptive systems are evaluated very carefully
with respect to clean, idealized network changes
my own work in Odyssey is guilty as charged
Goals of a good estimator
Estimate metrics that matter to the system
many network estimators focus on physical capacities
link capacity is like a “speed limit”
try driving the speed limit in LA during rush hour
instead: measure available capacities
Provide three characteristics
accuracy: gives correct estimates in steady state
agility: detect a true shift in capacity rapidly
stability: ignore short-lived transient changes
Current estimators: EWMA filters
Most use exponentially weighted moving average filters
at each time step, incorporate new observation (O_current)
with old estimate (E_old)
using a weighted linear combination:
E_current = a * E_old + (1 - a) * O_current
The term a is called the gain
large gain: biases toward stability
small gain: biases toward agility
gain is set statically
You can’t have your cake and eat it too
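A minimal sketch of a static-gain EWMA filter, following the weighted linear combination on this slide. The function name `ewma` and the choice to seed the estimate with the first observation are assumptions for illustration, not details from the talk.

```python
def ewma(observations, gain, initial=None):
    """Return the EWMA estimate after each observation, for a fixed gain in [0, 1].

    Each step computes gain * old_estimate + (1 - gain) * observation.
    Seeding with the first observation is an assumption of this sketch.
    """
    estimates = []
    estimate = initial
    for obs in observations:
        if estimate is None:
            estimate = obs  # assumed seed: first observation
        else:
            estimate = gain * estimate + (1 - gain) * obs
        estimates.append(estimate)
    return estimates


# A step from 10 down to 1: the large gain lags, the small gain tracks quickly.
step = [10.0] * 5 + [1.0] * 5
stable = ewma(step, gain=7 / 8)
agile = ewma(step, gain=1 / 8)
```

After five observations at the new level, `agile` is close to 1 while `stable` is still well above it, which is the stability/agility trade-off a static gain forces.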
A tale of two estimators
TCP: a stable filter that is too stable
estimates round trip time (RTT): segment, ACK
stable estimator: gain set to 7/8
used to set retransmission timeout (RTO)
under rapidly escalating congestion, RTO grows too slowly
RTO adds “fudge factor” based on variance
Odyssey: an agile filter that is too agile
estimates latency and bandwidth for bulk transfers
applications react to change by changing fidelity
agile filter: gain set to 1/4 (latency) and 1/8 (bandwidth)
transient changes lead to “tail-chasing” adaptations
applications must add hysteresis to dampen transients
The rest of this talk
Introduce a simple fluid flow network model
used to derive spot observations that are fed to filters
Describe three filters that adapt to prevailing conditions
error-based: vary gain based on quality of estimate
stability-based: vary gain based on observed noise
flip-flop: use a control to select an agile or stable filter
Evaluate the quality of these filters
subject each to a variety of networking conditions
compare agility and stability to TCP, Odyssey filters
A fluid-flow network model
Our model is based on the packet-pair technique
model network path as single, bottleneck link
send two packets back to back from source to sink
sink ACKs both packets as they are received
spread between ACKs measures bandwidth along path
We need both bandwidth and latency
take two observations to solve for two unknowns
Several subtle points
depend only on passive traffic observations
spot observations filter out self interference
assumes symmetric network performance
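A rough sketch of how spot observations might be derived under this model. The packet-pair relation (packet size divided by ACK spread) is the standard one; the linear round-trip model `rtt = 2 * latency + size / bandwidth` and both function names are assumptions made for illustration, and may differ from the exact equations used in the paper.

```python
def packet_pair_bandwidth(packet_bytes, ack_spread_s):
    """Bottleneck bandwidth (bytes/s) from the spacing between the two ACKs."""
    return packet_bytes / ack_spread_s


def solve_latency_bandwidth(obs1, obs2):
    """Solve rtt = 2 * latency + size / bandwidth from two (size, rtt) pairs.

    Assumed model: one bottleneck link with symmetric performance, so the
    one-way latency is counted twice. Needs two different transfer sizes.
    """
    (size1, rtt1), (size2, rtt2) = obs1, obs2
    if size1 == size2 or rtt1 == rtt2:
        raise ValueError("observations must differ in size and round-trip time")
    bandwidth = (size2 - size1) / (rtt2 - rtt1)
    latency = (rtt1 - size1 / bandwidth) / 2
    return latency, bandwidth


# Two hypothetical observations: (bytes transferred, round-trip seconds).
latency, bandwidth = solve_latency_bandwidth((1000, 0.028), (5000, 0.060))
```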
The error-based filter
Problem with EWMA filters comes from static gain
Instead, vary gain based on predictive quality of estimates
each estimate forms a prediction for next observation
at each observation, compare prediction with actual value
Scale gain with the accuracy of prediction
predictions that are accurate deserve higher weight
if inaccurate, should converge on observation quickly
Tends to ignore small changes, follow large changes
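One way the error-based idea could be sketched. The class name, the window length, the secondary smoothing gain, and the rule for scaling error against the largest recently seen value are all assumptions of this sketch, not parameters taken from the talk.

```python
from collections import deque


class ErrorBasedFilter:
    """EWMA whose gain rises when recent predictions have been accurate."""

    def __init__(self, window=10, error_gain=0.5):
        self.estimate = None
        self.smoothed_error = 0.0
        self.recent_errors = deque(maxlen=window)  # assumed history length
        self.error_gain = error_gain               # assumed static secondary gain

    def update(self, observation):
        if self.estimate is None:
            self.estimate = observation            # assumed seed
            return self.estimate
        # The old estimate is the prediction for this observation.
        error = abs(self.estimate - observation)
        self.smoothed_error = (self.error_gain * self.smoothed_error
                               + (1 - self.error_gain) * error)
        self.recent_errors.append(self.smoothed_error)
        max_error = max(self.recent_errors)
        if max_error > 0:
            # Accurate predictions give a gain near 1 (trust the old estimate).
            gain = 1.0 - self.smoothed_error / max_error
        else:
            gain = 1.0
        self.estimate = gain * self.estimate + (1 - gain) * observation
        return self.estimate
```

With a steady input of 10 followed by a drop to 1, this sketch jumps to the new value at once, because the prediction error is the largest it has recently seen.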
Error-based filter in action
[Figure: error-based filter trace, annotated “this is trouble”]
The stability-based filter
The error-based filter will be “pulled” by large transients
will tend towards instability during transient dips
Instead, base gain on stability in recent observations
moving range: difference between adjacent observations
noisy observations lead to larger moving ranges
Scale gain with the magnitude of the moving range
when observations are noisy, each deserves less weight
when observations are stable, changes more significant
Tends to ignore large changes, follow small ones
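A comparable sketch of the stability-based idea, with the gain driven by the moving range instead of the prediction error. As before, the class name, window length, secondary smoothing gain, and scaling against the recent maximum are assumptions for illustration only.

```python
from collections import deque


class StabilityBasedFilter:
    """EWMA whose gain rises when recent observations have been noisy."""

    def __init__(self, window=10, range_gain=0.5):
        self.estimate = None
        self.previous = None
        self.smoothed_range = 0.0
        self.recent_ranges = deque(maxlen=window)  # assumed history length
        self.range_gain = range_gain               # assumed static secondary gain

    def update(self, observation):
        if self.estimate is None:
            self.estimate = self.previous = observation  # assumed seed
            return self.estimate
        # Moving range: difference between adjacent observations.
        moving_range = abs(observation - self.previous)
        self.previous = observation
        self.smoothed_range = (self.range_gain * self.smoothed_range
                               + (1 - self.range_gain) * moving_range)
        self.recent_ranges.append(self.smoothed_range)
        max_range = max(self.recent_ranges)
        # Noisy input gives a gain near 1, so each observation counts for less.
        gain = self.smoothed_range / max_range if max_range > 0 else 0.0
        self.estimate = gain * self.estimate + (1 - gain) * observation
        return self.estimate
```

Given the same steady 10 followed by a drop to 1, this sketch holds its estimate at 10 for the first low observation and only starts to follow once the observations settle at the new level.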
Stability-based filter in action
[Figure: stability-based filter trace, annotated “this is trouble”]
Subtleties in variable-gain filters
The gain in each is based on some source metric
Gain must be in the range [0..1]
need some way of scaling the source metric
determine the maximum {error, instability} recently seen
scale current {error, instability} relative to maximum
Transient changes in source metric have drastic effects
smooth observed source metrics by secondary filter
secondary filter has static gain (!)
rather than provide tertiary filter, tune empirically
Sometimes, variable-gain filters are neither agile nor stable
source metric places them somewhere in the middle
A short detour: statistical process control
Suppose you had a machine that built widgets
widgets specified to have some size, error tolerance
How do you know your machine is building good widgets?
idea: periodically grab k widgets, measure them
if average size is about what you expect, things are OK
if not, machine is probably out of control
Formalizing this idea: the control chart
population mean, m
sample standard deviation, s
control lines: m+3s, m-3s
the 3s rule: stay inside the lines
[Figure: control chart with center line m and control lines m+3s and m-3s]
The flip-flop filter
Use a control chart to select for agility, stability
run two static-gain EWMA filters in parallel
maintain a control chart for each observation
if within control limits, use agile filter (a = 0.1)
otherwise, use stable filter (a = 0.9)
Cannot apply simple control chart directly to this problem
true mean is not known, and it changes over time
sample standard deviation is not known
Use approximations (individual x-chart)
m follows simple smoothed estimate of observations
s approximated with 2-element moving range
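A sketch of how the flip-flop selection could look. The agile and stable gains (0.1 and 0.9) come from the slide; the constant 1.128 is the standard control-chart factor for turning an average 2-element moving range into a standard deviation estimate. The class name, the two chart smoothing gains, and the seeding are assumptions of this sketch.

```python
D2 = 1.128  # standard individuals-chart constant for a 2-element moving range


class FlipFlopFilter:
    """Run agile and stable EWMA filters in parallel; a control chart picks one."""

    def __init__(self, agile_gain=0.1, stable_gain=0.9,
                 mean_gain=0.5, range_gain=0.9):
        self.agile_gain = agile_gain
        self.stable_gain = stable_gain
        self.mean_gain = mean_gain    # assumed gain for the smoothed center line m
        self.range_gain = range_gain  # assumed gain for the smoothed moving range
        self.agile = self.stable = None
        self.mean = None
        self.mean_range = None
        self.previous = None

    def update(self, obs):
        if self.mean is None:
            self.agile = self.stable = self.mean = self.previous = obs
            return obs
        # Judge the new observation against limits built from earlier ones.
        in_control = True
        if self.mean_range is not None:
            s = self.mean_range / D2
            in_control = abs(obs - self.mean) <= 3 * s
        # Both filters always see every observation.
        self.agile = self.agile_gain * self.agile + (1 - self.agile_gain) * obs
        self.stable = self.stable_gain * self.stable + (1 - self.stable_gain) * obs
        # Update the chart: center line m and smoothed moving range.
        moving_range = abs(obs - self.previous)
        self.previous = obs
        self.mean = self.mean_gain * self.mean + (1 - self.mean_gain) * obs
        if self.mean_range is None:
            self.mean_range = moving_range
        else:
            self.mean_range = (self.range_gain * self.mean_range
                               + (1 - self.range_gain) * moving_range)
        return self.agile if in_control else self.stable


ff = FlipFlopFilter()
outputs = [ff.update(x) for x in [10.0, 10.5, 9.5, 10.2, 9.8, 1.0, 1.0, 1.0]]
```

In this run the first observation at 1.0 falls outside the control limits, so the stable estimate (still near 9) is reported; by the third consecutive low observation the limits have moved and the agile estimate (near 1) takes over.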
Flip-flop filter in action
[Figure: flip-flop filter trace, annotated “switch to agile filter” and “switch to stable filter”]
Evaluating candidate filters
Can these filters be as agile as the Odyssey filter…
in recognizing a true change in link bandwidth?
in reacting to the presence of cross traffic?
in detecting a change in ad hoc topology?
in detecting a wide-area route change?
Can these filters be as stable as the TCP filter…
in resisting a transient change in link bandwidth?
in tolerating the presence of cross traffic?
in tolerating retransmissions in ad hoc networks?
in tolerating noise across a real wide-area network?
Can they predict in an ad hoc network with cross traffic?
Experimental methodology
All experiments in this talk used ns, a network simulator
the wide-area set is based on live network traces
Extensions to support variable-link experiments
script controls base physical performance of a link
can vary latency, bandwidth over time
Ad hoc networking simulations include Monarch extensions
collision-avoidance
link-level ACK, retransmission
In each experiment, filters converge to same value
they do not differ in accuracy
only differences in agility, stability
Link changes
First set of experiments: impulse-response tests
connect client, server with a single ns link
vary link performance with a variant of a square wave
persistent change: decrease from 10Mb/s to 1Mb/s
transient change: dip from 10Mb/s to 1Mb/s and back
Vary number of request/response pairs exposed to change
Poisson request generator, random response size
Agility: measured by settle time
time to reach an estimate within 10% of nominal
Stability: measured by mean squared error
penalizes large, short disturbances more than small, long
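The two metrics above can be sketched as follows. The function names and signatures are hypothetical, and treating "reach" as the first estimate inside the 10% band (rather than requiring it to stay there) is an assumption.

```python
def settle_time(times, estimates, change_time, nominal, tolerance=0.10):
    """Seconds from the change until the first estimate within tolerance of nominal.

    Returns None if no estimate at or after change_time gets that close.
    """
    for t, est in zip(times, estimates):
        if t >= change_time and abs(est - nominal) <= tolerance * nominal:
            return t - change_time
    return None


def mean_squared_error(estimates, nominal):
    """Average squared deviation of the estimates from the nominal value."""
    return sum((e - nominal) ** 2 for e in estimates) / len(estimates)


# One large, short disturbance scores worse than a small, long one.
short_large = mean_squared_error([14.0, 10.0, 10.0, 10.0], nominal=10.0)
long_small = mean_squared_error([11.0, 11.0, 11.0, 11.0], nominal=10.0)
```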
Agility for step-down waveform
[Figure: settle time (sec), axis from 0.1 to 100, versus average packets per second (1 to 5), for the FF, SF, EF, Ody, and TCP filters]
Stability for impulse-down waveform
[Figure: mean squared error (0.000 to 0.030) versus packets during transient (1 to 5), for the FF, SF, EF, Ody, and TCP filters]
Cross traffic experiments
Start request/response traffic between client and server
at 50 seconds, inject 5Mb/s cross traffic
All filters slightly optimistic in estimates
not all packets see full queue delays
Agility: settle time
Stability: coefficient of variation
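For reference, the stability metric here is the standard deviation of the estimates expressed as a percentage of their mean. A minimal sketch, assuming the population standard deviation is the intended one; the function name is hypothetical.

```python
import statistics


def coefficient_of_variation(estimates):
    """Standard deviation of the estimates as a percentage of their mean."""
    mean = statistics.fmean(estimates)
    # Population standard deviation is an assumption of this sketch.
    return 100.0 * statistics.pstdev(estimates) / mean
```

For example, estimates of 8 and 12 have a mean of 10 and a standard deviation of 2, giving 20%.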
[Figure: test topology with client, server, router A, router B, congestion source, and congestion sink]
Cross traffic results: agility
[Figure: settle time (sec), 0 to 6, for the FF, SF, EF, Ody, and TCP filters, with traffic on and traffic off]
Cross traffic results: stability
[Figure: coefficient of variation (%), 0 to 25, for the FF, SF, EF, Ody, and TCP filters, with traffic on and traffic off]
Simple ad hoc topology changes
Place three server/router nodes in a line
single client walks from server to end of line, and back
topology changes at each stage
Agility results do not add much new information
similar to congestion: TCP is bad, rest are comparable
Stability results are useful
coefficient of variation after settle time
[Figure: server, node A, and node B in a line, with the client position marked at stages 1 to 5]
Stability results: topology changes
[Figure: coefficient of variation (%), 0 to 60, versus position of mobile client (stages 2 to 5), for the FF, SF, EF, Ody, and TCP filters]
Summary of comparisons
[Table: FF, SF, EF, Ody, and TCP filters compared on agility (Step Up, Step Down, Congestion, Wide-Area, Mobile) and stability (Transient, Congestion, Wide-Area, Mobile)]
Acid test: predicting ad hoc performance
Typical ad hoc simulation
50 nodes in 1500x500 meter space
initial locations randomly distributed throughout space
nodes move in random waypoint model
Nodes are formed into 25 pairs
one pair is our test client/server: Poisson traffic
remaining 24 pairs exchange CBR traffic
vary rate of congestion traffic across experiments
No filter does particularly well
two static filters are worst performers
flip-flop is best of the bunch
Ad hoc accuracy results
[Figure: average estimated error (s), 0 to 2.5, versus size of CBR packets (64 to 2048 bytes), for the FF, SF, EF, Ody, and TCP filters]
Related Work
S. Keshav: introduced packet-pair, bottleneck bandwidth
fuzzy estimator: similar to error-based estimator
analysis for rate-allocating servers (not FCFS)
Packet-pair extensions
Paxson: receiver-based packet pair: time at both ends
Lai: receiver-only packet pair: time at receiver
Active probing: Bolot, Downey, Carter & Crovella, …
measurement load can be substantial
Lai’s general network model, packet tailgating technique
Balakrishnan’s congestion manager: unified RTT observations
can benefit from our filters for better estimates
Conclusions
Adaptive systems depend on quality of measurement
particularly hard to estimate network capacity
Standard filtering techniques: agile or stable, but not both
Adaptive filters: tune for prevailing network conditions
agile when possible, stable when necessary
Best alternative: flip-flop filter
composition of two static-gain EWMA filters
statistical process control used to select between them
comparable to Odyssey’s agile filter in 4/5 scenarios
comparable to TCP’s stable filter in 3/4 scenarios
provides best predictions in complex ad hoc network
Questions?
Further details: http://mobility.eecs.umich.edu/
Preprint of the paper is available