BBRx - Extending BBR for Customized TCP Performance

NetDev 0x12, Montreal, Canada
Jae Won Chung, Feng Li and Beomjun Kim
{feng.li, beomjun.kim}@verizon.com
Objectives
• BBR is a promising next-gen TCP congestion avoidance candidate, but
has room for performance improvement: better management of
bottleneck queuing delay.
• Extend BBR to provide a method for finding an optimal throughput-delay
operating point in an LTE environment, maximizing per-flow throughput
within a bounded loss rate and delay.
• Practically deploy learning techniques to ensure optimal TCP
performance at all times while minimizing the risk of deploying a
pre-tuned learning algorithm in the fast path.
Theory Behind BBRx – PI Controller for TCP
[Figure: PI control loop block diagram; recoverable labels: Estimation, Noise]
BBR is a special case of BBRx where γ = 1 and β = 0
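Read off the bbrx_target_bw() code shown later in this deck, the control law appears to take roughly the following form. The symbols B (BBR's bottleneck bandwidth estimate), R (latest RTT sample), R_tgt (target RTT), B_tgt (target rate) and k_min (the constant BBRX_K_MIN) are notation chosen here for illustration; the equations on the original slide may be written differently.

```latex
\begin{align*}
R_{tgt} &= \frac{k}{k_{min}} \, R_{min} \\
B_{tgt} &=
\begin{cases}
\gamma B - \beta B \, \dfrac{R - R_{tgt}}{R_{min}} & \text{if } R > R_{tgt} \\[2ex]
\gamma B + \beta B \, \dfrac{R_{tgt} - R}{R_{min}} & \text{if } R \le R_{tgt} \text{ and upward control is enabled} \\[2ex]
\gamma B & \text{otherwise}
\end{cases}
\end{align*}
```

With γ = 1 and β = 0 every case reduces to B_tgt = B, which matches the statement that BBR is a special case of BBRx. In the code, γ and β are stored as percentages and the result is floored at BBRX_TGT_BW_MIN.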
Code Added to tcp_bbr.c (1)
static void bbr_main(struct sock *sk, const struct rate_sample *rs)
{
    struct bbr *bbr = inet_csk_ca(sk);
    u32 bw;

    bbr_update_model(sk, rs);

    bw = bbr_bw(sk);
    bw = bbrx_target_bw(sk, rs, bw);
    bbr_set_pacing_rate(sk, bw, bbr->pacing_gain);
    bbr_set_tso_segs_goal(sk);
    bbr_set_cwnd(sk, rs, rs->acked_sacked, bw, bbr->cwnd_gain);
}
• Baselined at linux-4.15.18
• ICSK_CA_PRIV_SIZE = 104
• Added struct tcp_bbrx_info to tcp_cc_info
Code Added to tcp_bbr.c (2)
/* BBRX: Find target tx-rate (bw) using PI control logic */
static u32 bbrx_target_bw(struct sock *sk, const struct rate_sample *rs, u32 bw)
{
    struct bbr *bbr = inet_csk_ca(sk);
    struct tcp_sock *tp = tcp_sk(sk);
    u32 tgt_rtt_us, min_rtt_us, rtt_us = 0;
    u32 tgt_bw, tgt_bw_adj;

    /* Use BBR's min RTT, else the socket's min RTT, else a fixed guess */
    min_rtt_us = bbr->min_rtt_us;
    if (unlikely(min_rtt_us == 0)) {
        min_rtt_us = tcp_min_rtt(tp);
        if (unlikely(min_rtt_us == 0))
            min_rtt_us = 100; /* assume minRTT of 100us */
    }

    if (rs->rtt_us > 0)
        rtt_us = rs->rtt_us;
    tgt_rtt_us = (bbr->k * min_rtt_us) / BBRX_K_MIN;

    /* PI control logic */
    if (rtt_us > tgt_rtt_us) {
        /* RTT above target: reduce the rate in proportion to the excess */
        tgt_bw_adj = bbr->beta * bw / 100;
        tgt_bw_adj *= rtt_us - tgt_rtt_us;
        tgt_bw_adj /= min_rtt_us;
        tgt_bw = bw * BBRX_GAMMA / 100 - tgt_bw_adj;
    } else {
        tgt_bw = bw * BBRX_GAMMA / 100;
        if (BBRX_PI_UP_CTR) {
            /* RTT at or below target: optionally raise the rate */
            tgt_bw_adj = bbr->beta * bw / 100;
            tgt_bw_adj *= tgt_rtt_us - rtt_us;
            tgt_bw_adj /= min_rtt_us;
            tgt_bw += tgt_bw_adj;
        }
    }

    return max_t(u32, BBRX_TGT_BW_MIN, tgt_bw);
}
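The same PI step can be tried outside the kernel. The following is a minimal user-space sketch, not the authors' code: the function name pi_target_bw, the SK_* constants and their values are assumptions made for illustration (the slides do not give the values of BBRX_GAMMA, BBRX_K_MIN or BBRX_TGT_BW_MIN). It uses 64-bit intermediates and a saturating subtraction so the arithmetic cannot wrap, which is a design choice of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the BBRx constants; the values are assumptions. */
#define SK_GAMMA_PCT  100u /* gamma = 1.00, expressed in percent */
#define SK_K_MIN      4u   /* divisor that turns k into a ratio of min RTT */
#define SK_TGT_BW_MIN 1u   /* floor for the returned rate */

/* One PI step. bw is in arbitrary rate units, RTTs are in microseconds,
 * beta_pct is beta in percent, up_ctrl plays the role of BBRX_PI_UP_CTR. */
static uint32_t pi_target_bw(uint32_t bw, uint32_t rtt_us, uint32_t min_rtt_us,
                             uint32_t k, uint32_t beta_pct, int up_ctrl)
{
    uint64_t tgt_rtt_us, base, unit, tgt;

    assert(beta_pct <= 100); /* keeps beta * bw within 32 bits */
    if (min_rtt_us == 0)
        min_rtt_us = 100; /* same fallback as the slide: assume 100 us */

    tgt_rtt_us = (uint64_t)k * min_rtt_us / SK_K_MIN;
    if (tgt_rtt_us > UINT32_MAX)
        tgt_rtt_us = UINT32_MAX; /* bound the RTT difference to 32 bits */

    base = (uint64_t)bw * SK_GAMMA_PCT / 100; /* gamma * bw */
    unit = (uint64_t)bw * beta_pct / 100;     /* beta * bw */

    if (rtt_us > tgt_rtt_us) {
        /* RTT above target: subtract, saturating at zero */
        uint64_t adj = unit * (rtt_us - tgt_rtt_us) / min_rtt_us;
        tgt = adj >= base ? 0 : base - adj;
    } else {
        tgt = base;
        if (up_ctrl)
            tgt += unit * (tgt_rtt_us - rtt_us) / min_rtt_us;
    }

    if (tgt > UINT32_MAX)
        tgt = UINT32_MAX;
    if (tgt < SK_TGT_BW_MIN)
        tgt = SK_TGT_BW_MIN;
    return (uint32_t)tgt;
}
```

For example, with bw = 1000, min RTT = 25000 us, k = 4 and β = 50%, an RTT sample of 50000 us halves the target rate, while β = 0 leaves it unchanged regardless of the RTT.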
Control Parameter Values?
• Target utilization γ = 1 (or a little less, say 0.98)
• Epoch δ = min RTT (Rmin)
• Target RTT deciding factor k ≥ 4
Not trivial to pick!
• Reduced PI parameter β > 0 (perhaps < 1)
• Tuning challenges: the optimal parameter range may differ across
network conditions (C and Rmin)
• Tuning methods
  • Frequency response analysis
  • Empirically, using learning
[Figure: Learning Agent (user space) and TCP Stack; recoverable labels: Flow Stats, Config Params]
Learning Agent: BBRx Auto-Tuning Method
• Subscribe to TCP flow stats via NetLink socket.
• Classify TCP flows into different bins based on
the reported bottleneck bandwidth (C) and Rmin.
• For each traffic class bin:
  - Compute the average utility of flows once
    enough samples are collected (default 40).
  - Find the minimum k and the corresponding β
    that yield the highest average utility (U)
    using a gradient ascent algorithm.
  - Update the BBRx kernel module parameter table.
{ …
    "cong_control": {
      "bbrx": {
        "bbrx_bw_lo": 12584068,
        "bbrx_bw_hi": 0,
        "bbrx_min_rtt": 1845,
        "bbrx_brst_len": 9973,
        "bbrx_brst_tput": 48788,
        "bbrx_brst_ploss": 0,
        "bbrx_brst_k": 6,
        "bbrx_brst_beta": 50
      }
    }
  }
}
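The per-bin bookkeeping described above can be sketched as follows. This is an illustration under assumptions, not the Learning Agent's actual code: the struct and function names are hypothetical, the utility value is supplied by the caller because the slides do not define the utility function, and next_beta_pct is a simple derivative-free hill-climbing step standing in for the gradient ascent the slide mentions.

```c
#include <assert.h>
#include <stddef.h>

#define SAMPLES_PER_EPOCH 40 /* default sample count from the slide */

/* Running utility statistics for one traffic-class bin (hypothetical). */
struct bin_stats {
    double   util_sum; /* sum of per-flow utilities in this epoch */
    unsigned samples;  /* flows seen in this epoch */
};

/* Add one flow's utility. Returns 1 and stores the epoch average in *avg
 * once enough samples are collected, then resets the bin; else returns 0. */
static int bin_add_sample(struct bin_stats *b, double utility, double *avg)
{
    assert(b != NULL && avg != NULL);
    b->util_sum += utility;
    if (++b->samples < SAMPLES_PER_EPOCH)
        return 0;
    *avg = b->util_sum / b->samples;
    b->util_sum = 0.0;
    b->samples = 0;
    return 1;
}

/* One hill-climbing step on beta (in percent): keep moving in the same
 * direction while the average utility does not drop, otherwise reverse.
 * The result is clamped to [1, 100]. */
static int next_beta_pct(int beta_pct, int *step, double prev_avg, double cur_avg)
{
    if (cur_avg < prev_avg)
        *step = -*step;
    beta_pct += *step;
    if (beta_pct < 1)
        beta_pct = 1;
    if (beta_pct > 100)
        beta_pct = 100;
    return beta_pct;
}
```

A caller would feed each terminated flow's utility into bin_add_sample for the flow's bin and, whenever it returns 1, compare the new average with the previous epoch's average to pick the next β to write into the kernel parameter table.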
Utility: Queue Length vs. Optimal β Range
• BBRx behaves like BBR for a small β (0.01), and yields a lower flow
utility due to overflows as the bottleneck buffer size is reduced.
• A large β beyond the optimal value (β = 0.6) decreases the utility
due to control instability (a larger-magnitude sinusoidal pattern of
high queuing delay followed by link under-utilization).
• The system has a wide range of stable β [0.2, 0.6], providing a
large margin of configuration freedom.
Utility: BW vs. RTT vs. Optimal β Range
• No single optimal β range covers all network conditions.
• The stable β margins are wide, but narrow as BW decreases and RTT grows.
[Figure: three panels, BBW (C) = 15Mbps, 35Mbps, and 75Mbps]
BBRx Params: γ=1, k=6; Bottleneck: qlen = 175ms, flows = 6
BBRx Multi-Range Configuration Table
• Network condition-based configuration approach
• BBRx sender starts with the default parameter set (β = 0.45, k=4)
• BBRx sender refers to the table when entering the PROBE_BW state
• Learning Agent daemon updates each bin separately
Recommended Default Values (Based on Emulation Results)
Rmin (ms) \ C (Mbps)   [0,3)           [3,10)          [10,1k)         [1k,∞)
[0,50)                 β = 0.75, k=4   β = 0.75, k=4   β = 0.75, k=4   β = 0.75, k=4
[50,100)               β = 0.45, k=4   β = 0.45, k=4   β = 0.75, k=4   β = 0.75, k=4
[100,∞)                β = 0.25, k=4   β = 0.45, k=4   β = 0.75, k=4   β = 0.75, k=4
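A table lookup of this kind could be sketched as below. This is a hypothetical user-space illustration, not the BBRx module's implementation: the names are invented, β is stored in percent, and it assumes the rows of the table are Rmin bins in ms ([0,50), [50,100), [100,∞)) and the columns are C bins in Mbps ([0,3), [3,10), [10,1k), [1k,∞)), which is an interpretation of the slide layout.

```c
#include <assert.h>
#include <stddef.h>

/* Parameter pair for one bin; beta is stored in percent (hypothetical type). */
struct bbrx_params {
    unsigned beta_pct;
    unsigned k;
};

/* Exclusive upper edges of each bin; the last bin is open-ended. */
static const double rmin_ms_edges[] = { 50.0, 100.0 };
static const double c_mbps_edges[]  = { 3.0, 10.0, 1000.0 };

/* Rows: Rmin bins (ms). Columns: C bins (Mbps). Assumed orientation. */
static const struct bbrx_params bbrx_table[3][4] = {
    /* Rmin [0,50)    */ { {75, 4}, {75, 4}, {75, 4}, {75, 4} },
    /* Rmin [50,100)  */ { {45, 4}, {45, 4}, {75, 4}, {75, 4} },
    /* Rmin [100,inf) */ { {25, 4}, {45, 4}, {75, 4}, {75, 4} },
};

/* Index of the half-open bin [edge[i-1], edge[i]) that contains v. */
static size_t bin_index(double v, const double *edges, size_t n)
{
    size_t i = 0;

    while (i < n && v >= edges[i])
        i++;
    return i;
}

/* Look up the parameter pair for a flow's measured Rmin and C. */
static struct bbrx_params bbrx_lookup(double rmin_ms, double c_mbps)
{
    assert(rmin_ms >= 0.0 && c_mbps >= 0.0);
    return bbrx_table[bin_index(rmin_ms, rmin_ms_edges, 2)]
                     [bin_index(c_mbps, c_mbps_edges, 3)];
}
```

Under that reading, a flow with Rmin = 120 ms on a 2 Mbps bottleneck would get β = 0.25, while a flow with Rmin = 25 ms on a 100 Mbps bottleneck would get β = 0.75.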
Preliminary Evaluations
<General Purpose Config – 1st Delay, 2nd Goodput>
Emulation Topology
[Figure: emulation topology; recoverable labels: Host, Container, iperf3 -c, iperf3 -s, bridge, veth0, veth1]
> tc qdisc show dev veth0
qdisc netem 1: root refcnt 2 limit 1000 delay 12.0ms
qdisc tbf 2: parent 1:1 rate 100Mbit burst 384Kb lat 132.4ms
> tc qdisc show dev veth1
qdisc netem 3: root refcnt 2 limit 1000 delay 13.0ms
1-Flow Test: C=100Mbps, Rmin=25ms, qlen=132ms
CUBIC
[ ID] Interval         Transfer    Bandwidth        Retr
[  4] 0.00-20.00 sec   213 MBytes  89.4 Mbits/sec   104   sender
[  4] 0.00-20.00 sec   211 MBytes  88.6 Mbits/sec         receiver
BBR
[ ID] Interval         Transfer    Bandwidth        Retr
[  4] 0.00-20.00 sec   229 MBytes  96.0 Mbits/sec   0     sender
[  4] 0.00-20.00 sec   226 MBytes  94.6 Mbits/sec         receiver
BBRx (γ=0.98, k=4, β=0.5)
[ ID] Interval         Transfer    Bandwidth        Retr
[  4] 0.00-20.00 sec   225 MBytes  94.4 Mbits/sec   0     sender
[  4] 0.00-20.00 sec   221 MBytes  92.9 Mbits/sec         receiver
C = 100Mbps, Rmin = 25ms, qlen = 132ms, flow = 1
[Figure: owin and RTT plots for CUBIC, BBR, and BBRx (γ=0.98, k=4, β=0.5)]
4-Flow Test: C=100Mbps, Rmin=25ms, qlen=132ms
CUBIC
[ ID] Interval         Transfer    Bandwidth        Retr
[SUM] 0.00-20.00 sec   235 MBytes  98.4 Mbits/sec   123   sender
[SUM] 0.00-20.00 sec   224 MBytes  93.9 Mbits/sec         receiver
BBR
[ ID] Interval         Transfer    Bandwidth        Retr
[SUM] 0.00-20.00 sec   240 MBytes  100 Mbits/sec    0     sender
[SUM] 0.00-20.00 sec   225 MBytes  94.3 Mbits/sec         receiver
BBRx (γ=0.98, k=4, β=0.5)
[ ID] Interval         Transfer    Bandwidth        Retr
[SUM] 0.00-20.00 sec   239 MBytes  100 Mbits/sec    843   sender
[SUM] 0.00-20.00 sec   224 MBytes  93.8 Mbits/sec         receiver
C = 100Mbps, Rmin = 25ms, qlen = 132ms, flow = 4
[Figure: owin and RTT plots for CUBIC, BBR, and BBRx (γ=0.98, k=4, β=0.5)]
BBRx RTTs
Scenario: C = 100Mbps, Rmin = 25ms, qlen = 132ms, flow = 4
BBRx Params: γ=0.98, k=4, β=0.5
Preliminary Evaluations
<Wireless PEP Config – 1st Goodput, 2nd Delay, and Small Number of Flows per Device>
1-Flow Test: C=100Mbps, Rmin=25ms, qlen=132ms
BBRx: RTT (γ=1, k=6, β=0.5): Tput = 93.1 Mbits/sec, ReTx = 0
BBRx: RTT (γ=1, k=8, β=0.5): Tput = 93.6 Mbits/sec, ReTx = 0
4G Stationary Test – Good RF (SINR > 25dB)
• 100MB file download
• BBR flow averaged 92.6 Mbps
• BBRx flow averaged 110.2 Mbps
[Figure: BBRx owin and RTT plots (γ=1, k=6, β=0.5); setup labels: Galaxy S7, LTE-Advanced w/ Carrier Aggregation, HP 460c Gen8]
Summary
• BBRx: Introduces a PI control function to BBR
• Exports per-flow TCP stats to user space via NetLink socket
• Learning Agent:
  • Adopts a utility function to score average TCP CA performance and adjusts
    the BBRx control parameters to yield the best utility while keeping the
    RTT to a minimum.
  • The loosely coupled TCP tuning feedback control loop provides a novel way
    to monitor and adjust TCP parameters per the performance goal in real time
    while minimizing the risk of deploying a pre-tuned learning algorithm in
    the fast path.
Current Status
• Preliminary evaluation results look promising
• BBRx reduces shallow buffer overflows by reacting to RTT
• Customizes TCP performance for the LTE access network to find an optimal
throughput and delay operating point (increase k until the maximum U is found).
• Proposed a kernel patch (pending) to get TCP congestion control
information via NetLink socket on flow termination events.
• BBRx and the TCP stat collector code are available at:
https://github.com/ultragoose/bbrx
Future Work
• More evaluations under 4G, 5G and satellite environments.
• Evaluate fairness among BBRx flows.
• Evaluate PI control function variations such as the one used in ABC.
Questions?