FCP: A Flexible Transport Framework for Accommodating Diversity
Dongsu Han (KAIST), Robert Grandl†, Aditya Akella†, Srinivasan Seshan*
† University of Wisconsin-Madison
* Carnegie Mellon University

Evolution of Transport Protocols
• Transport functions, layered over the network (IP): reliability/loss recovery, in-order delivery, flow control, congestion control
• Support for diversity: multi-homing, multiple paths (multipath), multiple sub-streams, multimedia streaming, Web applications, data transfer with a deadline, Quality of Experience (QoE)
• Congestion control as resource allocation: coordination, fairness and efficiency

Problems in Supporting Diversity in Congestion Control
• RCP, XCP [sigcomm], D3 [sigcomm], TCP
• Cannot ensure coexistence.

Evolution of TCP
[Timeline figure with year markers 1988, 2008, 2010: TCP, window scaling, SACK (1996), NewReno (1999), Explicit Congestion Notification, TCP Friendly Rate Control (TFRC), TCP Nice, MulTCP, FastTCP, BIC, CUBIC, MPTCP]
1. End-point flexibility: purely end-point based
2. Coexistence: invariant for fairness (TCP friendliness)

End-point based vs. Router-Assisted
• End-point based [TCP]: high flexibility, diversity
• Router-assisted [XCP, RCP]: feedback on the network's state, high efficiency
• Can we achieve the best of both worlds?

Our Approach
1. Decouple coexistence issues (fairness and efficiency) from end-point control.
2. Introduce generic abstractions for resource allocation.
[Diagram: the network reports its state to the end-host (feedback); the end-host reports its future behavior to the network (feed-forward). Labels: fairness, efficiency, flexible control]

1. Decoupling for Flexibility
[Diagram: a sender with budget W ($/sec) sends flows 1..n through the network to a receiver; the network returns a PRICE]
• The sender can distribute its budget to its flows.
• Flow i's price: P_i ($/byte)
• Flow i's budget: w_i, subject to W ≥ Σ w_i
• Flow i's rate: R_i = budget / price = w_i / P_i (bytes/sec)

1. Decoupling for Flexibility
• Invariant for fairness: flow i's budget w_i, subject to W ≥ Σ w_i
1. Flexibility at the end-points in how the budget is used.
2. Flexibility in network price generation.

2a. Feedback: Pricing
• Feedback: "congestion price" reflecting the "cost" of sending data across the link [Kelly]
[Diagram: each packet's congestion header carries a round-trip time and a price field; the price accumulates hop by hop across links P3, P2, P1: p3, then p3+p2, then p3+p2+p1]

2a. Feedback: Pricing
• Price feedback = p3+p2+p1, echoed back to the sender (price echoback).
• The sender updates its rate: Rate = budget / price.
• This implements proportional fairness [Kelly].

2a. Feedback: Pricing (Price Calculation)
• The router updates the price upon packet reception.
• Averaging window = 2 x AvgRTT
• Price ($/byte) = f(average recent load)
[Plot: load vs. time, with the recent load inside the averaging window]

2a. Feedback: Pricing (Price Calculation)
• The router updates the price upon packet reception.
• Instantaneous incoming budget ($) = Price ($/byte) x Bytes received (bytes)
• Averaging window = 2 x AvgRTT
• Price ($/byte) = Average incoming budget / Remaining link capacity
[Plot: incoming budget vs. time]

2a. Feedback: Pricing (Price Calculation)
• The router stores a recent history of the price, price(t).
• Instantaneous incoming budget ($) = price(t - rtt) × size × dt, where rtt is the packet's round-trip time
• Incoming budget: I(t) = ∫ price(t - rtt) × bytes(t) dt
• New price ($/byte) = Average incoming budget / Remaining link capacity

2b. Feed-forward: Preloading
• A sender raises a flow's budget from 1 $/sec to 10 $/sec.
• At price p ($/byte), the flow's rate would jump from 1/p to 10/p bytes/sec.

2b. Feed-forward: Preloading
• While still sending at rate 1/p, the sender sets Preload = 9 in the congestion header.
• The router counts the incoming budget ($) as (1 + preload) x price(t - rtt) x bytes.
• The price goes up (from p to p').
• The sender then sends at rate 10/p'.

Evaluation
1. Basic performance of FCP.
2. End-point budget allocation.
3. Flexibility in network price generation.
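The budget/price relationship and the preloading step from the slides above can be illustrated with a small numeric sketch. It assumes a single bottleneck link in steady state with no queueing, and it treats the "remaining link capacity" as the full link capacity. The names `link_price`, `flow_rate`, and `CAPACITY` are hypothetical and are not taken from the FCP implementation.

```python
# Illustrative sketch only: one bottleneck link, steady state, no queueing.
CAPACITY = 100e6 / 8  # assumed 100 Mbps link, expressed in bytes/sec


def link_price(incoming_budget_rate, capacity=CAPACITY):
    """Price ($/byte) = incoming budget ($/sec) / link capacity (bytes/sec)."""
    return incoming_budget_rate / capacity


def flow_rate(budget, price):
    """Rate (bytes/sec) = budget ($/sec) / price ($/byte)."""
    return budget / price


# Two flows, each with a budget of 1 $/sec, share the link.
budgets = [1.0, 1.0]
p = link_price(sum(budgets))
old_rates = [flow_rate(w, p) for w in budgets]

# Flow 0 is about to move from 1 $/sec to 10 $/sec, so it sets preload = 9
# (as on the slide) while still sending at its old rate.
preloads = [9.0, 0.0]

# The router counts each flow's traffic as (1 + preload) x price x bytes.
incoming = sum((1.0 + pl) * p * r for pl, r in zip(preloads, old_rates))
p_new = link_price(incoming)  # the price rises from p to p'

# Only now does flow 0 spend its new budget, at the updated price.
new_budgets = [10.0, 1.0]
new_rates = [flow_rate(w, p_new) for w in new_budgets]

print(f"price: {p:.3e} -> {p_new:.3e} $/byte")
print(f"total rate after the change: {sum(new_rates) * 8 / 1e6:.1f} Mbps")
```

Because the price already reflects the announced budget, the combined rate after the change still matches the link capacity, and the two flows share it in a 10:1 ratio, in proportion to their budgets.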
Fast Convergence/Accurate Feedback
[Figure: sending rate (Mbps, 0 to 100) vs. time (0 to 14) for flows with RTT = {25ms, 50ms, 125ms, 250ms, 625ms}, compared against the ideal rate. Panels: FCP; RCP (overloaded when new flows arrive); XCP (fairness AIMD)]

Local Stability of FCP
• Random perturbation of [-30,30]% introduced every second for a duration of 100 ms.

Faster Convergence/Accurate Feedback
[Figure: 100Mbps link, fan-out = 5. Plot annotations: "Slower, inaccurate"; "Slowest. Fairness AIMD"]
• FCP converges faster and its feedback is more accurate.
• More results in the paper:
– Fairness, efficiency
– Local stability
– Effectiveness of preloading
– Overhead of price calculation

Budget Determines the Fair-share.
[Diagram: Host A and Host B, each with a budget of 1 $/sec, share a 100Mbps link; fan-out = 50]

Accurate feedback with preloading
• Double the load every RTT from t = 0 to 1 sec.
• RCP: double the # of flows
• FCP (our design): double the budget

End-point Flexibility: Topology
[Diagram: Sender0 (S0), budget 1; Sender1 (S1), budget 2, with a top flow and a bottom flow; Sender2 (S2), budget 1; Receiver0 and Receiver1; one 250 Mbps link, 200 Mbps links, and one 50 Mbps link; round-trip latency: 12 ms]

Budget allocation is up to end-points
[Plot: S1's throughput (Mbps, 0 to 160) vs. budget of S1's top flow ($/sec, 0 to 2). Curves: total, top flow, bottom flow. Marked operating points: equal-budget, max-throughput, equal-throughput]

End-point Diversity: Background flows
[Diagram: Flow0, Flow1, Flow2 (background), Flow3; 50Mbps and 100Mbps links; link latency: 3ms; non-bottleneck link capacity: 200Mbps]

Diversity in Network Pricing
• FCP can also support diverse behaviors in network price generation.
• Examples (in the paper)
– Deadline support [D3 SIGCOMM 11]
– Aggregate resource allocation in a multi-tenant data-center
– Stable bandwidth allocation for streaming
– Multicast congestion control

Deadline Support [D3 SIGCOMM 11]
• Deadline flows use the price preload to specify the desired rate.
• Deadline queue: constant pricing.
• Best-effort queue: variable pricing, Price ($/byte) = Average incoming budget / Remaining link capacity.
[Plots: price vs. time for each queue]

Differential Pricing for Deadline Support
• Constant pricing for deadline flows
• Variable pricing for best-effort (BE) flows
[Plot: price over time, showing a price increase. Legend: best effort (BE), deadline flows, rejected/BE flows]

Packet processing performance
• Per-core FCP throughput: 2.6 Mpps (1.25 Gbps @ 64 bytes)
• Per-core Click baseline: 6.1 Mpps (3.1 Gbps @ 64 bytes)
• About 1/3 of the time is spent on the actual price calculation.
• 1/3 is spent on clock_gettime (can be offloaded to h/w).

Summary
• FCP accommodates diverse behaviors in resource allocation while utilizing explicit feedback.
• FCP maximizes end-point flexibility by simplifying the mechanism of coexistence.
• FCP's explicit feedback and feed-forward provide a generic interface for efficient resource allocation.

Research Theme
• Cooperative model for resource allocation.
• The network knows information that end-points do not know.
• End-points know information that the network does not know.
• Can we benefit from a new interface between the network and the end-points?

Any more examples?

Mobile Networks
• Unique properties:
– Under a single administrative domain
– A complete ecosystem.

• Prof. Song

What is research?
• People have their own style: "diversity"
• What is your "uniqueness"?
• Understanding Tradeoffs in Incremental Deployment of New Network Architectures, CoNEXT 2013
• FCP: A Flexible Transport Framework for Accommodating Diversity, SIGCOMM 2013
• CAMEO: A Middleware for Mobile Advertisement Delivery, MobiSys 2013
• Supporting Long Term Evolution in an Internet Architecture, Ph.D. Thesis
• RPT: Re-architecting Loss Protection for Content-Aware Networks, NSDI 2012
• XIA: Efficient Support for Evolvable Internetworking, NSDI 2012

Conclusion

Misbehaving Hosts
• Detection of misbehaving hosts: expected incoming budget ≠ incoming budget
• Identification: needs per-host state.

Memory Overhead
• The implementation stores 500 ms of history at 10 µs granularity = 50K entries.
[Diagram: a time-ordered buffer of (price ($/byte), incoming budget ($)) entries, running toward the more recent]

Incremental Deployment
[Diagram labels: legacy TCP link, TCP queue, FCP feedback, FCP link, FCP segment]
1. Assign a budget to aggregate TCP flows.
2. TCP flows are treated as a single FCP flow.
3. The mapped FCP flow's rate determines the TCP queue's drain rate.

FCP and RCP

Short Flow's Completion Time

Accurate feedback with preloading
• Double the load every RTT from t = 0 to 1 sec.
• RCP: double the # of flows
• FCP (our design): double the budget

Requirements for CC Diversity (1)
1) Flexible control at the end-points to handle new requirements.
High-level application objectives:
• Maximize throughput with multiple paths
• Support deadlines [sigcomm 2011, 2012]
• Support background flows
• Optimize for Quality of Experience [sigcomm 2013]

Requirements for CC Diversity (2)
1) Flexible control at the end-points (high-level application objectives)
2) Ensure coexistence (fairness and efficiency)

Aggregate Control in a Multi-tenant Data-center
• Two tenants with the same weight share a rack.
• Each of a tenant's flows has an equal budget.
• The network dynamically translates the value.
[Diagram: a rack with 20 servers hosting VM1 to VM40 under a ToR switch; Group 0 (weight 1) uses budgets of 20 $/sec, Group 1 (weight 1) uses budgets of 1 ¥/sec. Link price: 1 €/bit. Price feedback ($): 1 €/bit x exchange rate $/€. Price feedback (¥): 1 €/bit x exchange rate ¥/€]

2b. Feed-forward: Preloading
• A sender raises a flow's budget from 1 $/sec to 10 $/sec (Budget = $10/sec).
• Preload = Desired budget / Current budget - 1
[Diagram: packets from the sender carry PRELOAD and PRICE fields]

2a. Feedback: Pricing
• Feedback: "congestion price" reflecting the "cost" of sending data across the link [Kelly]
• Data packets carry the price across links P1, P2, P3 (PRICE, DATA): total price = p1+p2+p3.
• ACKs echo the price back to the sender (PRICE, ACK): price echoback.
• The sender updates its rate: Rate = budget / price.
• This implements proportional fairness [Kelly].

Deadline Support [D3 SIGCOMM 11]
1) Flows specify the desired rate by preloading.
2) The network admits a flow if it is within its capacity; otherwise the flow gets best-effort service.
– Two-class queuing at the network (deadline, BE)
– Deadline (BE) flows get fixed (variable) pricing.
[Plot legend: best effort; admitted deadline flows; rejected deadline flows]

Faster Convergence/Accurate Feedback
[Figure: 100Mbps link, fan-out = 5. Plot annotations: "Overloaded when new flows arrive"; "Fairness AIMD"]
• FCP converges faster and its feedback is more accurate.

2a. Feedback: Pricing (Price Calculation)
• The router updates the price upon packet reception.
• Instantaneous incoming budget ($) = Price ($/byte) x Bytes received (bytes)
• Averaging window = 2 x AvgRTT
• Incoming budget: I(t) = ∫ price(t - rtt) × bytes(t) dt
• Price ($/byte) = Average incoming budget / Remaining link capacity
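The router-side price calculation above can be sketched as a sliding-window estimator. This is a minimal illustration under stated assumptions, not the FCP implementation: the class `PricedLink` and its fields are hypothetical, the "remaining link capacity" is approximated by a fixed capacity, and the caller supplies price(t - rtt) directly instead of the router looking it up in a stored price history.

```python
from collections import deque


class PricedLink:
    """Hypothetical router-side price estimator for a single link."""

    def __init__(self, capacity, avg_rtt, initial_price):
        self.capacity = capacity      # bytes/sec; stands in for "remaining link capacity"
        self.window = 2.0 * avg_rtt   # averaging window = 2 x AvgRTT (seconds)
        self.price = initial_price    # $/byte
        self.arrivals = deque()       # (arrival time, budget in $) per packet
        self.window_budget = 0.0      # sum of budgets inside the window

    def on_packet(self, now, nbytes, price_at_send):
        """Update the price on packet reception.

        price_at_send plays the role of price(t - rtt): the price the
        sender was reacting to when it sent this packet.
        """
        budget = price_at_send * nbytes
        self.arrivals.append((now, budget))
        self.window_budget += budget
        # Drop packets that have fallen out of the averaging window.
        while self.arrivals and self.arrivals[0][0] <= now - self.window:
            _, old = self.arrivals.popleft()
            self.window_budget -= old
        avg_incoming_budget = self.window_budget / self.window  # $/sec
        self.price = avg_incoming_budget / self.capacity
        return self.price


# A sender with a budget of 20 $/sec reacts to a price of 0.01 $/byte, so it
# sends 2000 bytes/sec into a 1000 bytes/sec link: one 100-byte packet every 0.05 s.
link = PricedLink(capacity=1000.0, avg_rtt=0.5, initial_price=0.01)
for i in range(1, 21):
    price = link.on_packet(now=i * 0.05, nbytes=100, price_at_send=0.01)
print(price)
```

In this toy run the link is offered twice its capacity, and after one full averaging window the price has doubled from 0.01 to about 0.02 $/byte. A sender that applies Rate = budget / price would then halve its rate to match the link.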