Low-Latency Adaptive Streaming Over TCP

Authors: Ashvin Goel (University of Toronto), Charles Krasic (University of British Columbia), Jonathan Walpole (Portland State University)
Presented by: Ameya S. Kulkarni and JongHwa Song

Concept of the Paper

Benefits of TCP for media streaming:
- Congestion-controlled delivery
- Flow control
- Reliable delivery (packet-loss recovery)

Problem: TCP introduces latency at the application level.
Proposed solution: an adaptive send-buffer-size tuning technique.

Agenda

- The first two sections present the challenge
- The next section analyzes the ways in which TCP introduces latency
- An adaptive send-buffer technique for reducing TCP latency
- Effect on throughput, and the tradeoff between network throughput and latency
- Implementation
- A real streaming application example
- Justification of the benefits

1. The Challenge

- An application must adapt media quality in response to TCP's estimate of current bandwidth availability.
- Adaptive streaming applications use prioritized data dropping and dynamic rate shaping.
- With standard TCP, the application must make adaptation decisions far in advance of data transmission.
- Result: adaptation is unresponsive and performs poorly as the available bandwidth varies over time.

2. TCP-Induced Latency

End-to-end latency consists of:
1. Application-level latency
2. Protocol latency

TCP introduces protocol latency in three ways:
1. Packet retransmission
2. Congestion control
3. Sender-side buffering

TCP Congestion Window

- Congestion window (CWND) = the maximum number of distinct, unacknowledged packets in flight.
- The send buffer keeps copies of the packets in flight.
- Throughput of a TCP stream ≈ CWND / RTT (packets per second).

How TCP induces latency:
1. Packet retransmission: at least one RTT of delay
2. Congestion control: at least 1.5 RTTs of delay
3. Sender-side buffering: delay for packets queued behind the packets in flight

3. Adaptive Send-Buffer Tuning

- Latency is introduced because packets sit blocked in the send buffer behind the packets in flight.
- Idea: reduce the send-buffer size to CWND.
- The send-buffer size should never be less than CWND.
- Tune the send-buffer size to follow CWND as it changes.
- Since CWND changes dynamically over time, the technique is called adaptive send-buffer tuning, and a TCP connection using it is called a MIN_BUF TCP flow.

MIN_BUF TCP

- Blocks the application from writing data to the socket when there are already CWND packets in the send buffer.
- The application can write a packet to the socket only when an ACK arrives and the window opens up.
- MIN_BUF TCP thus moves the latency due to blocked packets up to the application level.
- The application gains much greater control over sending time-critical data.

4. Evaluation

- Forward-path congestion topology
- Reverse-path congestion topology
- Comparison of latencies (figures)

Other Factors Affecting Latency

- Experiments were repeated with smaller bandwidth (10 Mb/s at the router) and with smaller round-trip times (25 ms and 50 ms).
- Observations:
  1. Latency increases as the available bandwidth decreases.
  2. Latency decreases with lower RTT.

Experiments with ECN

- ECN = Explicit Congestion Notification.
- MIN_BUF TCP with ECN still performs better than standard TCP flows.
- TCP with ECN but without MIN_BUF drops fewer packets, yet still suffers large delays because of blocked packets in the send buffer.
- Thus MIN_BUF TCP gives the application more control and flexibility over what data is sent and when it is sent, e.g., high-priority data first.

Next, we turn to TCP throughput. A user-space sketch of the buffer-tuning idea appears below.
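Aside: the paper's MIN_BUF mechanism lives inside the kernel (see the implementation section). Purely as an illustration of the buffer-tuning idea, a user-space approximation on Linux could periodically shrink SO_SNDBUF toward CWND * MSS using the standard TCP_INFO socket option. This is our sketch, not the paper's implementation; the helper name tune_sndbuf is ours, and SO_SNDBUF can only approximate the kernel's exact per-packet limit.

```c
/* Sketch: user-space approximation of adaptive send-buffer tuning.
 * Reads the current congestion window via the standard Linux
 * TCP_INFO option and shrinks SO_SNDBUF toward CWND * MSS.
 * The helper name tune_sndbuf() is ours, not from the paper. */
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

static int tune_sndbuf(int fd)
{
    struct tcp_info ti;
    socklen_t len = sizeof(ti);

    if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &ti, &len) < 0)
        return -1;

    /* CWND is reported in segments; convert to bytes. */
    int target = (int)(ti.tcpi_snd_cwnd * ti.tcpi_snd_mss);

    /* Never let the buffer drop below CWND, or TCP cannot fill its
     * window and throughput suffers. Note: the kernel doubles the
     * value passed to SO_SNDBUF and enforces a floor, so this is an
     * approximation, not the exact MIN_BUF limit. */
    return setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &target, sizeof(target));
}
```

Calling tune_sndbuf() once per ACK-paced write loop roughly tracks CWND; the kernel implementation in the paper avoids this polling entirely.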
5. Effect on Throughput

Consideration: latency and throughput together.
- When an ACK arrives, standard TCP already has packets queued in the send buffer and can send a new packet immediately.
- MIN_BUF TCP must wait until the application writes the next packet, losing a send opportunity.
- Solution: adjust the send-buffer size to be slightly larger than CWND, accounting for the events below.

Event A: ACK Arrival
- Standard TCP: when an ACK arrives for the first packet in the TCP window, the window admits a new packet.
- MIN_BUF TCP: buffer one additional packet so it can be sent immediately on ACK arrival.

Event B: Delayed ACKs
- Purpose: to save bandwidth in the reverse direction.
- Function: the receiver sends one ACK for every two data packets received, so each ACK arrival opens TCP's window by two packets.
- MIN_BUF TCP: buffer two additional packets instead of one.

Event C: CWND Increase
- TCP increments CWND by one packet every round-trip time.
- An ACK arrival can then release two packets; combined with delayed ACKs, up to three additional packets may be released at once.
- The byte-counting algorithm mitigates the impact of delayed ACKs on the growth of CWND.

Event D: ACK Compression
- ACKs queued at routers can arrive at the sender in a burst.
- Worst case: the ACKs for all CWND packets arrive together.
- MIN_BUF TCP would need 2 * CWND buffered packets to exploit this case (standard TCP's large default send buffer already covers it).

Event E: Dropped ACKs
- When the reverse path is congested, ACK packets are dropped.
- A later cumulative ACK then acknowledges more than two packets, which acts much like ACK compression.

MIN_BUF(A, B) Streams

- Generalize the send-buffer limit to A * CWND + B packets (A ≥ 1, B ≥ 0):
  - A handles any bandwidth reduction caused by ACK compression and dropped ACKs.
  - B takes ACK arrivals, delayed ACKs, and CWND increase into account.
- With A ≥ 2 and B ≥ 1, the number of packets TCP can send on each ACK arrival is unaffected, so throughput is comparable between MIN_BUF TCP and standard TCP.
- Tradeoff between A and B: every additional blocked packet increases latency.
- Notation: MIN_BUF(A, B); the default configuration is MIN_BUF(1, 0).

Throughput Evaluation

Configurations:
- MIN_BUF(1,0): the original scheme.
- MIN_BUF(1,3): takes ACK arrivals, delayed ACKs, and CWND increase into account.
- MIN_BUF(2,0): takes ACK compression and dropped ACKs into account.

Latency graphs: the x-axis is protocol latency in milliseconds; the y-axis is the percentage of packets that arrive at the receiver within a delay threshold.
- 160 ms threshold: the requirement of interactive applications such as video conferencing.
- 500 ms threshold: the requirement of media-control operations.
- Each experiment was performed 8 times, with latencies accumulated over all runs.

Forward-path topology, packets delayed beyond the 160 ms threshold:
- MIN_BUF(1,0) and MIN_BUF(1,3): less than 2%
- MIN_BUF(2,0): 10%
- Standard TCP: 30%

Reverse-path topology (ACK drops slightly increase delay), 160 ms threshold:
- MIN_BUF(1,0) and MIN_BUF(1,3): less than 10%
- Standard TCP: 40%

Normalized throughput:
- MIN_BUF(2,0): close to standard TCP.
- MIN_BUF(1,0): the least throughput, since TCP has no new packet in the send buffer after each ACK is received.
- MIN_BUF(1,3): about 95%, achieving both low latency and good throughput.

System Overhead

- Writing data to the kernel: MIN_BUF incurs higher system overhead because more system calls are invoked to transfer the same amount of data. MIN_BUF TCP writes one packet at a time, while standard TCP writes several packets at a time and amortizes the context-switching overhead; MIN_BUF(1,0)'s write overhead is only slightly higher.
- Poll calls: MIN_BUF(1,0) has significantly more overhead here. Standard TCP issues a poll call after every ~14 writes, whereas MIN_BUF(1,0) polls before every write; the ratio between the two is 12.66.
- Total CPU time: MIN_BUF(1,0) uses about three times as much, as a result of its fine-grained writes. Larger values of the MIN_BUF parameters reduce this overhead.

The implementation section states the send-buffer limit precisely; a sketch of that computation follows.
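The limit is compact enough to state as code. A minimal sketch, assuming CWND is measured in segments as in the kernel; the function name min_buf_limit_segments is ours, not the kernel patch's:

```c
/* Sketch: the MIN_BUF(A, B) send-buffer limit in segments, as the
 * implementation section describes it: A * CWND + MIN(B, CWND).
 * A (>= 1) absorbs ACK compression and dropped ACKs; B (>= 0) covers
 * ACK arrivals, delayed ACKs, and CWND growth. The defaults are
 * A = 1, B = 0; the paper also evaluates MIN_BUF(1,3) and MIN_BUF(2,0). */
static unsigned int min_buf_limit_segments(unsigned int cwnd,
                                           unsigned int a,
                                           unsigned int b)
{
    unsigned int cushion = (b < cwnd) ? b : cwnd;   /* MIN(B, CWND) */
    return a * cwnd + cushion;   /* always >= CWND, since a >= 1 */
}
```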
6. Implementation

- The MIN_BUF TCP approach requires only a small modification to the Linux 2.4 kernel, exposed through a new SO_TCP_MIN_BUF socket option.
- The option limits the send buffer to A * CWND + MIN(B, CWND) segments (segments are packets of maximum segment size, or MSS).
- The send-buffer size is always at least CWND, because A must be an integer greater than zero and B is zero or larger.
- The defaults are A = 1 and B = 0.

A. SACK Correction

- A "sacked_out" term is added to the limit: A * CWND + MIN(B, CWND) + sacked_out.
- The sacked_out term is maintained by a TCP SACK sender and is the number of selectively acknowledged packets.
- This ensures that the send-buffer limit includes the whole window and is thus at least CWND + sacked_out.
- Without this correction, TCP SACK is unable to send new packets for a MIN_BUF flow and assumes that the flow is application-limited.

B. Alternate Application-Level Implementation

- The application would stop writing data when the socket buffer reaches a threshold fill level of packets.
- The problem with this approach is that the application has to poll the socket fill level.
- Polling is potentially expensive in terms of CPU consumption and inaccurate, since the application is not informed immediately when the socket fill level drops below the threshold.

C. Application Model

- MIN_BUF TCP applications should explicitly align their data to network packets. This has two benefits:
  1. It minimizes any latency due to coalescing or fragmenting of packets below the application layer.
  2. It ensures that low-latency applications are aware of the latency cost and throughput overhead of coalescing or fragmenting application data into network packets.
- For alignment, an application should write maximum-segment-size (MSS) packets on each write.
- The TCP_CORK socket option in Linux improves throughput and does not affect protocol latency. (A sketch of this write pattern appears after the Questions slide.)

7. Application-Level Evaluation

- Goal: evaluate the timing behavior of a real live-streaming application and show that MIN_BUF TCP helps improve end-to-end latency.
- Qstream, an open-source adaptive streaming application, provides:
  - an adaptive media format
  - an adaptation mechanism
- Adaptive media format: SPEG (scalable MPEG), a variant of MPEG-1 that supports layered encoding of video data, which allows dynamic data dropping.
- Adaptation mechanism: PSS (priority-progress streaming). The key idea is an adaptation period, which determines how often the sender drops data. Within each adaptation period, the sender sends data packets in priority order, from the highest priority to the lowest.

Evaluation methodology and results (figures)

8. Conclusions

- Tuning TCP's send buffer enables low-latency streaming over TCP, showing that sender-side buffering has a significant effect on latency at the application level.
- Buffering a few extra (blocked) packets helps recover throughput without increasing protocol latency.
- Layered media encoding was used for the evaluation, reducing both end-to-end latency and the variation in media quality.

Questions
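Backup: MSS-Aligned Writes (Sketch)

To make the application model of Section 6C concrete: a minimal sketch under Linux, where TCP_CORK and TCP_MAXSEG are standard socket options, while the helper name write_aligned and the fallback MSS value are our own illustration, not code from the paper.

```c
/* Backup sketch: write application data one MSS per write() call,
 * with TCP_CORK to coalesce any short tail into a full segment.
 * write_aligned() is our name; the fallback MSS is an assumption. */
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

static ssize_t write_aligned(int fd, const char *buf, size_t len)
{
    int mss = 1448;                 /* fallback if TCP_MAXSEG fails */
    socklen_t mlen = sizeof(mss);
    getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &mss, &mlen);

    int on = 1, off = 0;
    /* Corked: the kernel holds partial segments, sends full ones. */
    setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));

    ssize_t sent = 0;
    while ((size_t)sent < len) {
        size_t chunk = (len - sent < (size_t)mss) ? len - sent : (size_t)mss;
        ssize_t n = write(fd, buf + sent, chunk);   /* one MSS per write */
        if (n < 0)
            return -1;
        sent += n;
    }

    /* Uncorking flushes any remaining partial segment immediately. */
    setsockopt(fd, IPPROTO_TCP, TCP_CORK, &off, sizeof(off));
    return sent;
}
```

Writing one MSS per call keeps the application aware of the per-packet latency cost, matching the application model above; corking only affects how a non-aligned tail is flushed.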