Investigating Spurious TCP Timeouts in 802.11 David Malone, Douglas J. Leith, Anshuman Aggarwal, Ian Dangerfield 17 June 2008 1 TCP over WLAN Woes 12 uploads fixed 12 uploads 12 TCP Uploads to AP for 180 sec 12 TCP Uploads to AP for 180 sec 1200 600 500 800 Throughput(kbits/sec) Throughput(kbits/sec) 1000 600 400 200 0 400 300 200 100 1 2 3 4 5 6 7 8 STA id 9 10 11 0 12 2 1 2 3 4 5 6 7 8 STA id 9 10 11 12 Spurious Timeouts? • Believed that spurious timeouts a problem. • Once AP fixed, don’t see many problems. • Fully understand out fix. • Investigate if there are spurious timeouts. • What sort of tweaking needed? • Office/hotspot usage. 3 Previous Work • Gurtov (01): delay spikes (radio coverage, handover, prio). • Korhonen (01): Problems in early GSM/GPRS/. . . • Fotiadis (05): Test delay spikes in 802.11. • Vacirca (06) and Ricciato (07): Modern GPRS/3G. Fixes: drive up RTO, Eifel Algorithm, F-RTO. 4 Testbed STAs AP Dummynet Uploads, WLAN Buffer 20+19, retries 11, fixed rate, long preamble, fast ACKs. 5 Busted Case: 4 Uploads 120000 700 Observed RTT for monitored STA rto Observed RTT for monitored STA rto 600 100000 500 RTT (ms) RTT (ms) 80000 60000 400 300 40000 200 20000 100 0 0 0 20 40 60 time (s) 80 100 120 0 20 40 60 time (s) RT O = srtt + max(200ms, 4rttvar) Basically no MAC losses. 6 80 100 120 Busted Case: AP Queue 20 20 Observed Driver Queue Observed Driver Queue 18 18 14 AP Driver Queue Occupancy AP Driver Queue Occupancy 16 12 10 8 6 16 14 12 4 10 2 0 8 0 20 40 60 time (s) 80 100 120 0 Lost whole flights of ACKs. Hard for Eifel/F-RTO to help. 7 20 40 60 time (s) 80 100 120 Fixed Case 500 18 Observed RTT for monitored STA rto Observed Driver Queue 450 16 400 14 AP Driver Queue Occupancy 350 RTT (ms) 300 250 200 150 12 10 8 6 4 100 2 50 0 0 0 20 40 60 time (s) 80 100 120 0 20 40 60 time (s) 80 100 120 No timeouts, still basically no MAC losses. Less variable? 8 Clearance 0.045 priotised overhead unpriotised overhead 0.04 0.035 Observed Frequency 0.03 0.025 0.02 0.015 0.01 0.005 0 100 150 200 250 Clearance (ms) 300 350 Both safely positive. Trade-off between STA and AP draintimes. 9 400 noprior prior 160 10 drain time scaled queue size drain time scaled queue size 9 140 8 120 7 drain time (ms) drain time (ms) 100 80 60 6 5 4 3 40 2 20 AP 1 0 0 0 5 10 15 20 0 5 time (s) 10 15 20 time (s) 200 300 drain time scaled queue size drain time scaled queue size 180 250 160 140 drain time (ms) drain time (ms) 200 120 100 80 150 100 60 40 50 20 STA 0 0 0 5 10 15 20 time (s) 0 5 10 time (s) 10 15 20 Trying Harder • Maybe 4 uploads not hard enough? • Other sources of RTT varience? • Mobility. • Varying number of users. • Varying physical rate. 11 More users Change from 4 to 8, 12 and 16. 3000 3000 3000 Observed RTT for monitored STA rto Observed RTT for monitored STA rto 2500 2500 2000 2000 2000 1500 RTT (ms) 2500 RTT (ms) RTT (ms) Observed RTT for monitored STA rto 1500 1500 1000 1000 1000 500 500 500 0 100 120 140 160 time (s) 180 200 0 100 120 140 160 time (s) 180 200 0 100 120 140 160 180 200 time (s) RTTs increase: 200ms, 400ms, 600ms and 900–1000ms. Collision prob also rising: 18%, 35%, 47% and 54%. 12 More users: close up 3000 3000 3000 Observed RTT for monitored STA rto Observed RTT for monitored STA rto 2500 2500 2000 2000 2000 1500 RTT (ms) 2500 RTT (ms) RTT (ms) Observed RTT for monitored STA rto 1500 1500 1000 1000 1000 500 500 500 0 120 122 124 126 time (s) 128 130 0 120 122 124 126 128 130 0 120 122 time (s) No timeouts (but one failed fast retransmit). 13 124 126 time (s) 128 130 Harsh Rate Control Change from 11 to 5.5, 2 and 1. 3000 3000 3000 Observed RTT for monitored STA rto Observed RTT for monitored STA rto 2500 2500 2000 2000 2000 1500 RTT (ms) 2500 RTT (ms) RTT (ms) Observed RTT for monitored STA rto 1500 1500 1000 1000 1000 500 500 500 0 100 120 140 160 time (s) 180 200 0 100 120 140 160 180 200 0 100 time (s) Sharp increase, in line with expectations. 14 120 140 160 time (s) 180 200 Close Up 3000 3000 3000 Observed RTT for monitored STA rto Observed RTT for monitored STA rto 2500 2500 2000 2000 2000 1500 RTT (ms) 2500 RTT (ms) RTT (ms) Observed RTT for monitored STA rto 1500 1500 1000 1000 1000 500 500 500 0 120 122 124 126 time (s) 128 130 0 120 122 124 126 128 130 0 120 time (s) No timeouts even though RTT quite long. 15 122 124 126 time (s) 128 130 Why Uploads? 12 uploads fixed 12 uploads 6 Uploads and 6 Downloads for 180 sec 6 Uploads and 6 Downloads 1200 700 600 1000 Throughput(kbits/sec) Throughput(kbits/sec) 500 800 600 400 400 300 200 200 0 100 1 2 3 4 5 6 7 8 STA id 9 10 11 0 12 16 1 2 3 4 5 6 7 8 STA id 9 10 11 12 Conclusions • Didn’t find spurious timeouts. (even after trying hard.) • Need to check up/down and request/response. • Buffer size interesting. • ARP over busy WLAN busted. 17