Network Research and Linux at the Hamilton Institute, NUIM. David Malone

advertisement
Network Research and Linux at the
Hamilton Institute, NUIM.
David Malone
4 November 2006
1
What has TCP ever
done for Us?
• Demuxes applications (using port numbers).
• Makes sure lost data is retransmitted.
• Delivers data to application in order.
• Engages in congestion control.
• Allows OOB data.
• Some weird stuff with TCP options.
2
Standard Picture of TCP
Connector
Connector
Listener
No Listener
SYN
SYN
RST
SYN+ACK
ACK
ACKs
Data
FIN
High Port
(Usually)
FIN
High Port
(Usually)
Known Port
(Usually)
3
Known Port
(Usually)
Other Views of TCP
Packets In
Data In
Socket
Magic
Socket
Data Out
Packets Out
(at link rate)
Programmer’s View
4
Network View
Congestion Control
• TCP controls number of packets in network.
• Packets are acknowledged, so flow of ACKs.
• Receiver advertises window to avoid overflow.
• Congestion window tries to adapt to network.
• Slow start to roughly find capacity.
• Congestion avoidance gradually adapts.
5
The Congestion Window
k'th congestion
event
wi
w i(k+1)
w i(k)
t a (k)
tb (k)
t c (k)
Time (RTT)
k'th congestion epoch
• Additive increase, multiplicative decrease (AIMD).
• To fill link need to reach BW × τ .
• Backoff by 1/2, implies buffer is BW × τ .
• Fairness, responsiveness, stability, . . .
6
TCP On Linux
• Network stack buffers in-flight data.
• Socket buffer must be BW × τ .
• /proc/net/core/{r,w}mem_max → sockbuf sizes.
• /proc/net/ipv4/tcp_{r,w,}mem → min/def/max
tcp window.
• Trade off — memory is wired, so valuable.
• Defaults have recently been increased.
7
Research Work
• Packet loss not caused by congestion.
• Filling big BW × τ product packet at a time.
• Bad for long-distance high-bandwidth links.
• Various solutions in pipeline (BIC, Scalable,
High-Speed, FAST, H-TCP).
• Pluggable congestion control in Linux (behind
TCP_CONG_ADVANCED).
• /proc/sys/net/ipv4/tcp_congestion_control
• Working on other congestion detection techniques.
8
Throughput
24000
24000
Throughput Flow 1
Throughput Flow 2
Throughput Flow 1
Throughput Flow 2
22000
22000
20000
20000
18000
16000
Throughput (KBytes)
Throughput (KBytes)
18000
14000
12000
10000
16000
14000
12000
10000
8000
8000
6000
6000
4000
2000
4000
0
200
400
600
800
1000
0
time (sec)
200
400
600
time (sec)
9
800
1000
Cwnd
1200
1200
Cwnd flow 1
Cwnd flow 2
1000
1000
800
800
Cwnd (packets)
Cwnd (packets)
Cwnd flow 1
Cwnd flow 2
600
600
400
400
200
200
0
0
0
200
400
600
800
1000
0
time (sec)
200
400
600
time (sec)
10
800
1000
Practical Stuff
• Other issues at play, such as implementation quality.
• For example ACK processing and queueing problems.
• Testing is important: land speed records.
• Project with OSDL to build validation suite.
11
Before
25000
4.5e+09
snd_cwnd
snd_nxt
throttle
slow path
4e+09
20000
1000
3.5e+09
2.5e+09
2e+09
10000
Distribution
3e+09
15000
Bytes (sequence)
Segments (cwnd, ssthresh)
10000
100
10
1.5e+09
1e+09
5000
1
5e+08
0
0
100
200
300
400
500
600
0
700
0.1
0.1
time (seconds)
1
10
100
microsecond
12
1000
10000
After
10000
slow path
4e+09
1000
3.5e+09
3e+09
15000
2.5e+09
2e+09
10000
Distribution
20000
Segments (cwnd, ssthresh)
4.5e+09
snd_nxt
snd_una
snd_cwnd
snd_ssthresh
qlen
throttle
Bytes (sequence)
25000
100
10
1.5e+09
1e+09
5000
1
5e+08
0
0
100
200
300
400
500
600
0
700
0.1
0.1
time (seconds)
1
10
100
microsecond
13
1000
10000
802.11(e) MAC
Summary
• After TX choose rand(0, CW − 1).
• Wait until medium idle for DIFS(50µs),
• While idle count down in slots (20µs).
• TX when counter gets to 0, ACK after SIFS (10µs).
• If ACK then CW = CWmin else CW∗ = 2.
Ideally produces even distribution of packet TX.
In 11e have multiple queues. Each has own CWmin ,
DIFS(aka AIFS) and can have TXOP.
14
Testbed setup
Multiple STA (Linux) connected to AP (Linux hostap).
Hardware
model
1× AP
Desktop PC
18× STA
Soekris
1× STA
Desktop PC
WLAN NIC Atheros AR5212
External antenna, PCI interface, Madwifi driver with
local patches for 11e parameter setting.
15
16
Validation
Relative Throughput against TXOP parameter
Relative Throughput against AIFS parameter
5.5
30
Analytical Model
Experimental Results
5
25
4
Relative Throughput
Relative Throughput
4.5
3.5
3
2.5
20
15
10
2
5
1.5
1
0
1000
2000
3000
4000
TXOP in unit of us
5000
6000
0
7000
Relative Throughput against CWmin
Analytical Model
Experimental Results
16
Relative Throughput
14
12
10
8
6
4
2
0
−3
−2
−1
0
1
2
3
CWmin Shift (0 corresponds to the default CWmin 32)
5
10
15
AIFS in unit of 20us timeslot
20
25
Measure relative performance of two saturated
flows
while
varying
TXOP,
AIFS
and
CWmin .
Compare to
well-known models.
20
18
0
4
17
Before
12 TCP Uploads to AP for 180 sec
6 Uploads and 6 Downloads for 180 sec
1200
1200
1000
800
Throughput(kbits/sec)
Throughput(kbits/sec)
1000
600
400
200
0
800
600
400
200
1
2
3
4
5
6
7
8
STA id
9
10
11
0
12
18
1
2
3
4
5
6
7
8
STA id
9
10
11
12
After
12 TCP Uploads to AP for 180 sec
6 Uploads and 6 Downloads
600
700
600
500
Throughput(kbits/sec)
Throughput(kbits/sec)
500
400
300
200
400
300
200
100
0
100
1
2
3
4
5
6
7
8
STA id
9
10
11
0
12
19
1
2
3
4
5
6
7
8
STA id
9
10
11
12
Unprioritised Voice
Throughput
Delay
70
30000
60
Average Delay (seconds x 10-6)
25000
50
Throughput (kbps)
Unpriotised Station
Delay bound for queue stability
40
30
20
20000
15000
10000
5000
10
Unpriotised Station
Ideal Throughput
90% Throughput
0
0
2
4
6
0
8
10
12
14
16
18
Number of competing stations
0
2
4
6
8
10
12
Number of competing stations
20
14
16
18
Measuring Delay
• Want to measure one-way MAC delay.
• NTP slow and insufficiently accurate.
• Simultaneously observable TX better, largish noise.
0.1
station 1
station 2
station 3
station 4
Offset (s)
0.05
0
-0.05
-0.1
0
500
1000
1500
Time (s)
21
2000
2500
Delay Technique
• Transmission not complete until MAC ACK.
• Hardware supports interrupt after ACK.
Interface TX
Queue
1. Driver notes
enqueue time.
Driver
4. Driver notes
completion time.
Driver TX
Discriptor
Queue
2. Hardware
contends until
ACK received
Packet transmitted
Hardware
ACK received
22
3. Hardware
interrupts
driver.
Validation
1600
Median delay (seconds x 10-6)
1400
1200
1000
800
600
400
Long preamble
Fit line: 11.0012Mbps x + 484.628us
Short preamble
Fit line: 11.0016Mbps x + 292.565us
200
0
0
200
400
600
800
1000
1200
Packet size (bytes, excluding headers)
23
1400
0.025
0.16
0.14
0.02
Fraction of total packets
Fraction of Packets
0.12
0.015
0.01
0.1
0.08
0.06
0.04
0.005
0.02
0
900
1000
1100
1200
1300
1400
1500
1600
0
900
0.025
0.25
0.02
0.2
Fraction of Packets
Fraction of total packets
Delay (seconds x 10-6)
0.015
0.01
0.005
1000
1100
1200
1300
Delay (seconds x 10-6)
1400
1500
1600
0.15
0.1
0.05
0
900
1000
1100
1200
1300
1400
1500
1600
1700
0
800
1800
Delay (seconds x 10-6)
900
1000
1100
1200
1300
Delay (seconds x 10-6)
24
1400
1500
1600
AIFS Impact
Throughput
Delay
70
30000
60
Average Delay (seconds x 10-6)
25000
50
Throughput (kbps)
Unpriotised Station
Competitors AIFS = 4
Competitors AIFS = 6
Delay bound for queue stability
40
30
20
Unpriotised Station
Competitors AIFS = 4
Competitors AIFS = 6
Ideal Throughput
90% Throughput
10
0
0
2
4
6
20000
15000
10000
5000
0
8
10
12
14
16
18
Number of competing stations
0
2
4
6
8
10
12
Number of competing stations
25
14
16
18
Download