インタネット工学第4回プレゼン資料

advertisement
トランスポートレイヤ技術
- TCP; Transmission Control Protocol (1) TCPの基本動作
(2) TCP Interactive Data Flow
(3) TCP Bulk Data Flow
(4) TCP Data Retransmission
(5) TCP Persist Timer
(6) TCP Keep Alive Timer
(7) その他・将来
1
ping telnet ftp
X
traceroute
tftp bootp smtp
TCP
ICMP
NFS/RPC
UDP
IP
IGMP
Data-Link
Ethernet
FDDI
SDH
FR
ATM
2
インターネットアーキテクチャ
- TCP : Transmission Control Protocol • TCP (Transmission Control Protocol) ; end-to-end
– フロー制御
– エラー制御 / 再送制御
– コネクション管理
– セッションの多重化
Application
Application
TCP
TCP
IP
IP
IP
Network
Interface
Network
Interface
Network
Interface
Physical
Physical
Physical
3
トランスポートレイヤ技術
- TCP; Transmission Control Protocol (1) TCPの基本動作
(2) TCP Interactive Data Flow
(3) TCP Bulk Data Flow
(4) TCP Data Retransmission
(5) TCP Persist Timer
(6) TCP Keep Alive Timer
(7) その他・将来
4
TCP Header Format
0
0
78
15 16
23 24
31
source port identifier
destination port identifier
20 Bytes
1
sequence number
2
3
ACK number
5
6
Offset(4)
Rsrvd(6)
control bits
UR
AK
PH
RT
SY
FN
4
checksum
window size
Urgent Pointer
Option
Padding
5
TCP Header Format
UR
AK
PH
RT
SY
FN
: Urgent Pointer Field Significant (URG)
: Acknowledgement Field Significant (ACK)
: Push Function
: Reset the Connection
: Synchronize Sequence Numbers (SYN)
: No More Data From Sender (FIN)
6
TCP Features
・ “Stream” Oriented Data Transmission
→ Connection確立(Three-way-handshake)
・ Connection (“Stream”) Identifier = “Socket”
{dst_IP_addr, dst_port, src_IP_addr, src_port}
・ “Sequence Number” ; 32 bits
→ バイト番号 : 0 − (2^32-1)
→ 2^32 でSequence NumberがWrapされる
・ “Full-Duplex”での通信
・ Acknowledgement (ACK) ;
→ 次に受信すべきバイト番号(SN)の通知
・ エラー回復: セグメント再送(Segment retransmission)
by Time-out, Dupilicated-ACK
・ “Sliding Window Control” を用いたデータ転送制御
(*) Window_size ≦ 65,535 Bytes
7
TCP Port Allocation (RFC1700)
1. Well-Known Ports
;
0 - 1,023
2. Registered Ports
; 1,024 - 49,151
3. Dynamic and/or Private Ports ; 49,152 - 65,535
最新情報 :
ftp://ftp.isi.edu/in-notes/iana/assignments/port-numbers
8
TCP Well-Known Ports
Port Number Keyword
5
20
21
23
25
39
53
63
67
69
70
79
80
110
111
119
rje
ftp-data
ftp
telnet
smtp
rlp
domain
whois++
bootp
tftp
gopher
finger
http
pop3
sunrpc
nntp
Application
Remote Job Entry
File Transfer [Default data]
File Transfer [Control]
Telnet
Simple Management Protocol
Resource Location Protocol
Domain Name Server
Whois++
Bootstrap Protocol Server
Trivial File Transfer
Gopher
Finger
World Wide Web HTTP
Post Office Protocol - Version 3
SUN Remote Procedure Call
Network News Transfer Protocol
9
TCP Well-Known Ports
Port Number Keyword
123
137
138
139
179
202
213
220
396
540
546
547
560
ntp
netbios-ns
netbios-dgm
netbios-ssn
bgp
at-nbp
ipx
imap3
netware-ip
uucp
dhcpv6-client
dhcpv6-server
rmonitor
Application
Network Time Protocol
NetBIOS Name Service
NetBIOS Datagram Service
NetBIOS Session Service
Border Gateway Protocol (BGP)
AppleTalk Name Binding Protocol
IPX
IMAP3 (Interactive Mail Access Protocol)
Novell Netware over IP
uucp daemon
DHCPv6 Client
DHCPv6 Server
remote monitor daemon
10
TCP Connection確立/開放
Log on the console;
svr4% telnet bsdi discard
Trying 140.252.13.35
Connected to bsdi.
Escape character is ‘^]’.
^]
telnet> quit
Connection closed.
#port=“9” (server discard packet)
tcpdump output
1 0.0
2 0.024 (0.0024)
3 0.007 (0.0048)
4 4.155 (4.1482)
5 4.158 (0.0013)
6 4.159 (0.0014)
7 4.189 (0.0225)
svr4.1037 > bsdi.discard: S 14155.14155(0)
win 4096 <mss 1024>
bsdi.discard > svr4.1037: S 18239.18239(0)
ack 14156 win 4096 <mss 1024>
svr4.1037 > bsdi.discard: . ack 18240 win 4096
svr4.1037 > bsdi.discard: F 14156:14156(0)
ack 18240 win 4096
bsdi.discard > svr4.1037: . ack 14157 win 4096
bsdi.discard > svr4.1037: F 18240.18240(0)
ack 14157 win 4096
svr4.1037 > bsdi.discard: . ack 18241 win 4096
TCP
Connection確立/開放
tcpdump output
1 0.0
2 0.024 (0.0024)
3 0.007 (0.0048)
4 4.155 (4.1482)
5 4.158 (0.0013)
6 4.159 (0.0014)
7 4.189 (0.0225)
svr4.1037 > bsdi.discard: S 14155.14155(0)
win 4096 <mss 1024>
bsdi.discard > svr4.1037: S 18239.18239(0)
ack 14156 win 4096 <mss 1024>
svr4.1037 > bsdi.discard: . ack 18240 win 4096
svr4.1037 > bsdi.discard: F 14156:14156(0)
ack 18240 win 4096
bsdi.discard > svr4.1037: . ack 14157 win 4096
bsdi.discard > svr4.1037: F 18240.18240(0)
ack 14157 win 4096
svr4.1037 > bsdi.discard: . ack 18241 win 4096
[意味]
source.port > destination.port : flags SN_begin.SN_end(data_size)
flags : S = SYN ; Synchronize sequence_number(SN)
F = FIN ; Finish data transmission
R = RST ; Reset connection
P = PSH ; push data to receiving process asap
. =
; none of above four flags is on
SN_end = SN_begin + data_size
win 4096 ; window size is 4096
mss 1024 ; maximum segment size is 1024 bytes
12
TCP Connection確立/開放
svr4.1037 (client)
bsdi.discard(server)
segment 1
SYN 14155.14155(0)
SYN 18239.18239(0)
ACK 14156
segment 3
(18239+1)
“次に受信すべきSN”
segment 4
ACK 18240
FIN 14156:14156(0)
ACK 18240
segment 5 (14156+1)
“次に受信すべきSN”
segment 6
ACK 14157
FIN 18240.18240(0)
segment 7
(18240+1)
“次に受信すべきSN”
segment 2 (14155+1)
“次に受信すべきSN”
ACK 14157
ACK 18241
13
TCP Connection確立/開放
svr4.1037 (client)
“Active open”
SYN (a)
bsdi.discard(server)
(appli. open : telnet)
SYN_ACK(a+1,b)
“open”
“Passive open”
ACK(b+1)
“open”
“Active Close”
(application close: quit)
FIN (m,s)
EOF to Application
ACK (m+1)
“half close”
“Passive Close”
FIN_ACK (m+1,s)
ACK (s+1)
(application close)
“half close”
→ full close 14
CLOSED
appl: passive open
send: <nothing>
appl: active open
send: SYN
LISTEN
recvl: SYN
send: SYN, ACK
appl: close
send: FIN
passive open
appl: send data
Send : RST
send: SYN
Active open
recv: SYN
SYN_RCVD
SYN_SENT appl: close
send: SYN,ACK
or timeout
(simultaneous open)
recv: SYN,ACK
recv: ACK
send: ACK
send: <nothing>
appl: close
send: FIN
FIN_WAIT_1
recv: ACK
send: <nothing>
ESTABLISHED
recv: FIN
send: ACK
recv: FIN,ACK
send: ACK
FIN_WAIT_2
recv: FIN
send: ACK
Active close
recv: FIN
send: ACK
simultaneous close
CLOSE_WAIT
appl: close
send: FIN
CLOSING
recv: ACK
send: <nothing>
TIME_WAIT
2 MSL timeout
LAST_ACK
Passive close
recv: ACK
send: <nothing>
TCP Layer Interfaces
開始
指示
データ
送信
OPEN
SEND
Session
確立
TCP
IP
データ
送信
データ
受信
Session
情報
RECEIVE STATUS
データ
受信
異常終了
指示
正常終了
指示
ABORT CLOSE
Session
廃棄
Session
開放
Send (Service_Type, TTL, 擬似ヘッダ)
Recieve
send
receive
16
TCP Layer Interfaces
(1) OPEN Call :
機能 ; コネクション開始の指示
引数 ; Local_port, Destination_socket, Open_Mode(Active/Passive),
[timeout_value]、[Priority], [security], [Options]
戻り値 ; Local_Connection_Name
TCP動作 ; LISTEN(Passive_Open)、 ESTABLISHED(Active_Open)
System calls; - socket(pf, type, protocol)
- bind(socket, localaddr, adddrlen)
- connect(socket, destaddr, addrlen)
(2) SEND Call :
機能 ; データの送信指示
引数 ; Local_Connection_Name, 送信データバッファアドレス、
送信データバイト数, [PUSH], [URG], [再送タイムアウト値]
戻り値 ; なし
TCP動作 ; データ送信(ESTASBLISHED)
System calls ; write(socket, buffer, length)
send(socket, message, length, flags)
17
TCP Layer Interfaces
(3) RECEIVE Call :
機能 ; データの受信指示
引数 ; Local_Connection_Name, 受信データバッファアドレス、
受信データバイト数
戻り値 ; 受信バイト数, URG, PUSH, [受信バッファアドレス]
TCP動作 ; データ受信格納 (ESTASBLISHED)
System calls ; read(descriptor, buffer, length)
recvfrom(socket, buffer, flags, fromaddr, addlen)
recvmsg(socket, messagestruct, flags)
(4) STATUS Call :
機能 ; コネクション状態の取得を指示
引数 ; Local_Connection_Name
戻り値 ; Local_Socket, Destination_Socket, Local_Conenction_Name,
受信window_size, 送信window_size, Connection_state,
ACK待ちバッファ数, 未受信バッファ数, URG, Priority, Security,
送信タイムアウト値
18
TCP Layer Interfaces
(5) ABORT Call :
機能 ; コネクションの異常終了指示
引数 ; Local_Connection_Name
戻り値 ; なし
TCP動作 ; コネクション廃棄 (CLOSED)
(6) CLOSE Call :
機能 ; コネクションの正常終了指示
引数 ; Local_Connection_Name
戻り値 ; なし
TCP動作 ; コネクション開放 (CLOSED)
System call ; close(socket)
19
トランスポートレイヤ技術
- TCP; Transmission Control Protocol (1) TCPの基本動作
(2) TCP Interactive Data Flow
(3) TCP Bulk Data Flow
(4) TCP Data Retransmission
(5) TCP Persist Timer
(6) TCP Keep Alive Timer
(7) その他・将来
20
TCP Interactive Data Flow
- Default and Basic Procedure Telnet client
key-stroke “d”
Telnet server
data byte
ack of data byte
echo of data byte
echo to display
process “d”
to telnet server
process “d”
echo from telnet
server process “d”
ack of echo data byte
21
TCP Interactive Data Flow
- Delayed ACK Telnet client
key-stroke “d”
Telnet server
data byte
to telnet server
process “d”
ack of data byte
echo of data byte
Aggregate message
→ Delayed ACK
echo to display
process “d”
echo from telnet
server process “d”
ack of echo data byte
22
TCP Interactive Data Flow
- Delayed ACK : Piggy-Back Telnet client
Telnet server
echo of data byte
+ ack of data byte
delay window
delay window
data byte
echo from telnet
server process “d”
ack of echo data byte
23
TCP Interactive Data Flow
<Client>
<Server>
date¥n (6 bytes) => Sat Feb 6 07:52:17 MST 1993¥n (30 bytes)
1
2
“d”
3
4
“a”
5
6
7
8
“t”
9
10
“e” 11
12
13
“¥n”
14
→“CR/LF” 15
16
17
“svr4 % “ 18
19
0.0
0.016497
0.139955
0.458037
0.474386
0.539943
0.814582
0.831108
0.940112
1.191287
1.207701
1.339994
1.680646
1.697977
1.739974
1.799841
1.940176
1.944338
2.140110
(0.0165)
(0.1235)
(0.3181)
(0.0163)
(0.0656)
(0.2746)
(0.0165)
(0.1090)
(0.2512)
(0.0164)
(0.1323)
(0.3407)
(0.0173)
(0.0420)
(0.0599)
(0.1403)
(0.0042)
(0.1958)
bsdi.1023 > svr4.login:
svr4.login > bsdi.1023:
bsdi.1023 > svr4.login:
bsdi.1023 > svr4.login:
svr4.login > bsdi.1023:
bsdi.1023 > svr4.login:
bsdi.1023 > svr4.login:
svr4.login > bsdi.1023:
bsdi.1023 > svr4.login:
bsdi.1023 > svr4.login:
svr4.login > bsdi.1023:
bsdi.1023 > svr4.login:
bsdi.1023 > svr4.login:
svr4.login > bsdi.1023:
bsdi.1023 > svr4.login:
svr4.login > bsdi.1023:
bsdi.1023 > svr4.login:
svr4.login > bsdi.1023:
bsdi.1023 > svr4.login:
P
P
.
P
P
.
P
P
.
P
P
.
P
P
.
P
.
P
.
0:1(1) ack 1
1:2(1) ack 1
ack 2
1:2(1) ack 2
2:3(1) ack 2
ack 3
2:3(1) ack 3
3:4(1) ack 3
ack 4
3:4(1) ack 4
4:5(1) ack 4
ack 5
4:5(1) ack 5
5:7(2) ack 5
ack 7
7:37(30) ack 5
ack 37
37:44(7) ack 5
ack 44
24
bsdi.1023
D-ACK
D-ACK
D-ACK
D-ACK
D-ACK
D-ACK
1
svr4.login
PSH 0:1(1) ack 1 (d)
2
PSH 1:2(1) ack 1 (echo d)
3
4
PSH 1:2(1) ack 2 (a)
5
PSH 2:3(1) ack 2 (echo a)
6
7
ack 3
PSH 2:3(1) ack 3 (t)
8
PSH 3:4(1) ack 3 (echo t)
9
10
Delayed ACKによる
メッセージのAggregate
ack 2
ack 4
PSH 3:4(1) ack 4 (e)
PSH 4:5(1) ack 4 (echo e)
11
ack 5
12
13
PSH 4:5(1) ack 5 (¥n)
15
PSH 5:7(2) ack 5 (echo CR/LF)
ack 7
PSH 7:37(30) ack 5 (date内容)
14
17
ack 37
19
PSH 37:44(7) ack 5 (echo svr4%)
ack 44
16
18
・ D-ACK ; 200 msec
・ Piggy-back ;
→ echo + ack
・ segment 13 ; 1 byte data
“¥n”
segment 14 ; 2 byte date
“CR/LF”
TCP Negle Algorithm
“Large RTT”
(Sender) data packet flow
“¥n” “e” “t”
“a”
(Receiver)
“d”
echo & ack packet flow
aggregate
payload
“te¥n”
(e.g., ack_of_”d” & echo “d”)
“¥n”
+
“e”
+
“t”
: IP (20B)
: TCP(20B)
: Data
26
Negle Algorithm
Telnet client
key-stroke
(sn=1) “c” 1
(sn=2) “a”
(sn=3) “t”
ack 2 → OK sn=1
“a”+”t” → “at”
3
Telnet server
PSH 1:2(1) ack 2
2
PSH 2:3(1) ack 2
PSH 2:4(2) ack 3
4 “at” (sn=3,4)
PSH 3:5(2) ack 4
ack 4 → “sn=4”を期待
(sn=3まで受信)
5
echo;
“c” (sn=2)
ack 5
ack 5
→ “sn=5”を期待
(sn=4まで受信)
27
Disable Negle Algorithm
Telnet client
F1 key
(sn=1) “ESC” 1
2
(sn=2) “[”
(sn=3) “M” 3
Telnet server
PSH 1:2(1) ack 2
PSH 2:3(1) ack 2
PSH 3:4(1) ack 2
to telnet server
“ESC” (sn=2)
“[” (sn=3)
“^[[”
“M” (sn=4)
“M” 4
PSH 2:5(3) ack 3
ack 4 → OK 1,2,3
sn=“5” → missing 2,3,4 5
PSH 5:6(1) ack 4
ack 2
ack 2 → “^”を期待
“timeout”
“^]]M”
PSH 2:6(4) ack 4
F1 key echo結果 :
7
“^[[M”
ack 6
6
ack 2 以上を受信せず
ack 2以降を再送
ack 6 → OK 2,3,4,5
28
トランスポートレイヤ技術
- TCP; Transmission Control Protocol (1) TCPの基本動作
(2) TCP Interactive Data Flow
(3) TCP Bulk Data Flow
(4) TCP Data Retransmission
(5) TCP Persist Timer
(6) TCP Keep Alive Timer
(7) その他・将来
29
TCP Bulk Data Transmission
- Sliding Window ・ Window制御を用いたパケット転送
①Sliding Window (Receiver設定)
②Congestion Window(Sender設定)
(1) ACKなしにwindow数のパケットを転送
(2) ACKのAggregation(ACKパケットの減少)
(3) Receiver側によるwindow幅の制御
(4) ACK受信でwindowをスライドさせる
30
TCP Sliding Window
Offered window
(advertised by receiver)
Unsent window
1
2
3
4
5
sent and
ACKed
sent but not ACKed
6
7
Can send ASAP
8
9
10
11 …
Can not send until
window slides
31
TCP Sliding Window
Sent “3” and “4”
Offered window
(advertised by receiver)
Unsent window
1
2
3
4
5
6
7
8
sent and ACKed
9
10
Can send ASAP
sent but not ACKed
3+window=9
Receive ack “5”
from receiver
11 …
Can not send
until window
slides
5+window=11
Receive ack “5”
from receiver
TCP Sliding Window
Window advertise by receiver
shrink
enlarge
window
closed by
ACK reception
= ACKed SN
Opend by
ACK reception
(=ack+window)
Slide window by ACK from receiver
33
bsdi.1023
1
3
4
5
6
9
svr4.discard
SYN 0:0(0) win4096 <mss1024>
2
SYN 3:3(0) ack 1 win4096 <mss1024>
ack 4 win4096
PSH 1:1025(1024) ack 4 win4096
PSH 1025:2049(1024) ack 4 win4096
PSH 2049:3073(1024) ack 4 win4096
ack 2049 win4096
7
ack 3073 win3072
8
PSH 3073:4097(1024) ack 4 win4096
ack 4097 win4096
11
PSH 4097:5121(1024)
12 PSH 5121:6145(1024) ack 4 win4096
13 PSH 6145:7169(1024) ack 4 win4096
ack 4 win4096
15
ack 6145 win4096
PSH 7169:8193(1024) ack 4 win4096
ack 8193 win4096
17
10
20
ack 5 win4096
・ Window Shrink ;
“7”: 4096 → 3072
14
(*) aggregate ACK
16
FI 8193:8193(0) ack 4 win4096
ack 8194 win4096
FIN 4:4(0) ack 8194 win4096
・ Window制御
- window = 4096
- mss = 1024
→ 4 segments は、ACK
なしに転送可能。
18
19
- “7” ← “4” & ”5”
- “10” ← “6” & “9”
- “14” ← “11” & “12”
- “16” ← “13” & “15”
34
bsdi.1023
1
3
4
5
6
svr4.discard
SYN 0:0(0) win4096 <mss1024>
SYN 3:3(0) ack 1 win4096 <mss1024>
ack 4 win4096
PSH 1:1025(1024) ack 4 win4096
PSH 1025:2049(1024) ack 4 win4096
PSH 2049:3073(1024) ack 4 win4096
PSH 3073:4097(1024) ack 4 win4096
ack 4097 win 0
ack 4097 win 4096
10
PSH 4097:5121(1024)
11 PSH 5121:6145(1024) ack 4 win4096
12 PSH 6145:7169(1024) ack 4 win4096
ack 4 win4096
13
2
8
9
10
FIN PSH 7169:8193(1024) ack 4 win4096
ack 8193 win 0
ack 8193 win 4096
FIN 4:4(0) ack 8194 win4096
14
15
16
[Fast Sender,Slow Receiver]
・ Window shrink
- “8” : 4096 → 0
- “14” : 4096 → 0
・ Window enlarge
(= window update)
- “9” : 0 → 4096
- “15” : 0→ 4096
(*) segment “13” :
FINのPiggy-Back
17
ack 8193 win4096
35
TCP Congestion Window
Offered window
(advertised by receiver)
Unsent window
1
2
sent and
ACKed
3
4
5
6
7
Shall not send ASAP
Congestion window
(“cwnd”=1 )
8
9
10
11 …
Can not send until
window slides
→ sent but not ACKed
36
TCP Congestion Window
Sent “3”
Offered window
(advertised by receiver)
Unsent window
1
2
3
sent and ACKed
4
5
6
7
8
9
Shall not send ASAP
Shall send without
ACK ASAP;
cwnd=2 (cwnd←cwnd*2)
10
Can not send
until window
slides
3+window=9
4+window=10
Receive ack “4”
from receiver
11 …
Receive ack “4”
from receiver
TCP Congestion Window
・ Slow Start Policy (cwnd ; exponential increase)
cwnd = 1 ;
for (セグメント転送)
{
for (not congestion)
{
if (セグメント転送ACK受信)
{ cwnd = cnwd +1 }
cwnd = 1
}
(*)注意 : Congestion Avoidance では若干異なる。
SenderがLocal に制御することなので、変えることが容易に可能
38
TCP Congestion Window
advertised_window
congestion
cwnd
cwnd
advertised_window
time
< Congestionなしの場合 >
time
< Congestion経験の場合 >
(*) Duplicated ACKを使用せず
39
TCP Congestion Window(1)
1
[受]
[送]
[受]
[送]
1
1
1
1
1
1
1
40
TCP Congestion Window(2)
3
2
[受]
[送]
[受]
[送]
2
3 2
2
3
3
2
2
3
3
2
2
3
41
TCP Congestion Window(3)
4
7
[受]
[送]
6
5
[受]
[送]
4
5 4
7
4
6
5
5
7
4
4
7 6
6
5
5
6
4
4
5
6
7
42
TCP Congestion Window(4)
8
12
[受]
[送]
5
6
[受]
[送]
8
7
9 8
6
7
10 9
11 10 9
13 12
8 9
8
14 13
8
7
11 10
11 10
9
8
9 10
15 14
8
9
12 11
10
13 12
11
43
bsdi.1029
cwnd =1
1
svr4.discard
1:513(512) ack1 win4096
ack 513 win 8192
cwnd =2
3
4
cwnd =3
6
7
cwnd =4
9
10
513:1025(512) ack1 win4096
1025:1537(512) ack1 win4096
ack 1025 win 8192
1537:2049(512) ack1 win4096
ack 1537 win 8192
2561:3073(512) ack1 win4096
8
3037:3585(512) ack1 win4096
13 FIN PSH 3585:4097(512) ack 1 win4096
ack 3073 win 8192
ack 3585 win 8192
ack 4098 win 7680
18
5
2049:2561(512) ack1 win4096
ack 2049 win 8192
ack 2561 win 8192
cwnd =5
cwnd =6
2
FIN 1:1(0) ack 4098 win 8192
ack 2 win 4096
11
12
14
15
16
17
44
TCP Congestion Window
1. Advertised Window by Receiver
2. Congestion Window (cwnd ) defined by sender
(*) ローカルに決めることが可能な値
最適なWindow(Advertised window)サイズ
(1) 廃棄が起こらないくらいの大きさ
→ Congestion Avoidance
(2) 最適パイプライン転送
→ RTT x 帯域幅
トランスポートレイヤ技術
- TCP; Transmission Control Protocol (1) TCPの基本動作
(2) TCP Interactive Data Flow
(3) TCP Bulk Data Flow
(4) TCP Data Retransmission
(5) TCP Persist Timer
(6) TCP Keep Alive Timer
(7) その他・将来
46
TCP Data Retransmission
(1) Expire of Retransmission Timeout
(RTO) Value
- RTO calculation using RTT
- Exponential Back-off (Max. 64 sec.)
(2) Reception of Duplicated ACK
- Fast Retransmission / Fast Recovery
(3) Congestion Window (cwnd ) Control
- Slow Start (exponential increase)
- Congestion Avoidance (liner increase)
47
RTO Expired Retransmission
bsdi.1023
1
3
4
再送間隔
1.5 sec
3 sec
6
7
8
SYN 0:0(0) win4096 <mss1024>
SYN 3:3(0) ack 1 win4096 <mss1024>
ack 4 win4096
PSH 1:15(14) ack 4 win4096
ack 15 win 4096
PSH 15:23(8) ack 4 win4096
PSH 15:23(8) ack 4 win4096
PSH 15:23(8) ack 4 win4096
6 sec
9
17
PSH 15:23(8) ack 4 win4096
PSH 15:23(8) ack 4 win4096
64 sec
18
svr4.discard
2
再送トライ (RTO; 再送タイマ)
RTO = 1.5 sec /* 変更可能*/
5
for ( 9 minutes)
{
if ( RTO expired)
{
retransmission;
RTO=RTO x 2;
RTO=min{64sec, RTO};
}
}
end /* 諦める */
PSH 15:23(8) ack 4 win4096
48
RTO Expired Retransmission
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
0.0
0.0048
0.0064
6.1022
6.2594
24.4801
25.4937
28.4937
34.4937
46.4844
70.4851
118.4864
182.4881
246.4899
310.4910
374.4934
438.4951
:
( 0.0048)
( 0.0016)
( 6.0958)
( 0.1571)
(18.2207)
( 1.0136)
( 3.0001)
( 6.0002)
(11.9905)
(24.0007)
(48.0013)
(64.0018)
(64.0018)
(63.9917)
(64.0018)
(64.0015)
bsdi.1029 > svr4.discard: S 1:1(0)
win 4096 <mss 1024>
svr4.discard > bsdi.1029: S 4:4(0) ack 2
bsdi.1029 > svr4.discard: . ack 5
bsdi.1029 > svr4.discard: P 1:15(14) ack
svr4.discard > bsdi.1029: . ack 15 win
bsdi.1029 > svr4.discard: P 15:23(8) ack
bsdi.1029 > svr4.discard: P 15:23(8) ack
bsdi.1029 > svr4.discard: P 15:23(8) ack
bsdi.1029 > svr4.discard: P 15:23(8) ack
bsdi.1029 > svr4.discard: P 15:23(8) ack
bsdi.1029 > svr4.disacrd: P 15:23(8) ack
bsdi.1029 > svr4.discard: P 15:23(8) ack
bsdi.1029 > svr4.discard: P 15:23(8) ack
bsdi.1029 > svr4.discard: P 15:28(8) ack
bsdi.1029 > svr4.discard: P 15:23(8) ack
bsdi.1029 > svr4.discard: P 15:23(8) ack
bsdi.1029 > svr4.discard: P 15:23(8) ack
:
:
:
49
5
5
5
5
5
5
5
5
5
5
5
5
5
RTO Expired Retransmission
- RTO Exponential Back-Off -
RTO
64 sec
48 sec
32 sec
面積(S)
16 sec
0
1
2
3
4
5
6
7
8
9
If (S ≦ 9 minutes)
{
continue;
}
else
{
abort;
}
10 11
Retransmission回数
50
Timeout and Retransmission
RTO (Retransmission TimeOut)
(1) 古い計算方法 ; RTTの変動に弱い
- RTO = 2 x RTT
where RTT = α・RTTp + (1-α)・RTTM
= 0.9xRTTp + 0.1xRTTM
(2) 最近の計算方法 ; 平均偏差の利用
- RTO = 平均RTT + 4 x 平均偏差
51
Retransmission by Duplicated ACK
(2) Reception of Duplicated ACK
- Fast Retransmission / Fast Recovery
Segment廃棄特性 ;
→ “single (or few) segment(S)” あるい
は連続多数。
→ 未ACKの同一ACK Segmentsを
複数(3回)受信したら、再送。
52
Fast Retransmission by Duplicated ACK
6401:6657(256) ack1
6657:6913(256) ack1
6913:7169(256) ack1
7169:7425(256) ack1
ack 5889
ack 6145
ack 6401
ack 6657
7425:7681(256) ack1
ack 6657 ①
7681:7937(256) ack1
ack 6657 ②
7937:8193(256) ack1
ack 6657 ③
8193:8449(256) ack1
6657:6913(256) ack1
“Fast Retransmission”
ack 6657
ack 6657
ack 6657
8449:8705(256) ack1
8705:8961(256) ack1
ack 8449 win5888
8961:9217(256) ack1
ack 8705 win5888
Congestion Window Control
[目的]
cwndの大きな振動を防ぎ、
適切なcwndで運用する
[1] cwndの制御
(i) ssthresh以下のcwndサイズ
→ Exponential increase
(slow start)
(ii) ssthresh以上のcwndサイズ
→ Liner increase
(congestion avoidance)
[2] ssthreshの制御
(i) Timeout ; goto “1”
(ii) Duplicated-ACK ; 1/2
cwnd=1; ssthresh=65KB;
for ()
{
if (“Timeout”)
{ cwnd=1;
ssthresh = cwnd/2; }
if (“duplicated ACK”)
{ ssthresh=cwnd / 2;
cwnd=ssthresh;
}
if (cwnd ≦ ssthresh)
{ slow_start;
/* exponential */
}
else
{ congestion_avoidance;
/* liner */
}
}
54
Congestion Window Control (続)
・ ICMP 制御メッセージ
(1) ICMP Source Quench
→ cwnd = 1 ;
ssthresh = as is ;
(2) Host unreachable
→ No Action ;
55
Congestion Congestion
slow-start avoidance avoidance
slow-start
Congestion
avoidance
cwdn_1
cwdn_3
Timeout
“ssthresh”
Fast
Recovery
(cwnd_3) / 2
(cwnd_1) / 2
cwnd
Target
cnwd
Fast
Recovery
56
トランスポートレイヤ技術
- TCP; Transmission Control Protocol (1) TCPの基本動作
(2) TCP Interactive Data Flow
(3) TCP Bulk Data Flow
(4) TCP Data Retransmission
(5) TCP Persist Timer
(6) TCP Keep Alive Timer
(7) その他・将来
57
TCP Persist Timer
[機能]
Advertised window size = “0” でも
1 Byte のデータは送信できるように
する。
[目的]
window=“0”時の デッドロックの回避。
受信側プロセスの生存確認
[Timer値制御]
Exponential Back-off (Max. 60秒)
58
TCP Persist Timer
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
( 0.1920)
( 0.0050)
( 0.0034)
( 0.0072)
( 0.0052)
( 0.0034)
( 0.0039)
( 0.0079)
( 0.0051)
( 0.0040)
( 0.0039)
( 0.1612)
( 4.9494)
( 0.0040)
( 4.9961)
( 0.0040)
( 5.9962)
( 0.0040)
(11.9964)
( 0.0040)
(23.9967)
( 0.0040)
bsdi.1027
svr4.5555
bsdi.1027
bsdi.1027
svr4.5555
bsdi.1027
bsdi.1027
bsdi.1027
svr4.5555
bsdi.1027
bsdi.1027
bsdi.1027
svr4.5555
bsdi.1027
svr4.5555
bsdi.1027
svr4.5555
bsdi.1027
svr4.5555
bsdi.1027
svr4.5555
bsdi.1027
svr4.5555
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
svr4.5555:
bsdi.1027:
svr4.5555:
svr4.5555:
bsdi.1027:
svr4.5555:
svr4.5555:
svr4.5555:
bsdi.1027:
svr4.5555:
svr4.5555:
svr4.5555:
bsdi.1027:
svr4.5555:
bsdi.1027:
svr4.5555:
bsdi.1027:
svr4.5555:
bsdi.1027:
svr4.5555:
bsdi.1027:
svr4.5555:
bsdi.1027:
P
.
.
.
.
.
P
P
.
P
P
P
.
.
.
.
.
.
.
.
.
.
.
1:1025(1024) ack 1 win 4906
ack 1025 win 4906
1025:2049(1024) ack 1 win 4096
2049:3073(1024) ack 1 win 4096
ack 3073 win 4096
3073:4097(1024) ack 1 win 4096
4097:5121(1024) ack 1 win 4096
5121:6145(1024) ack 1 win 4096
ack 5121 win 4096
6145:7169(1024) ack 1 win 4096
7169:8193(1024) ack 1 win 4096
8193:9217(1027) ack 1 win 4096
ack 9217 win 0
9217:9218(1) ack 1 win 4096
ack 9217 win 0
9218:9219(1) ack 1 win 4096
ack 9218 win 0
9219:9220(1) ack 1 win 4096
ack 9219 win 0
9220:9221(1) ack 1 win 4096
ack 9220 win 0
9221:9222(1) ack 1 win 4096
59
ack 9221 win 0
トランスポートレイヤ技術
- TCP; Transmission Control Protocol (1) TCPの基本動作
(2) TCP Interactive Data Flow
(3) TCP Bulk Data Flow
(4) TCP Data Retransmission
(5) TCP Persist Timer
(6) TCP Keep Alive Timer
(7) その他・将来
60
Keep Alive Timer
[目的] ; プロセスダウンへの対応
(*) TCPコネクションの終了条件
(i) 明示的終了 (FIN segment)
(ii) プロセスのダウン
[機能] ; TCPコネクションの生存確認
- 2時間ごとにprobe
- 10回(75秒間隔でprobe送信)応答なし
→ 受信プロセスのダウンと判断。
61
TCP Keepalive Timer
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
bsdi.1055 > svr4.echo: P 1:14(13) ack 1
(
0.0061) svr4.echo > bsdi.1055: P 1:14(13) ack 14
(
0.0087) bsdi.1055 > svr4.echo: . ack 14
(7199.8797) arp who-has svr4 tell bsdi
(
0.0021) arp reply svr4 is-at 0:0:c0:c2:9b:26
(
0.0009) bsdi.1055 > svr4.echo: . ack 14
(
0.0041) svr4.echo > bsdi.1055: . ack 14
(7200.1545) arp who-has svr4 tell bsdi
(
0.0021) arp reply svr4 is-at 0:0:c0:c2:9b:26
(
0.0009) bsdi.1055 > svr4.echo: . ack 14
(
0.0040) svr4.echo > bsdi.1055: . ack 14
(7200.1769) arp who-has svr4 tell bsdi
①
( 75.0021) arp who-has svr4 tell bsdi
②
( 75.0020) arp who-has svr4 tell bsdi
③
( 75.0021) arp who-has svr4 tell bsdi
④
( 75.1123) arp who-has svr4 tell bsdi
⑤
( 75.0021) arp who-has svr4 tell bsdi
⑥
( 75.0020) arp who-has svr4 tell bsdi
⑦
( 74.9920) arp who-has svr4 tell bsdi
⑧
( 75.0018) arp who-has svr4 tell bsdi
⑨
( 75.0021) arp who-has svr4 tell bsdi
⑩
62
トランスポートレイヤ技術
- TCP; Transmission Control Protocol (1) TCPの基本動作
(2) TCP Interactive Data Flow
(3) TCP Bulk Data Flow
(4) TCP Data Retransmission
(5) TCP Persist Timer
(6) TCP Keep Alive Timer
(7) その他・将来
63
トランスポートレイヤ技術
- その他・将来 (1) Silly Window Syndrome
(2) MTU Discovery
(3) Window scaling for long fat-pipe
(4) T/TCP (Transaction TCP)
(5) Rate Control
(6) ECN(Explicit Congestion Notification)
64
Path MTU Discovery
[目的]
経路上でフラグメントされない最大のセグメン
トサイズ(Path MTU)の検索
[方法]
ICMPエラーメッセージの利用
(DF; Don’t Fragment オプション)
[頻度]
10分ごとに再検索
65
1 (0.0
2
3
4
5
6
7
8
9
10
11
12
) solaris.33016 > slip.discard: S 1:1(0) win 8760
<mss 1460> (DF)
(0.1016) slip.discard > solaris.33016: S 1:1(0) ack 1
win 4096 <mss 512>
(0.5290) solaris.33016 > slip.discard: P 1:513(512) ack 1
win 4096 <mss 512>
(0.0038) bsdi > solaris: icmp: slip unreachable - need to frag,
mtu = 296 (DF)
(0.0259) solaris.33016 > slip.discard: F 513:513(0) ack 1
win 9216 (DF)
(0.0923) slip.discard > solaris.33016: . ack 1 win 4096
(0.3577) solaris.33016 > slip.discard: P 1:257(256) ack 1
win 9216 (DF)
(0.3290) slip.doscard > solaris.33016: . ack 257 win 3840
(0.3308) solaris.33016 > slip.discard: FP 257:513(256) ack 1
win 9216 (DF)
(0.3208) slip.discard > solaris.33016: . ack win 3840
(0.0422) slip.discard > solaris.33016: F 1:1(0) ack 514 win4096
(0.1719) slip.discard > splaris.33016: . ack 2 win 9216 (DF)
<mss 1460>
solaris
<mss 296>
bsdi
512 B (DF)
slip
ICMP “too big (mss 296)”
66
Window Scaling for Long Fat Pipe
- RFC1323 Network
Ethernet
T1(大陸間)
T1(衛星)
T3(大陸間)
OC12(大陸間)
Bandwidth(bps)
10.000 M
1.544 M
1,544 M
45,000 M
2,400,000 M
RTT(ms)
3
60
500
60
60
BWxRTT(B)
3,750
11,580
96,500
337,500
7,500,000
・ Max. Window Size ; 2^(16) Bytes = 64KB
→ Window Scaling ; “wscale”
wscale=n → 64 x 2^(n) windowサイズ
67
Window Scaling for Long Fat Pipe
1
2
3
4
5
6
7
8
9
10
11
( 0.0031)
( 0.2972)
(16.6198)
( 0.0030)
( 0.2971)
( 9.4202)
( 0.0024)
( 0.0013)
( 0.2363)
(17.5200)
12 ( 0.0031)
13 ( 0.2967)
vangogh.4107 > bsdi.echo: S 1:1(0) win 65535
<mss 512, nop, wscale 1, nop, nop, timestamp, 995351>
bsdi.echo > vangogh.4107: S 1:1(0) ack 1 win 4906 <mss 512>
vangogh.4107 > bsdi.echo: . ack 1 win 65535
vangogh.4107 > bsdi.echo: P 1:14(13) ack 1 win 65535
bsdi.echo > vangogh.4107: P 1:14(13) ack 14 win 4096
vangogh.4107 > bsdi.echo: . ack 14 win 65535
vangogh.4107 > bsdi.echo: F 14:14(0) ack 14 win 65535
bsdi.echo > vangogh.4107: . ack 15 win 4096
bsdi.echo > vangogh.4107: F 14:14(0) ack 15 win 4096
vangogh.4107 > bsdi.echo: . ack 15 win 65535
vangogh.4107 > bsdi.echo: S 1:1(0) win 65535
<mss 512, nop, wscale 2, nop, nop, timestamp 995440>
bsdi.echo > vangogh.4107: S 1:1(0) ack 1 win 4096 <mss 512>
vangogh.4107 > bsdi.echo: . ack 1 win 65535
nop; no operation
wscale ; window scale
68
RFC 1379 ; T/TCP
- Transaction TCP [目的]
TCPコネクションの確立・開放手続きの
速度アップ
[方法]
・ CC (Connection Count) Option
・ SYNへのPiggy-back ; “half-synchronization”
(1) SYN, Data, FIN, CC
(2) SYN, SYN-ACK, Data, FIN, FIN-ACK,
CC, CC-Echo
(3) FIN-ACK
69
RFC 1379 ; T/TCP
Server
Client
SYN (a)
SYN_ACK(a+1,b)
ACK(b+1)
Server
Client
SYN,Data,FIN,CC
SYN,S-ack,Data,
F,F-ack
FIN-ACK
Data (a+2)
Data_ACK(a+2,b+1)
FIN (m,s)
ACK (m+1)
FIN_ACK (m+1,s)
ACK (s+1)
9 セグメント
→ 3 セグメント
70
TCPレート制御
Destination
Node 1
Source
Node 1
Window
制御
Rate
制御
Line Speed
with Window
Shaped Transmission with Window
71
TCPレート制御
Window帯
平均転送速度
Window
制御
ピーク転送速度
Line Speed
with Window
パケット廃棄
Rate
制御
パケット廃棄
パケット廃棄
Window帯
平均転送速度
ピーク転送速度
Shaped Transmission with Window
RED ECN
RED ECN
RED ECN
72
ECN(Explicit Congestion Notification)制御
TOS for Differentiated Service
- PHB(Per-Hop-Behavior)
- CU(Currently Unused)
=> for ECN(Explicit Congestion Notification) ?
0
1
TOSフィールド:
PHB: 000000
101110
Others
xxxxx0
xxxx11
xxxx01
2 3 4
PHB
5
6
7
CU
DE (Default Service)
EF (Expedited Forwarding)
AF (Assured Forwarding)
Standard Purpose
Experimental Purpose
Experimental Purpose
73
Explicit Congestion Notification (ECN)制御
(8) ECN=11
Congestion Node
(Set ECN bit)
Reduce Speed
(2) ECN=01
Destination
Node 1
(9) ECN=11
(3) ECN=01
(1) ECN=00
(7) ECN=10
(5) ECN=11
(4) ECN=10
Source
Node 1
(6) ECN=11
Reduce Speed
輻輳を未然に防ぐ; 予防的制御
74
Download