Chapter 6

advertisement
The Transport
p Service
Chapter
p 6
The Transport Layer
Services Provided to the Upper
pp Layers
y
•
•
•
•
Services Provided to the Upper Layers
Transport Service Primitives
Berkeley Sockets
A E
An
Example
l off Socket
S k t Programming:
P
i
– An Internet File Server
Transport
p Service Primitives
The primitives for a simple transport service.
The network,
network transport,
transport and application layers.
layers
Transport
p Service Primitives (2)
( )
Transport
p Service Primitives ((3))
The nesting of TPDUs,
TPDUs packets,
packets and frames.
frames
A state diagram for a simple connection management scheme.
Transitions labeled in italics are caused by packet arrivals. The
solid lines show the client's state sequence. The dashed lines show
the server's state sequence.
Berkeleyy Sockets
Socket API
a) Creating a socket
int socket(int domain
domain, int type
type, int protocol)
• domain = PF_INET, PF_UNIX
• type = SOCK_STREAM,
SOCK STREAM SOCK_DGRAM
SOCK DGRAM
b) Passive
P i O
Open ((on server))
int bind(int socket, struct sockaddr *addr, int addr_len)
int listen(int socket
socket, int backlog)
int accept(int socket, struct sockaddr *addr, int addr_len)
The socket primitives for TCP.
TCP
Sockets (cont)
(
)
a) Active Open (on client)
int connect(int socket, struct sockaddr *addr,
int addr_len)
_ )
Protocol-to-Protocol Interface
a) Process Model
– avoid context switches
b) Buffer Model
b) Sending/Receiving Messages
int send(int socket, char *msg, int mlen, int flags)
int recv(int socket, char *buf, int blen, int flags)
– avoid data copies
M
Message
Lib
Library
P
Process
M
Model
d l
Strip header
Add header
m
m
abcdefg
abcdefg
b d f
bcopy “
( xyz", hdr , 3);
msgAddHdr(m, hdr, 3);
hdr =msgStripHdr(m,
g
p
( , 3);
);
m
m
defg
xyzabcdefg
(a)
Process-per-Protocol
(b)
Process-per-Message
+ hdr =“abc"
Event Library
M
Message
Lib
Library (cont)
( t)
a) Scheduling Timeouts & Book-keeping activity
a) Fragment message
a) Reassemble messages
b) Operations
m
m1
abcdefg
m
m2
abcd
b d
msgFragment (m, new, 3);
new
efg
f
msgReassemble(new, m1, m2)
Event evSchedule(EvFunc function,
void *argument,
int time)
EvCancelReturn evCancel(Event
(
event)
)
new
defg
+
abc
Socket
P
Programming
i
Example:
Internet File
Server
6-6-1
Client code using sockets.
abcdefg
Socket
Programming
Example:
Internet File
Server (2)
Client code using
sockets.
Elements of Transport
p Protocols
•
•
•
•
•
•
Transport
p Protocol
Addressing
Connection Establishment
Connection Release
Flow Control and Buffering
Multiplexing
Crash Recovery
(a) Environment of the data link layer.
((b)) Environment of the transport
p layer.
y
Addressing
g
TSAPs NSAPs and transport connections.
TSAPs,
connections
Connection Establishment
How a user process in host 1 establishes a connection
with a time-of-day server in host 2.
Connection Establishment (2)
( )
a)
b)
Suppose that in a congested subnet, ACKs hardly ever get back in time and each
p times out and is retransmitted two or three times.
pkt
Nightmare: a user established a connection with a bank, sends msg to the bank to
transfer a large amount of money and then releases the connection. But each pkt
p
and stored in the subnet. After the connection has been released,, all
is duplicated
pkts pop out of the subnet and arrive at the destination in order, asking the bank to
establish a new connection, transfer money (again), and release the connection.
p transaction.Æ How to p
prevent the pproblem
The bank thinks it is a second, indep
of delayed duplicates.
1. throw-away transport address: process server model impossible;
2 give each connection a connection identifier chosen by the initiating party;
2.
after each connection is released, each transport entity could update a table listing
obsolete connections as (peer transport entity, connection IDÆ each transport
must maintain a certain amount of history information; If a machine crashes and
loses its memory ???
3. pkts lifetime can be restricted to a known maximum using (1) restricted subnet
design (2) hop counter (3) timestamping each pktÆ introducing T,
T some small
mulltiple of the true maximum pkt lifetime;
Connection Establishment (3)
( )
(a) TPDUs may not enter the forbidden region.
(b) The resynchronization problem.
problem
Connection Establishment (2)
( )
Each host with a time-of-day clock; Basic Idea: two identically numbered TPDUs
are never outstanding
t t di att the
th same time.
ti
When
Wh a connection
ti is
i sett up, the
th lowl
order k bits of the clock are used as the initial seq. #. (next slide’s figure)
(a) Let T=60 sec. the initial seq. # for a connection opened at time x will be x.
I
Imaging
i that
h at t=30sec,
30
an ordinary
di
data
d TPDU bbeing
i sent on connection
i 5 is
i
given seq. # 80. Call this TPDU X. After sending TPDU X, the host crashes and
then quickly restarts. At t=60, it begins reopening connections 0 ~ 4. At t=70, it
reopens connection
ti 55, using
i iinitial
iti l seq. # 70.
70 Assume
A
that
th t att t=85
t 85 a new TPDU
with seq. # 80. Unfortunately, TPDU X still exists. If it arrives before the new
TPDU 80. TPDU X will be accepted. Æ preventing seq. #’s from being used for a
time T before their potential use as initial seq.
seq ##’ss. Æ forbidden region
region. However
However,
two problems exists;
(1) If a host sends too much data too fast on a newly-opened connection, the
actual
t l seq. # versus ti
time curve may rise
i more steeply
t l than
th the
th initial
i iti l seq. #
versus time curve. Æ favor a short clock tick.
(2) as (b) seq.# will run out and actual seq.# will go into the forbidden region; Æ
j before
just
b f
sending
di every TPDU,
TPDU the
h transport entity
i must check
h k to see if it
i is
i
about to enter the forbidden region, if so, either delay the TPDU for T sec or
resynchronize the seq. #.
Connection Establishment (3)
( )
Three p
protocol scenarios for establishingg a connection usingg a
three-way handshake. CR denotes CONNECTION REQUEST.
(a) Normal operation,
(b) Old CONNECTION REQUEST appearing out of nowhere.
(c) Duplicate CONNECTION REQUEST and duplicate ACK.
Connection Release
Abrupt disconnection with loss of data.
CR: Connection Request;
DR: Disconnection Request;
Connection Release (3)
( )
6-14, a, b
Four protocol scenarios for releasing a connection.
connection (a) Normal case of a
three-way handshake. (b) final ACK lost.
Connection Release (2)
( )
The two
two-army
army problem.
problem
Connection Release (4)
( )
6-14, c,d
(c) Response lost
lost. (d) Response lost and subsequent DRs lost
lost.
Flow Control and Buffering
g
((a)) Chained
Ch i d fi
fixed-size
d i buffers;
b ff
wastedd if unequall TPDU (b)
Chained variable-sized buffers. (c) One large circular buffer per
connection; poor if some connections are lightly loaded;
Multiplexing
p
g
(a) Upward multiplexing
multiplexing. (b) Downward multiplexing
multiplexing.
Flow Control and Bufferingg ((2))
Dynamic buffer allocation.
allocation The arrows show the direction of
transmission. An ellipsis (…) indicates a lost TPDU.
Crash Recoveryy
Different combinations of client and server strategy.
A:ack; C:crash; W:write;S1: one TPDU outstanding; S0: no TPDUs
outstanding
A Simple
p Transport
p Protocol
The Example
p Transport
p Entityy
• The Example Service Primitives
• The E
Example
ample Transport Entity
Entit
• The Example as a Finite State Machine
The network layer packets used in our example.
example
The Example
p Transport
p Entityy (2)
( )
Each connection is in one of seven states:
1. Idle – Connection not established yet.
2
2.
W iti – CONNECT hhas bbeen executed,
Waiting
t d CALL REQUEST sent.
t
3. Queued – A CALL REQUEST has arrived; no LISTEN yet.
4
4.
E t bli h d – The
Established
Th connection
ti has
h been
b
established.
t bli h d
5. Sending – The user is waiting for permission to send a packet.
6
6.
R i i – A RECEIVE has
Receiving
h been
b
done.
d
7. DISCONNECTING – a DISCONNECT has been done locally.
Trace the program in page 518 by yourself.
The Example
p as a Finite State Machine
The example
Th
l protocol
t l as a
finite state machine. Each
entry has an optional predicate
predicate,
an optional action, and the
new state. The tilde(~)
indicates that no major action
is taken. An overbar above a
predicate
di t indicate
i di t the
th negation
ti
of the predicate. Blank entries
correspond to impossible or
invalid events.
The Example
p as a Finite State Machine (2)
( )
The Internet Transport Protocols: UDP
• Introduction to UDP
• Remote Procedure Call
• The Real-Time Transport Protocol
Th example
The
l protocoll in
i graphical
hi l form.
f
Transitions
T
ii
that
h leave
l
the connection state unchanged have been omitted for simplicity.
Introduction to UDP
Remote Procedure Call
The UDP header.
header
Steps in making a remote procedure call.
call The stubs are shaded.
shaded
The Real-Time Transport
p Protocol
(a) The position of RTP in the protocol stack.
stack (b) Packet nesting.
nesting
The Internet Transport
p Protocols: TCP
•
•
•
•
•
•
•
•
•
•
•
•
Introduction to TCP
Th TCP Service
The
S i Model
M d l
The TCP Protocol
Th TCP Segment
The
S
t Header
H d
TCP Connection Establishment
TCP C
Connection
ti R
Release
l
TCP Connection Management Modeling
TCP Transmission
T
i i Policy
P li
TCP Congestion Control
TCP Timer
Ti
Management
M
t
Wireless TCP and UDP
Transactional TCP
The Real-Time Transport
p Protocol (2)
( )
Ver. =2; P: padded to a multiple of 4 bytes and the last padding byte tells # of
padding; X:an extension header is present; CC: # of contributing sources; M: an
application-specific marker, e.g., start of a video frame; Payload type: encoding
algorithm, e.g. MP3; Seq. #: for detecting packet loss; timestamp: for jitter
g; Sync.
y Source ID: which stream the pkt
p belongs
g to;; Contr. Source id:
reducing;
Mixers are present, the mixer is the synchronizing source, and the streams being
mixed are listed here;
The TCP Service Model
Port
21
23
25
69
79
80
110
119
Protocol
FTP
Telnet
SMTP
TFTP
Finger
HTTP
POP 3
POP-3
NNTP
Use
File transfer
Remote login
E-mail
Trivial File Transfer Protocol
Lookup info about a user
World Wide Web
R
Remote
t e-mail
il access
USENET news
Some assigned ports
ports.
The TCP Service Model (2)
( )
The TCP Segment
g
Header
(a) Four 512-byte segments sent as separate IP datagrams.
(b) The
Th 2048 bbytes
t off ddata
t ddelivered
li
d tto th
the application
li ti in
i a single
i l
READ CALL.
TCP Header.
The TCP Segment
g
Header
a)
b)
c)
d)
e)
f)
g)
h)
The TCP Segment
g
Header
Source port and Destination port: well-known ports are defined at
www.iana.org;
www
iana org;
URG: urgent pointer is in use; urgent pointer indicates a byte offset
from the current seq.
seq #.
#
ACK: Acknowledgement # is valid.
PSH: Push data; receiver must deliver the data to the application
and not buffer it until a full buffer.
RST: reset a connection;
SYN: used to establish connections;
FIN: release a connection;
Window size: flow control with a variable-sized sliding window;
The pseudoheader included in the TCP checksum.
checksum (1) detect
misdelivered pkts (2) violate the protocol hierarchy
TCP Connection Establishment
TCP Connection Management
g
Modeling
g
6-31
(a) TCP connection establishment in the normal case.
(b) Call
C ll collision;
lli i
only
l one socket
k t connection
ti identified
id tifi d by
b (x,y);
( )
TCP Connection Management Modeling (2)
The states used in the TCP connection management finite state machine.
machine
TCP Transmission Policyy
TCP connection
management finite state
machine. The heavy solid
line is the normal path for a
client. The heavy dashed
line is the normal path for a
server. The light lines are
unusual events. Each
transition is labeled by the
event causing it and the
action resulting from it,
separated by a slash.
(1) Receiver has a 4K-byte buffer;
TCP Transmission Policy
y ((2))
Silly window syndrome.
TCP Congestion
g
Control ((2))
An example of the Internet congestion algorithm.
TCP Congestion
g
Control
(a) A fast network feeding a low capacity receiver.
(b) A slow network feeding a high-capacity receiver.
TCP Timer Management
g
(a) Probability density of ACK arrival times in the data link layer.
(b) Probability density of ACK arrival times for TCP. If timeout is too
short (T1), unnecessary retransmission will occur.
Timeout Setting
g
a)
b)
c)
d)
e)
Wireless TCP and UDP
RTT αRTT (1 α)M;
RTT=αRTT+(1-α)M;
M: a new measurement of RTT;
typically α
α=7/8;
7/8;
Timeout = β RTT; typically β=2; however, inflexible;
Most TCP implementation; D
D=αD+(1-α)|RTT-M|;
αD+(1-α)|RTT-M|; Timeout = RTT
+ 4D;
Splitting a TCP connection into two connections.
connections
Transactional TCP
Performance Issues
•
•
•
•
•
(a) RPC using normal TPC.
(b) RPC using T/TCP.
Performance
f
Problems
bl
in
i Computer
C
Networks
k
Network Performance Measurement
System Design for Better Performance
Fast TPDU Processing
Protocols for Gigabit Networks
Performance Problems in Computer Networks
Network Performance Measurement
The basic loop for improving network performance.
1. Measure relevant network parameters, performance.
2 Try to understand what is going on.
2.
on
3. Change one parameter.
The state of transmitting one megabit from San Diego to Boston
(a) At t = 0, (b) After 500 μsec, (c) After 20 msec, (d) after 40 msec.
System
y
Design
g for Better Performance
System
y
Design
g for Better Performance (2)
( )
Rules:
1. CPU speed is more important than network speed.
2 Reduce
2.
R d
packet
k count to reduce
d
software
f
overhead.
h d
3. Minimize context switches.
4. Minimize copying.
5 You can buy more bandwidth but not lower delay
5.
delay.
6. Avoiding congestion is better than recovering from it.
7 Avoid
7.
A id ti
timeouts.
t
Response as a function of load.
load
System
y
Design
g for Better Performance (3)
( )
Four context switches to handle one packet
with a user-space network manager.
Protocols for Gigabit
g
Networks
Time to transfer and acknowledge a 11-megabit
megabit file over a 4000
4000-km
km line.
line
Download