MIT OpenCourseWare http://ocw.mit.edu 6.033 Computer System Engineering Spring 2009 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. L12: end to end layer Dina Katabi Some slides are from lectures by Nick Mckeown, Ion Stoica, Frans Kaashoek, Hari Balakrishnan, Sam Madden, and Robert Morris End-to-end layer client server presentation Layer stub stub End-to-end layer RPC D Data Header Data RPC Header Network Layer H D H D D H H D D H Link Layer H Network layer provides best effort service • Packets may be: • • • • • Lossed Delayed (jitter) Duplicated Reordered … • Problem: Inconvenient service for applications • Solution: Design protocols for E2E modules • Many protocols/modules possible, depending on requirements This lecture: some E2E properties • At most once • At least once • Exactly once? • Sliding window • Case study: TCP • Tomorrow: Network File System (NFS) At Least Once client server DDaata ta an RTT client server DDaata ta AACCKK X Timeout and Retransmission DDaata ta • Sender persistently sends until it receives an ack • Challenges: • Duplicate ACKs • What value for timer Duplicate ACK problem Client Server RReeq 1 q1 timeout AACCKK RReeq 1 q1 RReeq 2 q2 AACCKK RReeq 3 q3 • Problem: Request 2 is not delivered • violates at-least once delivery Solution: nonce Client N1 timeout N2 Server RReeq N q N11 11 N N K AACCK RReeq N q N11 RReeq N q N22 NN11 K AACCK RReeq N q N22 • Label request and ack with unique identifier that is never re-used Engineering a nonce Client • Use sequence numbers • Challenges: • Wrap around? • Failures? 1 timeout 2 Server RReeq 1 q1 11 K AACCK RReeq 1 q1 RReeq 2 q2 11 K AACCK RReeq 2 q2 Timer value • Fixed is bad. RTT changes depending on congestion • Pick a value that’s too big, wait too long to retransmit a packet • Pick a value too small, generates a duplicate (retransmitted packet). • Adapt the estimate of RTT Æ adaptive timeout Adaptive Timeout: Exponential weighted moving averages • Samples S1, S2, S3, .. • Algorithm • EstimatedRTT = T0 • EstimatedRTT = α S + (1- α) EstimatedRTT • where 0 ≤ α ≤ 1 • What values should one pick for α and T0? • Adaptive timeout is also hard At Most Once Challenges client 1 2 server RReeq 1 q1 OOkk 1 KK 1 C AAC rreeq 1 q1 11 K AACCK Process request 1 Process request 1 • Server shouldn’t process req 1 • Server should send result preferably Idea: remember sequence number client 1 2 server RReeq 1 q1 OOkk 1 KK 1 C AAC rreeq 1 q1 Process request 1 Ok 1 Ok OOkk 1 KK 1 C AAC Resend ACK 1 ACK ACK11 • Server remembers also last few responses Problem: failures client 1 2 server RReeq 1 q1 OOkk 1 KK 1 C AAC rreeq 1 q1 OOkk 1 KK 1 C AAC 0 1 Ok Ok ACK ACK11 0 1 Ok Ok ACK ACK11 • Performed request 1 twice! • How to maintain the last nonce per sender (tombstone)? • Write to non-volatile storage? • Move the problem? (e.g., different port number) • Make probability of mistake small? • How about exactly once? (Need transactions) How fast should the sender sends? Host A Host B DDaata ta11 AACCKK DDaata ta22 • Waiting for acks is too slow • Throughput is one packet/RTT • Say packet is 500 bytes • RTT 100ms • Æ Throughput = 40Kb/s, Awful! • Overlap pkt transmission Send a window of packets Host A Host B Send? • Maybe because the receiver is a slow machine ts 3 pk 5-7 2-4 O K, • Assume the receiver is the bottleneck Idle • Receiver needs to tell the sender when and how much it can send • The window advances once all previous packets are acked Æ too slow Sliding Window Host A Host B • Senders advances the window whenever it receives an ack Æ sliding window Send? ts 3-5 2-4 O pk 3 , K Idle • But what is the right value for the window? The Right Window Size • Assume server is bottleneck • • • • Goal: make idle time on server zero Assume: server rate is B bytes/s Window size = B x RTT Danger: sequence number wrap around • What if network is bottleneck? • Many senders? • Sharing? • Next lecture “Negative” ACK Host A D1 D2 D3 D4 Host B X k- 2 c a N D2 D1 D3 • Minimize reliance on timer • Add sequence numbers to packets • Send a Nack when the receiver finds a hole in the sequence numbers • Difficulties • Reordering • Cannot eliminate acks, because we need to ack the last packet E2E layer in Internet HTTP, RTP, Sun RPC, … Application End-to-End Layer TCP or UDP IP Ethernet, WiFI, ... Transport Network Link The 4-layer Internet model UDP Transmission Control Protocol (TCP) • Connection-oriented • Delivers bytes atmost-once • Bidirectional Host A x,? Syn x ac • ACKs are piggybacked x+1, y+1 Host B yn y s , 1 k x+ D a ta x +1, ac k y+1 y, x+1 TCP header Closing a TCP connection Host A x,y Host B fin x y, x +1 x k ac fin y timed wait timeout closed ack y+ 1 closed (Start) Connect/SYN (step 1 of the 3 way handshake) Closed Listen SYN/SYN+ACK (step 2 of the 3 way handshake) Close Close Listen SYN Received SEND/SYN RST ACK Data exchange occur Established Close/FIN Close/FIN SYN+ACK/ACK (step 3 of the 3 way handshake) FIN/ACK Active close FIN WAIT 1 SYN Sent SYN/SYN+ACK (simultaneous open) FIN/ACK Passive close Closing Close WAIT FIN+ACK/ACK ACK Close/FIN ACK FIN WAIT 2 FIN/ACK (Go back to start) Unusual event TIMED WAIT Last ACK Timeout Closed Client/reciever path ACK Sender/server path Figure by MIT OpenCourseWare.